Driving High-Resolution Facial Scans with Video Performance Capture

by Graham Fyffe, Andrew Jones, Oleg Alexander, Ryosuke Ichikari and Paul Debevec

Abstract:

We present a process for rendering a realistic facial performance with control of viewpoint and illumination. The performance is based on one or more high-quality geometry and reflectance scans of an actor in static poses, driven by one or more video streams of a performance. We compute optical flow correspondences between neighboring video frames, and a sparse set of correspondences between static scans and video frames. The latter are made possible by leveraging the relightability of the static 3D scans to match the viewpoint(s) and appearance of the actor in videos taken in arbitrary environments. As optical flow tends to compute proper correspondence for some areas but not others, we also compute a smoothed, per-pixel confidence map for every computed flow, based on normalized cross-correlation. These flows and their confidences yield a set of weighted triangulation constraints among the static poses and the frames of a performance. Given a single artist-prepared face mesh for one static pose, we optimally combine the weighted triangulation constraints, along with a shape regularization term, into a consistent 3D geometry solution over the entire performance that is drift free by construction. In contrast to previous work, even partial correspondences contribute to drift minimization, for example, where a successful match is found in the eye region but not the mouth. Our shape regularization employs a differential shape term based on a spatially varying blend of the differential shapes of the static poses and neighboring dynamic poses, weighted by the associated flow confidences. These weights also permit dynamic reflectance maps to be produced for the performance by blending the static scan maps. Finally, as the geometry and maps are represented on a consistent artist-friendly mesh, we render the resulting high-quality animated face geometry and animated reflectance maps using standard rendering tools.
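
As a rough illustration of two ideas in the abstract, a confidence score based on normalized cross-correlation and a confidence-weighted blend of static scan maps, here is a minimal Python sketch. It is not the authors' implementation: the function names, the clamping of negative correlation to zero confidence, and the scalar blend are all assumptions made for illustration, and the spatial smoothing of the confidence map is omitted.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches.

    Patches are flat lists of pixel intensities. Returns a value in [-1, 1],
    or 0.0 when either patch has no variation (correlation is undefined).
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and the same size")
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


def flow_confidence(patch_a, patch_b):
    """Map NCC to a confidence in [0, 1].

    Assumption: anti-correlated patches are treated as zero confidence.
    """
    return max(0.0, ncc(patch_a, patch_b))


def blend_by_confidence(values, confidences):
    """Confidence-weighted average of per-scan values at one texel.

    Assumption: falls back to a plain average when all confidences are zero.
    """
    total = sum(confidences)
    if total == 0.0:
        return sum(values) / len(values)
    return sum(v * c for v, c in zip(values, confidences)) / total
```

In this sketch, a patch pair that matches up to brightness and contrast scores near 1, so a value taken from that static scan dominates the blend, while a failed match contributes little or nothing.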

Reference:

Driving High-Resolution Facial Scans with Video Performance Capture (Graham Fyffe, Andrew Jones, Oleg Alexander, Ryosuke Ichikari and Paul Debevec), In ACM Transactions on Graphics (TOG), volume 34, number 1, pages 1-13, 2014.

Bibtex Entry:

@article{fyffe_driving_2014,
	title = {Driving {High}-{Resolution} {Facial} {Scans} with {Video} {Performance} {Capture}},
	volume = {34},
	url = {http://ict.usc.edu/pubs/Driving%20High-Resolution%20Facial%20Scans%20with%20Video%20Performance%20Capture.pdf},
	abstract = {We present a process for rendering a realistic facial performance with control of viewpoint and illumination. The performance is based on one or more high-quality geometry and reflectance scans of an actor in static poses, driven by one or more video streams of a performance. We compute optical flow correspondences between neighboring video frames, and a sparse set of correspondences between static scans and video frames. The latter are made possible by leveraging the relightability of the static 3D scans to match the viewpoint(s) and appearance of the actor in videos taken in arbitrary environments. As optical flow tends to compute proper correspondence for some areas but not others, we also compute a smoothed, per-pixel confidence map for every computed flow, based on normalized cross-correlation. These flows and their confidences yield a set of weighted triangulation constraints among the static poses and the frames of a performance. Given a single artist-prepared face mesh for one static pose, we optimally combine the weighted triangulation constraints, along with a shape regularization term, into a consistent 3D geometry solution over the entire performance that is drift free by construction. In contrast to previous work, even partial correspondences contribute to drift minimization, for example, where a successful match is found in the eye region but not the mouth. Our shape regularization employs a differential shape term based on a spatially varying blend of the differential shapes of the static poses and neighboring dynamic poses, weighted by the associated flow confidences. These weights also permit dynamic reflectance maps to be produced for the performance by blending the static scan maps. Finally, as the geometry and maps are represented on a consistent artist-friendly mesh, we render the resulting high-quality animated face geometry and animated reflectance maps using standard rendering tools.},
	number = {1},
	journal = {ACM Transactions on Graphics (TOG)},
	author = {Fyffe, Graham and Jones, Andrew and Alexander, Oleg and Ichikari, Ryosuke and Debevec, Paul},
	month = nov,
	year = {2014},
	keywords = {Graphics, UARC},
	pages = {1--13}
}