Photorealistic Facial Texture Inference Using Deep Neural Networks
by Shunsuke Saito, Lingyu Wei, Liwen Hu, Koki Nagano, Hao Li
Abstract:
We present a data-driven inference method that can synthesize a photorealistic texture map of a complete 3D face model given a partial 2D view of a person in the wild. After an initial estimation of shape and low-frequency albedo, we compute a high-frequency partial texture map, without the shading component, of the visible face area. To extract the fine appearance details from this incomplete input, we introduce a multi-scale detail analysis technique based on mid-layer feature correlations extracted from a deep convolutional neural network. We demonstrate that fitting a convex combination of feature correlations from a high-resolution face database can yield a semantically plausible facial detail description of the entire face. A complete and photorealistic texture map can then be synthesized by iteratively optimizing for the reconstructed feature correlations. Using these high-resolution textures and a commercial rendering framework, we can produce high-fidelity 3D renderings that are visually comparable to those obtained with state-of-the-art multi-view face capture systems. We demonstrate successful face reconstructions from a wide range of low-resolution input images, including those of historical figures. In addition to extensive evaluations, we validate the realism of our results using a crowdsourced user study.
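The convex-combination fit mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: random arrays stand in for CNN activations, feature correlations are assumed to be Gram matrices of a (channels, positions) activation matrix, and projected gradient descent is swapped in as a simple solver. The function names `gram_matrix`, `project_to_simplex`, and `fit_convex_weights` are hypothetical.

```python
import numpy as np

def gram_matrix(features):
    """Feature correlations of a (channels, positions) activation matrix."""
    channels, positions = features.shape
    return features @ features.T / positions

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_convex_weights(target, database, steps=500):
    """Minimize ||sum_k w_k * database[k] - target||^2 over the simplex.

    Assumption: projected gradient descent, chosen here for brevity.
    """
    A = database.reshape(database.shape[0], -1)   # (K, M) flattened correlations
    b = target.reshape(-1)
    w = np.full(A.shape[0], 1.0 / A.shape[0])     # start from uniform weights
    lr = 1.0 / np.linalg.norm(A @ A.T, 2)         # step size 1/L for this quadratic
    for _ in range(steps):
        grad = A @ (A.T @ w - b)
        w = project_to_simplex(w - lr * grad)
    return w

# Synthetic stand-ins for database activations (4 faces, 8 channels, 64 positions).
rng = np.random.default_rng(0)
database = np.stack([gram_matrix(rng.normal(size=(8, 64))) for _ in range(4)])
target = 0.7 * database[0] + 0.3 * database[2]    # a known convex mixture
weights = fit_convex_weights(target, database)
blended = np.tensordot(weights, database, axes=1) # reconstructed correlations
```

Because the weights are non-negative and sum to one, the blended correlations stay within the span of the database examples, which is one way to read the abstract's claim that the result is semantically plausible.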
Reference:
Photorealistic Facial Texture Inference Using Deep Neural Networks (Shunsuke Saito, Lingyu Wei, Liwen Hu, Koki Nagano, Hao Li), In arXiv preprint arXiv:1612.00523, 2016.
Bibtex Entry:
@article{saito_photorealistic_2016,
	title = {Photorealistic {Facial} {Texture} {Inference} {Using} {Deep} {Neural} {Networks}},
	url = {https://arxiv.org/abs/1612.00523},
	abstract = {We present a data-driven inference method that can synthesize a photorealistic texture map of a complete 3D face model given a partial 2D view of a person in the wild. After an initial estimation of shape and low-frequency albedo, we compute a high-frequency partial texture map, without the shading component, of the visible face area. To extract the fine appearance details from this incomplete input, we introduce a multi-scale detail analysis technique based on mid-layer feature correlations extracted from a deep convolutional neural network. We demonstrate that fitting a convex combination of feature correlations from a high-resolution face database can yield a semantically plausible facial detail description of the entire face. A complete and photorealistic texture map can then be synthesized by iteratively optimizing for the reconstructed feature correlations. Using these high-resolution textures and a commercial rendering framework, we can produce high-fidelity 3D renderings that are visually comparable to those obtained with state-of-the-art multi-view face capture systems. We demonstrate successful face reconstructions from a wide range of low-resolution input images, including those of historical figures. In addition to extensive evaluations, we validate the realism of our results using a crowdsourced user study.},
	journal = {arXiv preprint arXiv:1612.00523},
	author = {Saito, Shunsuke and Wei, Lingyu and Hu, Liwen and Nagano, Koki and Li, Hao},
	month = dec,
	year = {2016},
	keywords = {Graphics, UARC}
}