Real-Time Hair Rendering using Sequential Adversarial Networks
European Conference on Computer Vision (ECCV), 2018
Lingyu Wei1,2    Liwen Hu1,2    Vladimir Kim3    Ersin Yumer4    Hao Li1,2   
Pinscreen1    University of Southern California2    Adobe Research3    Argo AI4   
Abstract

We present an adversarial network for rendering photorealistic hair as an alternative to conventional computer graphics pipelines. Our deep learning approach does not require low-level parameter tuning nor ad-hoc asset design. Our method simply takes a strand-based 3D hair model as input and provides intuitive user-control for color and lighting through reference images. To handle the diversity of hairstyles and its appearance complexity, we disentangle hair structure, color, and illumination properties using a sequential GAN architecture and a semisupervised training approach. We also introduce an intermediate edge activation map to orientation field conversion step to ensure a successful CG-to-photoreal transition, while preserving the hair structures of the original input data. As we only require a feed-forward pass through the network, our rendering performs in real-time. We demonstrate the synthesis of photorealistic hair images on a wide range of intricate hairstyles and compare our technique with state-of-the-art hair rendering methods.


We propose a real-time hair rendering method. Given a reference image, we can render a 3D hair model with the reference image's color and lighting in real-time. Faces in this paper are obfuscated to avoid copyright infringement.
Data Preparation

Since we focus only on hair synthesis, we mask out non-hair regions to ignore their effect and set their pixel values to black. To avoid manual mask annotation, we train a Pyramid Scene Parsing Network to perform automatic hair segmentation. We annotate hair masks for 3,000 random images from the CelebA-HQ dataset (Karras et al., "Progressive Growing of GANs for Improved Quality, Stability, and Variation") and train our segmentation network on this data. We then use the network to compute masks for all 30,000 images in CelebA-HQ and manually remove images with incorrect segmentations, yielding about 27,000 segmented hair images. We randomly sample 5,000 images from this set as our training data. We apply the same deterministic filters F_i to each training image to obtain the corresponding gray image, orientation map, and edge activation map.
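The sketch below illustrates one plausible way to derive the three per-image maps (gray image, orientation map, edge activation map) from a masked hair photo. The oriented Gabor filter bank, the Canny edge detector, and all parameter values here are illustrative assumptions, not the exact deterministic filters F_i used in our pipeline.

```python
import cv2
import numpy as np

def hair_feature_maps(image_path, mask_path, num_orients=32, ksize=17):
    """Sketch: compute a gray image, an orientation map, and an edge
    activation map for a masked hair photo. Filter choices and parameters
    are assumptions for illustration only."""
    img = cv2.imread(image_path)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) > 0

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray[~mask] = 0  # black out non-hair pixels

    # Orientation map: filter with a bank of oriented Gabor kernels and
    # keep the angle of the strongest response at each pixel.
    responses = []
    for i in range(num_orients):
        theta = np.pi * i / num_orients
        kern = cv2.getGaborKernel((ksize, ksize), sigma=2.0, theta=theta,
                                  lambd=4.0, gamma=0.5, psi=0,
                                  ktype=cv2.CV_32F)
        responses.append(cv2.filter2D(gray.astype(np.float32),
                                      cv2.CV_32F, kern))
    responses = np.stack(responses, axis=-1)
    orient = np.argmax(responses, axis=-1).astype(np.float32) / num_orients
    orient[~mask] = 0

    # Edge activation map: a simple edge detector restricted to the hair region.
    edges = cv2.Canny(gray, 50, 150)
    edges[~mask] = 0

    return gray, orient, edges
```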

Discussion

We presented the first deep learning approach for rendering photorealistic hair, which performs in real-time. We have shown that our sequential GAN architecture and semi-supervised training approach can effectively disentangle strand-level structures, appearance, and illumination properties from the highly complex and diverse range of hairstyles. In particular, our evaluations show that without our sequential architecture, the lighting parameter would dominate over color, and color specification would no longer be effective. Moreover, our trained latent space is smooth, which allows us to interpolate continuously between arbitrary color and lighting samples. Our evaluations also suggest that there are no significant differences between a vanilla conditional GAN and a state-of-the-art network such as BicycleGAN, which uses additional smoothness constraints during training. Our experiments further indicate that a direct conversion from a CG rendering to a photoreal image using existing adversarial networks leads to significant artifacts or unwanted hairstyles. Our intermediate conversion step from edge activation to orientation map has proven to be an effective way to enable semi-supervised training and to transition from synthetic input to photoreal output while ensuring that the intended hairstyle structure is preserved.
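The smoothness of the latent space means that continuous transitions in color and lighting can be produced simply by walking between two latent codes and re-rendering. The sketch below shows this idea; the generator signature and the split into separate color and lighting codes are placeholders for illustration, not the released API of our networks.

```python
import torch

@torch.no_grad()
def interpolate_appearance(generator, struct_feat, z_color_a, z_color_b,
                           z_light_a, z_light_b, steps=8):
    """Sketch: linearly interpolate between two (color, lighting) latent
    codes and re-render the same hair structure at each step. `generator`
    and the latent-code layout are hypothetical placeholders."""
    frames = []
    for t in torch.linspace(0.0, 1.0, steps):
        z_color = (1 - t) * z_color_a + t * z_color_b
        z_light = (1 - t) * z_light_a + t * z_light_b
        frames.append(generator(struct_feat, z_color, z_light))
    return frames
```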

Limitations and Future Work

As shown in the video demo, the hair rendering is not entirely temporally coherent when rotating the view. While the per-frame predictions are reasonable and most strand structures are consistent between frames, there are still visible flickering artifacts. We believe that temporal consistency could be learned by augmenting the training data with 3D rotations, or by training on video data.

We believe that our sequential GAN architecture for parameter separation and our intermediate representation for CG-to-photoreal conversion could be generalized for the rendering of other objects and scenes beyond hair. Our method presents an interesting alternative and complementary solution for many applications, such as hair modeling with interactive visual feedback, photo manipulation, and image-based 3D avatar rendering.

While we do not provide the same level of fine-grained control as conventional graphics pipelines, our efficient approach is significantly simpler and generates more realistic output without any tedious fine-tuning. Nevertheless, we would like to explore the ability to specify precise lighting configurations and advanced shading parameters for a seamless integration of our hair rendering into virtual environments and game engines. We believe that additional training with controlled simulations and captured hair data would be necessary.

Like other GAN techniques, our results are not yet fully indistinguishable from real photographs to a trained eye under extended observation, but we are confident that our approach will benefit from future advancements in GANs.

Downloads

CosimoWei_ECCV2018_Git.pdf (19MB)