Soft Rasterizer: Differentiable Rendering for Unsupervised Single-View Mesh Reconstruction
arXiv 2019
Shichen Liu1,2    Weikai Chen1    Tianye Li1,2    Hao Li1,2,3   
USC Institute for Creative Technologies1    University of Southern California2    Pinscreen3   

Rendering is the process of generating 2D images from 3D assets, simulated in a virtual environment, typically with a graphics pipeline. By inverting such a renderer, one can devise a learning approach that predicts a 3D shape from an input image. However, standard rendering pipelines involve a fundamental discretization step called rasterization, which prevents the rendering process from being differentiable and hence from being learned. We present the first non-parametric and truly differentiable rasterizer based on silhouettes. Our method enables unsupervised learning for high-quality 3D mesh reconstruction from a single image. We call our framework "soft rasterizer" because it provides an accurate soft approximation of the standard rasterizer. The key idea is to fuse the probabilistic contributions of all mesh triangles with respect to the rendered pixels. When combined with a mesh generator in a deep neural network, our soft rasterizer produces an approximate silhouette of the generated polygon mesh in the forward pass. The rendering loss is back-propagated to supervise the mesh generation without the need for 3D training data. Experimental results demonstrate that our approach significantly outperforms state-of-the-art unsupervised techniques, both quantitatively and qualitatively. We also show that our soft rasterizer achieves results comparable to the cutting-edge supervised learning method [49], and in various cases even better ones, especially on real-world data.

3D mesh reconstruction from a single image. From left to right: the input image, the ground truth, and the results of our method (SoftRas), Neural Mesh Renderer, and Pixel2Mesh, each visualized from two different views. Alongside the results, we also visualize scan-to-mesh distances measured from the ground truth to the reconstructed mesh.
Soft Rasterizer

The main obstacle that prevents the standard graphics renderer from being differentiable is the discrete sampling operation, known as rasterization, which converts continuous vector graphics into a raster image. In particular, after projecting the mesh triangles onto the screen space, the standard rasterization technique fills each pixel with the color of the nearest triangle that covers it. However, the color intensity of an image is the result of a complex interplay between a variety of factors, including the lighting condition, viewing direction, reflectance properties, and the intrinsic texture of the rendered object, most of which are entirely independent of the 3D shape of the target object. Though one can infer fine surface details from shading cues, special care has to be taken to decompose shading from the reflectance layers. Therefore, leveraging color information for 3D geometry reconstruction may unnecessarily complicate the problem, especially when the target object consists only of smooth surfaces.
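In contrast to the hard nearest-triangle assignment of the standard rasterizer, the soft formulation turns each projected triangle into a per-pixel probability map (a sigmoid of the signed squared distance to the triangle boundary) and fuses the maps with a product rule. The following is a toy NumPy sketch of this idea for a silhouette image; the function names, image resolution, and the sharpness parameter `sigma` are illustrative choices, not the paper's reference implementation:

```python
import numpy as np

def edge_distance(p, a, b):
    """Euclidean distance from 2D point p to segment ab."""
    ab, ap = b - a, p - a
    t = np.clip(np.dot(ap, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(ap - t * ab)

def inside(p, a, b, c):
    """Point-in-triangle test via signs of edge cross products."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    s1, s2, s3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (s1 >= 0 and s2 >= 0 and s3 >= 0) or (s1 <= 0 and s2 <= 0 and s3 <= 0)

def soft_silhouette(triangles, res=32, sigma=1e-2):
    """Soft silhouette over the unit square.

    Each triangle j contributes a probability map
        D_j = sigmoid(delta_j * d^2 / sigma),
    where d is the distance from the pixel to the triangle boundary and
    delta_j is +1 inside the triangle, -1 outside. The per-triangle maps
    are fused into one silhouette via S = 1 - prod_j (1 - D_j), so the
    whole image is a smooth function of the vertex positions.
    """
    ys, xs = np.meshgrid(np.linspace(0, 1, res), np.linspace(0, 1, res),
                         indexing="ij")
    one_minus = np.ones((res, res))  # running product of (1 - D_j)
    for a, b, c in triangles:
        for i in range(res):
            for j in range(res):
                p = np.array([xs[i, j], ys[i, j]])
                d2 = min(edge_distance(p, a, b),
                         edge_distance(p, b, c),
                         edge_distance(p, c, a)) ** 2
                delta = 1.0 if inside(p, a, b, c) else -1.0
                D = 1.0 / (1.0 + np.exp(-delta * d2 / sigma))
                one_minus[i, j] *= (1.0 - D)
    return 1.0 - one_minus
```

As `sigma` shrinks, the sigmoid sharpens and the soft silhouette approaches the binary output of a standard rasterizer, while for any finite `sigma` every pixel receives gradient from every triangle, which is what makes end-to-end training possible.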

Discussion and Future Work

In this paper, we have presented the first non-parametric differentiable rasterizer (SoftRas) that enables unsupervised learning for high-quality mesh reconstruction from a single image. We demonstrate that it is possible to properly approximate the forward pass of discrete rasterization with a differentiable framework. While many previous works, such as N3MR, provide an approximated gradient in the backward pass while using a standard rasterizer in the forward pass, we believe that consistency between the forward and backward passes is the key to achieving superior performance. In addition, we found that a proper choice of regularizers plays an important role in producing visually appealing geometry. Experiments have shown that our unsupervised approach achieves results comparable to, and in certain cases even better than, those of state-of-the-art supervised solutions.
