Contextual-based Image Inpainting: Infer, Match, and Translate

ECCV 2018

Yuhang Song¹ Chao Yang¹ Zhe Lin² Xiaofeng Liu³ Qin Huang¹ Hao Li¹,⁴,⁵ C.-C. Jay Kuo¹

University of Southern California¹ Adobe Research² Carnegie Mellon University³ Pinscreen⁴ USC Institute for Creative Technologies⁵

Abstract/Introduction

We study the task of image inpainting, which is to fill in the missing region of an incomplete image with plausible contents. To this end, we propose a learning-based approach to generate visually coherent completion given a high-resolution image with missing components. To overcome the difficulty of directly learning the distribution of high-dimensional image data, we divide the task into inference and translation as two separate steps and model each step with a deep neural network. We also use simple heuristics to guide the propagation of local textures from the boundary to the hole. We show that, by using such techniques, inpainting reduces to the problem of learning two image-feature translation functions in a much smaller space, which are hence easier to train. We evaluate our method on several public datasets and show that it generates results of better visual quality than previous state-of-the-art methods.

System Overview

Our system divides the image inpainting task into three steps:

Inference: We use an Image2Feature network to fill an incomplete image with coarse contents as inference and extract a feature map from the inpainted image.

Matching: We use patch-swap on the feature map to match the neural patches from the high-resolution boundary to the hole with coarse inference.

Translation: We use a Feature2Image network to translate the feature map to a complete image.
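The matching step can be illustrated with a minimal NumPy sketch. It assumes details the overview above does not state: 3x3 patches, cosine similarity as the matching score, and averaging where swapped patches overlap. The names `extract_patches` and `patch_swap` are hypothetical, and the feature map stands in for whatever the Image2Feature network would produce. It is an illustration of the idea, not the authors' implementation.

```python
import numpy as np


def extract_patches(feat, size):
    """Return every size x size patch of a (C, H, W) map, flattened, with its top-left corner."""
    _, h, w = feat.shape
    patches, corners = [], []
    for i in range(h - size + 1):
        for j in range(w - size + 1):
            patches.append(feat[:, i:i + size, j:j + size].ravel())
            corners.append((i, j))
    return np.stack(patches), corners


def patch_swap(feat, hole_mask, size=3, eps=1e-8):
    """Replace hole features with the most similar patches from the known region.

    feat:      (C, H, W) float array, feature map of the coarsely inpainted image.
    hole_mask: (H, W) bool array, True inside the hole.
    Assumption: similarity is cosine similarity; overlapping swaps are averaged.
    """
    c = feat.shape[0]
    patches, corners = extract_patches(feat, size)

    # A patch is a query if it touches the hole, otherwise it is a candidate source.
    touches_hole = np.array(
        [hole_mask[i:i + size, j:j + size].any() for i, j in corners]
    )
    source = patches[~touches_hole]
    if source.shape[0] == 0:
        raise ValueError("no patch lies entirely in the known region")
    source_unit = source / (np.linalg.norm(source, axis=1, keepdims=True) + eps)

    acc = np.zeros_like(feat)
    count = np.zeros(feat.shape[1:], dtype=feat.dtype)
    for idx in np.flatnonzero(touches_hole):
        i, j = corners[idx]
        query = patches[idx]
        query_unit = query / (np.linalg.norm(query) + eps)
        best = np.argmax(source_unit @ query_unit)  # nearest source patch
        acc[:, i:i + size, j:j + size] += source[best].reshape(c, size, size)
        count[i:i + size, j:j + size] += 1

    # Only hole locations are overwritten; known features are kept as they are.
    out = feat.copy()
    covered = hole_mask & (count > 0)
    out[:, covered] = acc[:, covered] / count[covered]
    return out
```

In this sketch the coarse inference inside the hole only serves as the query for the nearest-neighbor search, so the swapped feature map carries texture detail from the boundary. A Feature2Image-style network would then translate that map into the final image.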

Conclusion

We propose a learning-based approach to synthesize missing contents in a high-resolution image. Our model is able to inpaint an image with realistic and sharp contents in a feed-forward manner. We show that we can simplify training by breaking down the task into multiple stages, where the mapping function in each stage has smaller dimensionality. It is worth noting that our approach is a meta-algorithm, so we could naturally explore a variety of network architectures and training techniques to improve the inference and the final result. We also expect that a similar idea of multi-stage, multi-scale training could be used to directly synthesize high-resolution images from sampling.

Downloads

Contextual-based Image Inpainting: Infer, Match, and Translate (7.7MB)
