The goal of these course notes is to describe the main mathematical ideas behind geometric deep learning and to provide implementation details for several applications in shape analysis and synthesis, computer vision, and computer graphics. We also suggest further readings and extensions of these ideas.
We show how to adapt popular learning schemes from the classical 2D setting in order to deal with deformable shapes. These course notes will assume no particular background, beyond
some basic working knowledge that is a common denominator for people in the field of computer
graphics. All the necessary notions and mathematical foundations will be described. The course
is targeted to graduate students, practitioners, and researchers interested in shape analysis,
synthesis, matching, retrieval, and big data.
Deep learning methods have transformed many realms of academia and industry in the past few years. Technology giants such as Apple, Google, and Facebook have been aggressively hunting for experts in the field and acquiring promising deep learning startups for unprecedented sums, all of which indicates the enormous bets the industry is currently placing on this technology. Nowadays, deep learning methods are already widely used in commercial applications, including Siri speech recognition on the Apple iPhone, Google text translation, and Mobileye vision-based technology for self-driving cars.
Dealing with signals such as speech, images, or video on 1D-, 2D- and 3D Euclidean domains,
respectively, has been the main focus of research in deep learning for the past decades.
However, in recent years, more and more fields have had to deal with data residing on
non-Euclidean geometric domains, which we call here “geometric data” for brevity.
An alternative definition of an intrinsic equivalent of convolution is in the spatial domain. A classical convolution can be thought of as template matching with a filter, operating as a sliding window: e.g., in an image, one extracts a patch of pixels, correlates it with a template, and moves the window to the next position. In the non-Euclidean setting, the lack of shift-invariance makes the patch extraction operation position-dependent. The patch operator $D_j(x)$ acting on the point $x \in X$ can be defined as a re-weighting of the input signal $f$ by means of some weighting kernels $\{w_j(x,\cdot)\}_{j=1,\dots,J}$ spatially localized around $x$, i.e.
$$ D_j(x) f = \int_X f(x')\, w_j(x, x')\, dx', \qquad j = 1, \dots, J. $$
The intrinsic convolution can be defined as
$$ (f \star g)(x) = \sum_j g_j\, D_j(x) f, $$
where $g_j$ denotes the filter coefficients applied on the patch extracted at each point. Different spatial-domain intrinsic convolutional layers amount to different definitions of the patch operator $D_j$. In the following we will see two examples.
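On a discretized domain, the construction above reduces to a weighted sum over the sample points. The following is a minimal sketch, assuming the domain has been sampled at $N$ points, the signal $f$ is a vector of length $N$, and the weighting kernels $w_j(x, x')$ have been precomputed and stored as a dense array of shape $J \times N \times N$ (the function names and this dense representation are illustrative choices, not part of the original text; in practice the kernels are sparse and localized):

```python
import numpy as np

def patch_operator(f, w):
    """Apply the patch operator: D_j(x) f = sum_{x'} f(x') w_j(x, x').

    f : (N,)       signal sampled on the N points of the domain
    w : (J, N, N)  weighting kernels; w[j, i, :] is localized around point i
    Returns an (N, J) array whose row x holds D_j(x) f for j = 1..J.
    """
    return np.einsum('jxy,y->xj', w, f)

def intrinsic_conv(f, w, g):
    """Intrinsic convolution: (f * g)(x) = sum_j g_j D_j(x) f.

    g : (J,) filter coefficients applied to the extracted patches.
    Returns an (N,) array, the convolved signal.
    """
    return patch_operator(f, w) @ g
```

As a sanity check on the definition: with a single kernel ($J = 1$) equal to the identity weighting $w_1(x, x') = \delta(x - x')$ and filter coefficient $g_1 = 1$, the patch at each point is just the signal value there, and the convolution returns $f$ unchanged.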