The goal of these course notes is to describe the main mathematical ideas behind geometric deep learning and to provide implementation details for several applications in shape analysis and synthesis, computer vision, and computer graphics. We also suggest further readings and extensions of these ideas.
We show how to adapt popular learning schemes from the classical 2D setting in order to deal with deformable shapes. These course notes will assume no particular background, beyond
some basic working knowledge that is a common denominator for people in the field of computer
graphics. All the necessary notions and mathematical foundations will be described. The course
is targeted to graduate students, practitioners, and researchers interested in shape analysis,
synthesis, matching, retrieval, and big data.
Deep learning methods have transformed many realms of academia and industry in the past few years. Technology giants such as Apple, Google, and Facebook have been aggressively hunting for experts in the field and acquiring promising deep learning startups for unprecedented sums, all of which indicates the enormous bets the industry is currently placing on this technology. Nowadays, deep learning methods are already widely used in commercial applications, including Siri speech recognition on the Apple iPhone, Google text translation, and Mobileye vision-based technology for self-driving cars.
Dealing with signals such as speech, images, or video on 1D-, 2D- and 3D Euclidean domains,
respectively, has been the main focus of research in deep learning for the past decades.
However, in recent years, more and more fields have had to deal with data residing on
non-Euclidean geometric domains, which we call here “geometric data” for brevity.
An alternative definition of an intrinsic equivalent of convolution is in the spatial domain. A classical convolution can be thought of as template matching with a filter, operating as a sliding window: e.g., in an image, one extracts a patch of pixels, correlates it with a template, and moves the window to the next position. In the non-Euclidean setting, the lack of shift-invariance makes the patch extraction operation position-dependent. The patch operator $D_j(x)$ acting on the point $x \in X$ can be defined as a re-weighting of the input signal $f$ by means of some weighting kernels $\{w_j(x,\cdot)\}_{j=1,\dots,J}$ spatially localized around $x$, i.e.
$$ D_j(x) f = \int_X f(x')\, w_j(x, x')\, dx', \qquad j = 1, \dots, J. $$
The intrinsic convolution can be defined as
$$ (f \star g)(x) = \sum_j g_j\, D_j(x) f, $$
where $g_j$ denotes the filter coefficients applied on the patch extracted at each point. Different spatial-domain intrinsic convolutional layers amount to different definitions of the patch operator $D_j$. In the following we will see two examples.
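On a discretized domain, the construction above reduces to a weighted sum over the sample points. The following is a minimal sketch, assuming the domain has been sampled at $N$ points, the signal $f$ is a vector of length $N$, and the weighting kernels $w_j(x, x')$ have been precomputed and stored as a dense array of shape $J \times N \times N$ (the function names and this dense representation are illustrative choices, not part of the original text; in practice the kernels are sparse and localized):

```python
import numpy as np

def patch_operator(f, w):
    """Apply the patch operator: D_j(x) f = sum_{x'} f(x') w_j(x, x').

    f : (N,)       signal sampled on the N points of the domain
    w : (J, N, N)  weighting kernels; w[j, i, :] is localized around point i
    Returns an (N, J) array whose row x holds D_j(x) f for j = 1..J.
    """
    return np.einsum('jxy,y->xj', w, f)

def intrinsic_conv(f, w, g):
    """Intrinsic convolution: (f * g)(x) = sum_j g_j D_j(x) f.

    g : (J,) filter coefficients applied to the extracted patches.
    Returns an (N,) array, the convolved signal.
    """
    return patch_operator(f, w) @ g
```

As a sanity check on the definition: with a single kernel ($J = 1$) equal to the identity weighting $w_1(x, x') = \delta(x - x')$ and filter coefficient $g_1 = 1$, the patch at each point is just the signal value there, and the convolution returns $f$ unchanged.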