The power of learning representations.

Andrea Vedaldi (University of Oxford, UK)

In this talk I will discuss two approaches to learning representations in computer vision. First, I will show how large-scale metric learning can be used to simultaneously compare and compress visual descriptors, with applications to recognising faces and matching interest points. Second, I will discuss deep learning, where multiple layers of representation are learned directly from data, starting from the raw image pixels. In particular, I will present a recent evaluation of deep learning methods, demonstrating that they yield excellent low-dimensional, general-purpose image encodings.

Beyond improving performance on standard machine vision tasks, advances in machine learning are expected to open a path towards significantly more detailed and comprehensive image understanding. In the final part of the talk, I will introduce two datasets for studying detailed image understanding tasks, the Objects in Detail dataset and the Describable Texture dataset, and use them to illustrate important machine vision challenges.