Large scale clustering and approximate nearest neighbor search

Yannis Avrithis
(Inria Rennes, France)

Abstract:

In this talk we discuss the role of approximate nearest neighbor search in large scale clustering, and applications in vision. We begin with a number of binary codes and product quantization extensions, highlighting the relation to nonlinear dimensionality reduction. We touch upon the deep connection between the two problems, involving distance maps in arbitrary dimensions.
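
For readers unfamiliar with product quantization, the basic idea can be sketched as follows. This is a minimal illustration of the plain, textbook form of the technique, not the extensions covered in the talk and not the IQ-means code. All function names (kmeans, train_pq, encode, decode, adc) and the toy parameters (8 dimensions, 4 subspaces, 16 centroids per subspace) are assumptions chosen for the example. It also assumes the vector dimension is divisible by the number of subspaces.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd k-means; returns k centroids (lists of floats)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def split(v, m):
    """Cut vector v into m contiguous sub-vectors of equal length."""
    d = len(v) // m
    return [v[i * d:(i + 1) * d] for i in range(m)]

def train_pq(data, m, k):
    """Learn one codebook of k centroids per subspace."""
    return [kmeans([split(v, m)[i] for v in data], k, seed=i) for i in range(m)]

def encode(v, codebooks):
    """Map v to m centroid indices, one per subspace."""
    subs = split(v, len(codebooks))
    return [min(range(len(cb)), key=lambda c: sq_dist(s, cb[c]))
            for s, cb in zip(subs, codebooks)]

def decode(code, codebooks):
    """Reconstruct the approximation of a vector from its code."""
    return [x for c, cb in zip(code, codebooks) for x in cb[c]]

def adc(query, code, codebooks):
    """Asymmetric distance: exact query against a quantized database vector."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(s, cb[c]) for s, cb, c in zip(subs, codebooks, code))

# Toy usage: 200 random 8-d vectors, 4 subspaces, 16 centroids each.
rng = random.Random(1)
data = [[rng.random() for _ in range(8)] for _ in range(200)]
codebooks = train_pq(data, m=4, k=16)
codes = [encode(v, codebooks) for v in data]
query = data[0]
nearest = min(range(len(codes)), key=lambda i: adc(query, codes[i], codebooks))
```

Each database vector is stored as 4 small integers instead of 8 floats, and the distance to a query is computed as a sum of per-subspace distances to the selected centroids. Practical systems precompute these per-subspace distances into lookup tables once per query; that optimization is omitted here for brevity.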

We then revisit recent advances in approximate k-means variants, and present a new one (IQ-means) that borrows their best ingredients. Combined with powerful deep learned representations, this method achieves clustering of a 100 million image collection on a single machine in less than one hour, while dynamically determining the number of clusters.

Code for IQ-means is available online at https://github.com/iavr/iqm