Large scale clustering and approximate nearest neighbor search
(Inria Rennes, France)
In this talk we discuss the role of approximate nearest neighbor search in large
scale clustering, and its applications in vision. We begin with a number of binary
codes and product quantization extensions, highlighting the relation to
nonlinear dimensionality reduction. We touch upon the deep connection between
the two problems, involving distance maps in arbitrary dimensions.
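
As a point of reference for the product quantization idea mentioned above, here
is a minimal sketch of textbook product quantization: each vector is split into
m sub-vectors, each sub-vector is replaced by the index of its nearest centroid
in a per-subspace codebook, and query-to-code distances are summed subspace by
subspace. It is not taken from the talk and does not cover the specific
extensions it discusses. All function names (kmeans, pq_train, pq_encode,
pq_decode, pq_asym_dist) and parameters are illustrative assumptions, and the
pure-Python k-means is only meant for toy inputs.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd k-means on a list of vectors (toy sizes only)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def split(v, m):
    """Split v into m contiguous sub-vectors (assumes len(v) % m == 0)."""
    d = len(v) // m
    return [v[i * d:(i + 1) * d] for i in range(m)]

def pq_train(data, m, k):
    """Learn one codebook of k centroids for each of the m subspaces."""
    return [kmeans([split(v, m)[s] for v in data], k) for s in range(m)]

def pq_encode(v, codebooks):
    """Encode v as m centroid indices, one per subspace."""
    subs = split(v, len(codebooks))
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(subs, codebooks)]

def pq_decode(code, codebooks):
    """Reconstruct an approximation of the vector from its code."""
    return [x for c, cb in zip(code, codebooks) for x in cb[c]]

def pq_asym_dist(query, code, codebooks):
    """Squared distance between an uncompressed query and an encoded vector."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(sub, cb[c]) for sub, cb, c in zip(subs, codebooks, code))
```

With m subspaces and k centroids per subspace, a code takes m * log2(k) bits
while the implied codebook over the full space has k^m cells, which is what
makes such codes attractive for large collections.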
We then revisit recent advances in approximate k-means variants, and present a
new one (IQ-means) that borrows their best ingredients. Combined with powerful
deep learned representations, this method clusters a collection of 100 million
images on a single machine in less than one hour, while dynamically
determining the number of clusters.
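
To illustrate what "approximate k-means" means in general, the sketch below
replaces the exhaustive assignment step of Lloyd's algorithm with a restricted
search: each point is compared only against its current centroid and that
centroid's nearest neighboring centroids. This is a generic stand-in chosen for
brevity. It is not IQ-means and not any specific variant revisited in the talk,
and it keeps the number of clusters fixed. The function name approx_kmeans and
the parameter n_cand are assumptions made for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def approx_kmeans(points, k, n_cand=3, iters=10, seed=0):
    """k-means whose assignment step searches only n_cand candidate centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    # The first assignment is exhaustive; later ones are restricted.
    assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
              for p in points]
    for _ in range(iters):
        # Update step: each non-empty cluster moves to the mean of its points.
        groups = [[] for _ in range(k)]
        for p, a in zip(points, assign):
            groups[a].append(p)
        for j, g in enumerate(groups):
            if g:
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
        # Candidate list of centroid j: j itself plus its nearest other centroids.
        cands = []
        for j in range(k):
            others = sorted((c for c in range(k) if c != j),
                            key=lambda c: sq_dist(centroids[j], centroids[c]))
            cands.append([j] + others[:n_cand - 1])
        # Approximate assignment: search only the candidates of the current cluster.
        assign = [min(cands[a], key=lambda c: sq_dist(p, centroids[c]))
                  for p, a in zip(points, assign)]
    return centroids, assign
```

Each point is compared against n_cand centroids per iteration instead of k, at
the price of possibly missing its true nearest centroid. Because a point's
current centroid is always among its candidates, an assignment step cannot
increase the total squared error.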
Code for IQ-means is available online at https://github.com/iavr/iqm