Ranking on Manifolds for Image Retrieval

Ahmet Iscen (CMP Prague, Czech Republic)


Despite the success of deep learning on representing images for particular object retrieval, recent studies show that the learned representations still lie on manifolds in a high dimensional space. This makes the Euclidean nearest neighbor search biased for this task. Exploring the manifolds online remains expensive even if a nearest neighbor graph has been computed offline.

This work focuses on diffusion, a mechanism that captures the image manifold in the feature space. We first perform diffusion through a sparse linear system solver, yielding practical query times well below one second. Then, we introduce an explicit embedding reducing manifold search to Euclidean search followed by dot product similarity search. This is equivalent to linear graph filtering of a sparse signal in the frequency domain. To speed up the online search, an approximate Fourier basis of the graph is computed offline in a scalable way.

Experimentally, we observe a significant boost in performance of image retrieval with compact CNN descriptors on standard benchmarks, especially when the query object covers only a small part of the image. Small objects have been a common failure case of CNN-based retrieval.