Large Scale Image Clustering and Object Discovery

Jiri Matas
(Center for Machine Perception, CTU Prague, Czech Republic)

Abstract:

This lecture gives an overview of the latest advances in large scale image retrieval.

An efficient min-Hash based algorithm for the discovery of dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features co-occurring with high probability and are called co-ocsets. The discovered dependencies are used to suppress the over-counting effect in bag-of-features retrieval.
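As a rough illustration of how min-Hash can surface co-occurring features, the sketch below treats each feature as the set of image ids in which it appears and groups features whose min-Hash sketches collide. This is a generic min-Hash example under stated assumptions, not the algorithm presented in the lecture: the function names, the linear hash family, and the parameters `num_tables` and `sketch_size` are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # modulus of the (assumed) linear hash family


def minhash_signature(items, hash_params, prime=PRIME):
    """Return one min-Hash value per (a, b) pair for a non-empty set of ints."""
    return tuple(min((a * x + b) % prime for x in items) for a, b in hash_params)


def candidate_cooccurring_groups(feature_to_docs, num_tables=20, sketch_size=2, seed=0):
    """Group features whose document sets collide in at least one hash table.

    feature_to_docs maps a feature id to the set of integer document ids
    containing it. Features with similar document sets collide with high
    probability, so each returned group is only a *candidate* co-occurrence set.
    """
    rng = random.Random(seed)
    groups = set()
    for _ in range(num_tables):
        # Draw a fresh sketch (tuple of independent hash functions) per table.
        params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
                  for _ in range(sketch_size)]
        buckets = defaultdict(list)
        for feature, docs in feature_to_docs.items():
            if docs:  # features that never occur have no signature
                buckets[minhash_signature(docs, params)].append(feature)
        for members in buckets.values():
            if len(members) > 1:
                groups.add(frozenset(members))
    return groups


if __name__ == "__main__":
    occurrences = {
        "word_a": {1, 2, 3},
        "word_b": {1, 2, 3},     # always appears together with word_a
        "word_c": {10, 11, 12},  # never appears with the others
    }
    print(candidate_cooccurring_groups(occurrences))
```

Two features collide in a single hash function with probability equal to the Jaccard similarity of their document sets, so a larger `sketch_size` makes collisions more selective while a larger `num_tables` recovers more of the truly similar pairs.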

We present a novel similarity measure for bag-of-words type large scale image retrieval. The similarity function is learned in an unsupervised manner, requires no extra space over the standard bag-of-words method and is more discriminative than both L2-based soft assignment and Hamming embedding.

We propose three modifications to automatic query expansion: (i) a method that prevents query expansion failure caused by the presence of confusers is introduced, (ii) the spatial verification and re-ranking step is improved by incrementally building a statistical model of the query object, and (iii) relevant spatial context is shown to improve retrieval performance.

Related Publications

Perdoch, M., Chum, O., and Matas, J.: Efficient Representation of Local Geometry for Large Scale Object Retrieval, CVPR 2009

Chum, O. and Matas, J.: Unsupervised Discovery of Co-occurrence in Sparse High Dimensional Data, CVPR 2010

Mikulik, A., Perdoch, M., Chum, O., and Matas, J.: Learning a Fine Vocabulary, ECCV 2010