Large Scale Image Clustering and Object Discovery

Jiri Matas
(Center for Machine Perception, CTU Prague, Czech Republic)

Abstract:

This lecture gives an overview of the latest advances in large-scale image retrieval.

An efficient min-Hash-based algorithm for discovering dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features that co-occur with high probability, called co-ocsets. The discovered dependencies are used to suppress the over-counting effect in bag-of-features retrieval.
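
As a rough illustration of how min-Hash can surface co-occurring features, the sketch below represents each feature by the set of documents it appears in and groups features whose min-Hash sketches collide. This is a minimal sketch under stated assumptions, not the algorithm presented in the lecture: the function name `cooccurring_candidates`, the parameters `num_tables` and `sketch_size`, and the use of explicit random permutations are all hypothetical choices made for readability.

```python
import random
from collections import defaultdict


def cooccurring_candidates(docs, num_tables=20, sketch_size=2, seed=0):
    """Return candidate sets of features that tend to occur in the same documents.

    docs: list of sets of feature ids (one set per image/document).
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Inverted view: for each feature, the set of documents containing it.
    occurrences = defaultdict(set)
    for doc_id, features in enumerate(docs):
        for feature in features:
            occurrences[feature].add(doc_id)

    n_docs = len(docs)
    candidates = set()
    for _ in range(num_tables):
        # One random permutation of document ids per min-Hash in the sketch.
        perms = []
        for _ in range(sketch_size):
            perm = list(range(n_docs))
            rng.shuffle(perm)
            perms.append(perm)

        # Features whose document sets overlap strongly are likely to share
        # the same minimum under every permutation, and so land in one bucket.
        buckets = defaultdict(list)
        for feature, doc_ids in occurrences.items():
            key = tuple(min(perm[d] for d in doc_ids) for perm in perms)
            buckets[key].append(feature)

        for group in buckets.values():
            if len(group) > 1:
                candidates.add(frozenset(group))
    return candidates


docs = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {6, 7}, {3, 6}]
candidates = cooccurring_candidates(docs)
```

Features 1 and 2 occur in exactly the same documents here, so they collide in every table. In practice the candidate groups would still need to be verified against the actual co-occurrence statistics.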

We present a novel similarity measure for large-scale image retrieval of the bag-of-words type. The similarity function is learned in an unsupervised manner, requires no extra space compared with the standard bag-of-words method, and is more discriminative than both L2-based soft assignment and Hamming embedding.

We propose three modifications to automatic query expansion: (i) we introduce a method that prevents query expansion failure caused by the presence of confusers, (ii) we improve the spatial verification and re-ranking step by incrementally building a statistical model of the query object, and (iii) we show that relevant spatial context improves retrieval performance.
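
For readers unfamiliar with query expansion, the following sketch shows the plain baseline that such modifications build on: the query is averaged with its top-ranked results and issued again. It does not implement any of the three modifications above. The helper names (`cosine`, `rank`, `expand_query`), the sparse-dictionary representation, and the `min_score` threshold, which stands in for a real spatial verification step, are assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors (dicts)."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query, database):
    """Return image names sorted by decreasing similarity to the query."""
    return sorted(database, key=lambda name: cosine(query, database[name]),
                  reverse=True)


def expand_query(query, database, top_k=1, min_score=0.5):
    """Average the query with its top results that pass a score threshold.

    min_score is a crude placeholder for spatial verification (assumption).
    """
    top = [name for name in rank(query, database)[:top_k]
           if cosine(query, database[name]) >= min_score]
    vectors = [query] + [database[name] for name in top]
    expanded = {}
    for vec in vectors:
        for word, weight in vec.items():
            expanded[word] = expanded.get(word, 0.0) + weight / len(vectors)
    return expanded


database = {
    "img1": {"a": 1.0, "b": 1.0, "c": 1.0},
    "img2": {"c": 1.0, "d": 1.0},
    "img3": {"x": 1.0},
}
query = {"a": 1.0, "b": 1.0}
expanded = expand_query(query, database)
reranked = rank(expanded, database)
```

In this toy database the original query shares no visual words with `img2`, but the expanded query picks up word `c` from `img1` and can therefore retrieve it. The same mechanism explains why an unverified or confusing top result can derail the expanded query.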

Related Publications