Large Scale Image Clustering and Object Discovery
Ondrej Chum
(Center for Machine Perception, CTU Prague, Czech Republic)
Abstract:
We propose a randomized data mining method that finds clusters of
spatially overlapping images. Our approach avoids brute-force
computation of similarity between all image pairs. The core of the
method relies on a randomized algorithm for fast proposal of image
clusters, the so-called cluster seeds. Efficient methods for cluster
seed discovery are discussed: min-Hash, weighted min-Hash and
geometric min-Hash.
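As a rough illustration of seed proposal with plain min-Hash (not the weighted or geometric variants), the sketch below treats each image as a non-empty set of integer visual-word IDs, groups a few min-Hashes into a "sketch", and proposes every pair of images sharing a sketch as a candidate seed. The function names, the affine hash construction, and the parameter values are assumptions made for this example and are not taken from the talk.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1; word IDs are assumed to be smaller


def make_hash_funcs(num_funcs, seed=0):
    """Return random affine hash functions over visual-word IDs (illustrative choice)."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(num_funcs)]
    return [lambda w, a=a, b=b: (a * w + b) % PRIME for a, b in params]


def sketches(words, hash_funcs, sketch_size):
    """Compute one min-Hash per hash function and group them into fixed-size tuples."""
    mins = [min(h(w) for w in words) for h in hash_funcs]
    return [tuple(mins[i:i + sketch_size]) for i in range(0, len(mins), sketch_size)]


def propose_seeds(images, num_sketches=64, sketch_size=2, seed=0):
    """Return pairs of image IDs that agree on at least one sketch.

    `images` maps an image ID to a non-empty set of integer visual-word IDs.
    """
    hash_funcs = make_hash_funcs(num_sketches * sketch_size, seed)
    buckets = defaultdict(set)
    for image_id, words in images.items():
        for k, sk in enumerate(sketches(words, hash_funcs, sketch_size)):
            # Key on the sketch index too, so only the same sketch slot can collide.
            buckets[(k, sk)].add(image_id)
    seeds = set()
    for ids in buckets.values():
        seeds.update(combinations(sorted(ids), 2))
    return seeds


# Toy data: "a" and "b" share all visual words, "c" shares none.
toy_images = {"a": {1, 2, 3}, "b": {1, 2, 3}, "c": {100, 200, 300}}
toy_seeds = propose_seeds(toy_images)
```

With an ideal random hash function, a single min-Hash of two sets agrees with probability equal to their set overlap (intersection over union), so a sketch of `sketch_size` min-Hashes agrees with that probability raised to the power `sketch_size`. Only images that fall into a common bucket are ever compared, which is how this kind of scheme avoids evaluating all image pairs.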
The seeds are then used as visual queries to obtain clusters which
are formed as transitive closures of sets of partially overlapping
images that include the seed. We show that the probability of
finding a seed for an image cluster rapidly increases with the
size of the cluster.
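The two ideas in this paragraph can be sketched as follows. `grow_cluster` expands a seed into the transitive closure of overlapping images, where `query` is a hypothetical stand-in for an image retrieval engine that returns the images overlapping a given image. `seed_probability` is a deliberately simplified model, assuming that each of the n(n-1)/2 image pairs in a cluster of n images is proposed as a seed independently with the same probability p; it is meant only to show why the chance of finding at least one seed grows quickly with cluster size, and it is not the analysis presented in the talk.

```python
from collections import deque


def grow_cluster(seed_images, query):
    """Collect every image reachable from the seed through chains of overlapping images.

    `query(image_id)` is a hypothetical callable returning the IDs of images
    that overlap `image_id`.
    """
    cluster = set(seed_images)
    frontier = deque(seed_images)
    while frontier:
        image_id = frontier.popleft()
        for match in query(image_id):
            if match not in cluster:
                cluster.add(match)
                frontier.append(match)  # newly found images are queried in turn
    return cluster


def seed_probability(n, p):
    """Chance that at least one of the n*(n-1)/2 pairs yields a seed.

    Assumes independent pairs with equal per-pair probability p (a simplification).
    """
    num_pairs = n * (n - 1) // 2
    return 1.0 - (1.0 - p) ** num_pairs


# Toy overlap graph: a-b-c form one chain, d-e another.
toy_overlaps = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
toy_cluster = grow_cluster(("a", "b"), lambda image_id: toy_overlaps.get(image_id, []))
```

In the toy graph, the seed ("a", "b") grows into a cluster that also contains "c", even though "a" and "c" do not overlap directly, while "d" and "e" remain outside. Under the simplified model, the number of pairs grows quadratically with n, so even a small per-pair probability gives a large chance of hitting at least one seed in a big cluster.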
Related Papers:
-
Chum, O. and Matas, J.:
Large Scale Discovery of Spatially Related Images, PAMI,
February 2010
-
Chum, O., Philbin, J., Isard, M. and Zisserman, A.:
Scalable Near Identical Image and Shot Detection, CIVR
2007
-
Chum, O., Philbin, J. and Zisserman, A.:
Near Duplicate Image Detection: min-Hash and tf-idf Weighting,
BMVC 2008
-
Chum, O., Perdoch, M. and Matas, J.:
Geometric min-Hashing: Finding a (Thick) Needle in a
Haystack, CVPR 2009