Randomized Clustering Forests for Image Classification

Frederic Jurie (INRIA Grenoble, France)

This talk focuses on three recent contributions to the problems of image classification and image search. Some of the most effective recent methods for content-based image classification work by extracting image descriptors, quantizing them according to a coding rule such as k-means vector quantization, accumulating histograms of the resulting visual word codes over the image, and classifying these histograms with a conventional classifier such as an SVM. Good results require large numbers of descriptors and large codebooks, which makes k-means slow. First, a new paradigm for representing images will be presented. We will introduce Extremely Randomized Clustering Forests (ERC-Forests), ensembles of randomly created clustering trees, and show that they provide more accurate results, much faster training and testing, and good resistance to background clutter. Second, an efficient image classification method will be described. It combines ERC-Forests and saliency maps very closely with the extraction of image information. It will be shown that this method, which needs to extract 20 times less information from images than the standard bag-of-words algorithm, also provides more accurate results on several state-of-the-art image classification tasks. Finally, we will show that the proposed ERC-Forests can also be used very successfully for learning a distance between images. The resulting similarity measure has been evaluated on four very different datasets and outperforms competing state-of-the-art approaches on all of them.
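To make the quantize-and-histogram idea concrete, here is a minimal Python sketch of a tree-based visual codebook. It is an illustration under simplifying assumptions, not the method presented in the talk: the trees here are grown with fully random, unsupervised splits, the descriptors are random toy vectors, and the names `build_tree`, `leaf_id` and `forest_histogram` are hypothetical. Each leaf plays the role of one visual word, and an image is represented by the concatenated leaf-count histograms of all trees.

```python
import random


def build_tree(descriptors, rng, max_depth=4):
    """Grow one random clustering tree; return (root, number_of_leaves).

    Assumption for this sketch: each split uses a random dimension and a
    random threshold, with no class labels involved.
    """
    counter = [0]  # running count of leaves, used to assign leaf ids

    def make_leaf():
        node = {"leaf": counter[0]}
        counter[0] += 1
        return node

    def grow(points, depth):
        if depth >= max_depth or len(points) < 2:
            return make_leaf()
        dim = rng.randrange(len(points[0]))
        lo = min(p[dim] for p in points)
        hi = max(p[dim] for p in points)
        thr = rng.uniform(lo, hi)
        left = [p for p in points if p[dim] < thr]
        right = [p for p in points if p[dim] >= thr]
        if not left or not right:  # degenerate split: stop here
            return make_leaf()
        return {"dim": dim, "thr": thr,
                "left": grow(left, depth + 1),
                "right": grow(right, depth + 1)}

    root = grow(descriptors, 0)
    return root, counter[0]


def leaf_id(node, descriptor):
    """Drop a descriptor down the tree and return the id of the leaf it reaches."""
    while "leaf" not in node:
        branch = "left" if descriptor[node["dim"]] < node["thr"] else "right"
        node = node[branch]
    return node["leaf"]


def forest_histogram(forest, image_descriptors):
    """Concatenate the per-tree histograms of leaf (visual word) counts."""
    hist = []
    for root, n_leaves in forest:
        counts = [0] * n_leaves
        for d in image_descriptors:
            counts[leaf_id(root, d)] += 1
        hist.extend(counts)
    return hist


# Toy data: 8-dimensional random vectors stand in for real image descriptors.
rng = random.Random(0)
train = [[rng.random() for _ in range(8)] for _ in range(200)]
forest = [build_tree(train, rng) for _ in range(5)]

image = [[rng.random() for _ in range(8)] for _ in range(50)]
hist = forest_histogram(forest, image)
```

Quantizing a descriptor costs only one threshold comparison per tree level, whereas k-means assignment compares the descriptor against every codebook center. The resulting `hist` vector would then be passed to a conventional classifier such as an SVM.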