Efficient Sequential Correspondence Selection by Cosegmentation

Jan Cech (Center for Machine Perception, Prague, Czech Republic)

Many successful image retrieval, object recognition and wide baseline stereo methods exploit correspondences of local transformation-covariant regions. Most real-world visual recognition problems are large scale: correspondences are sought between regions from a query (test) image and many database (training) images (multiple views of many objects or scenes). To achieve acceptable response times, large problems require that the time complexity of the region matching process be sublinear in the size of the database; the memory footprint of the database representation becomes a concern too. The standard solution is to describe regions with a compact descriptor such as SIFT, or some discretization of it (e.g. "visual words"), and to store database image representations in a search tree.

A better estimate of correspondence quality (a prediction of it being correct) can be obtained by looking at both the test and the training image simultaneously, e.g. by attempting to expand the correspondence domains or to improve the precision of registration. Most approaches to simultaneous cosegmentation and registration focus on the problem of finding the largest corresponding domain and codomain.

Our objective is almost the opposite: given acceptable false positive and false negative rates, design the fastest possible test for correctness of a correspondence, based on cosegmentation of regions of growing size. We formulate the problem as sequential decision making and solve it with Wald's sequential probability ratio test. The test is based on simple statistics of a modified dense stereo matching algorithm, which are projected onto a single prominent discriminative direction by a linear SVM.
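To make the decision rule concrete, the following is a minimal sketch of Wald's sequential probability ratio test applied to a stream of scalar scores, one per growing step. It is an illustration under stated assumptions, not the method's actual implementation: the names `project`, `sprt_thresholds` and `sprt_decide` are hypothetical, the observations are assumed independent, and the class-conditional densities are assumed Gaussian with made-up parameters, whereas in practice they would be learned from training correspondences.

```python
import math


def project(stats, weights, bias):
    """Project a statistics vector onto one direction (linear SVM style score).

    `weights` and `bias` are hypothetical parameters standing in for a trained model.
    """
    return sum(w * s for w, s in zip(weights, stats)) + bias


def sprt_thresholds(alpha, beta):
    """Wald's approximate log thresholds.

    alpha: acceptable false positive rate, beta: acceptable false negative rate.
    """
    upper = math.log((1.0 - beta) / alpha)   # accept "correct" at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept "incorrect" at or below this
    return lower, upper


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution (assumed model, for illustration only)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def sprt_decide(scores, alpha=0.05, beta=0.05,
                correct=(1.0, 1.0), incorrect=(-1.0, 1.0)):
    """Accumulate the log-likelihood ratio and stop as soon as a threshold is crossed.

    `correct` and `incorrect` are (mean, std) pairs of the assumed score densities.
    Returns (decision, number_of_scores_used).
    """
    lower, upper = sprt_thresholds(alpha, beta)
    llr = 0.0
    steps = 0
    for steps, x in enumerate(scores, start=1):
        llr += gaussian_log_pdf(x, *correct) - gaussian_log_pdf(x, *incorrect)
        if llr >= upper:
            return "correct", steps
        if llr <= lower:
            return "incorrect", steps
    # Ran out of observations before either threshold was reached.
    return "undecided", steps


if __name__ == "__main__":
    # Hypothetical per-step statistics of one tentative correspondence.
    growing_steps = [[0.9, 0.2], [1.1, 0.1], [1.0, 0.0]]
    scores = [project(s, weights=[1.0, -0.5], bias=0.0) for s in growing_steps]
    print(sprt_decide(scores))
```

The point of the sequential form is that an easy correspondence is accepted or rejected after very few growing steps, and only ambiguous ones consume more computation; the two error rates enter solely through the thresholds.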

On challenging matching problems, we show that the selection of correspondences based on sequential cosegmentation is very efficient, runs in near real time, and significantly outperforms the standard correspondence process based on SIFT distance ratios, producing both a higher number and a higher percentage of correct correspondences.