3D motion estimation through filtering and without correspondence

Kostas Daniilidis (U. of Pennsylvania, US)

Recovering the pose of a camera using only visual input has been a well-studied problem in computer vision, robotics, and photogrammetry. The underlying assumption has been the existence of correspondences between image features or the identification of environmental landmarks. Such approaches, whether in structure from motion, pose estimation, or simultaneous localization and mapping, have been vulnerable to outliers and to the presence of independent motions. We introduce a new framework in which a motion hypothesis can be computed without correspondences using the Radon transform. The Radon transform turns out to be a filter, or more precisely a convolution on groups. The domain of integration is the Cartesian product of the two images, so a direct convolution would have a prohibitive computational cost. Using the spherical Fourier transform, we have been able to compute the strength of a motion hypothesis in frequency space. Several useful byproducts for image registration result from this framework. The talk will conclude with applications in both localization and 3D modelling of environments.
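
The frequency-space idea can be illustrated on a much simpler group than the one used in the talk. The sketch below is not the Radon-transform method itself: it assumes two 1D signals sampled on a circle (a discretized planar rotation group) and scores every rotation hypothesis by correlation, first directly and then through the ordinary FFT, which plays the role that the spherical Fourier transform plays for 3D rotations. The function names `correlate_direct` and `correlate_fft` are hypothetical and chosen only for this example.

```python
import numpy as np


def correlate_direct(f, g):
    """Score each cyclic shift s as sum_i f[i] * g[(i + s) % n]; costs O(n^2)."""
    n = len(f)
    return np.array([np.dot(f, np.roll(g, -s)) for s in range(n)])


def correlate_fft(f, g):
    """Same scores via the convolution theorem; costs O(n log n)."""
    F = np.fft.fft(f)
    G = np.fft.fft(g)
    # For real f, correlation in the signal domain becomes conj(F) * G
    # in the frequency domain.
    return np.fft.ifft(np.conj(F) * G).real


# Synthetic data (an assumption of this sketch): g is f rotated by 17 samples.
rng = np.random.default_rng(0)
f = rng.standard_normal(64)
true_shift = 17
g = np.roll(f, true_shift)

scores = correlate_fft(f, g)
best_shift = int(np.argmax(scores))  # strongest motion hypothesis
```

No feature is ever matched between `f` and `g`: every hypothesis is scored against all samples at once, and the peak of `scores` selects the motion. The direct version makes the cost of evaluating each hypothesis separately explicit, while the FFT version returns the same scores for all hypotheses in one pass.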