## Detailed description

This document provides additional detailed information and guidelines for your semestral work. Note that you are not limited to this document; you may follow your own approach.

Recommended scenario:

• point correspondences
Extracting and matching corresponding image points across different images is a complicated problem in itself. In this work you should define the point correspondences manually with the mouse.
• geometry
From epipolar geometry to projective reconstruction.
Start with the epipolar geometry, expressed by the fundamental matrix F. The matrix F constrains two corresponding image points x' and x through the equation x'^T F x = 0 (i).

Rearrange the epipolar constraint (i) into a system of linear equations Af = 0, where A is an Nx9 matrix built from the N corresponding pairs and f is the fundamental matrix written as a 9x1 vector. Find f using the SVD of A (the right singular vector belonging to the smallest singular value).
The matrix F has rank 2. Enforce rank 2 via SVD by zeroing the smallest singular value.
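The two steps above can be sketched with NumPy (a minimal, unnormalized eight-point estimate; the function name and the (N, 2) point format are my own choices):

```python
import numpy as np

def fundamental_from_points(x1, x2):
    """Linear eight-point estimate of F from N >= 8 correspondences.

    x1, x2 are (N, 2) arrays of matching points; the result satisfies
    x2^T F x1 = 0 (up to noise and the overall scale of F).
    """
    N = x1.shape[0]
    A = np.zeros((N, 9))
    for i in range(N):
        u, v = x1[i]
        up, vp = x2[i]
        # One row of A f = 0, obtained by expanding x'^T F x = 0.
        A[i] = [up * u, up * v, up, vp * u, vp * v, vp, u, v, 1]
    # f = right singular vector of A for the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```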

Calculate a canonical camera pair from the fundamental matrix.
This can be done by setting the first camera to P1 = [I | 0], where I is the 3x3 identity matrix.
The second camera is then P2 = [-[e']x F | e'], where e' is the epipole in the second image (e'^T F = 0). [v]x denotes the skew-symmetric cross-product matrix of the vector v = (x, y, z):
        [  0  -z   y ]
[v]x =  [  z   0  -x ]
        [ -y   x   0 ]
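In NumPy this construction might look as follows (a sketch; the epipole is taken as the unit left null vector of F, and the sign matches the formula above):

```python
import numpy as np

def skew(v):
    """Cross-product (skew-symmetric) matrix [v]_x for v = (x, y, z)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def canonical_cameras(F):
    """Canonical pair P1 = [I | 0], P2 = [-[e']_x F | e'] from F."""
    # e' is the epipole in the second image: F^T e' = 0, i.e. the
    # singular vector of F^T belonging to the zero singular value.
    _, _, Vt = np.linalg.svd(F.T)
    e2 = Vt[-1]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([-skew(e2) @ F, e2[:, None]])
    return P1, P2
```

This pair is only one representative of the whole projective family: transforming both cameras by any invertible 4x4 homography gives another pair inducing the same F.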

Warning! Normalization required.
As pointed out in R. I. Hartley, "In Defence of the 8-point Algorithm", ICCV 1995, pp. 1064-1070, unnormalized data does not lead to precise results. In order to get a good result from the linear algorithm for estimating F (described above), you have to normalize the data as follows:

• rescale the image points to zero centroid and average distance sqrt(2) from the origin: xnormalized = H1 x and x'normalized = H2 x'
• calculate a least-squares estimate of the fundamental matrix from the normalized coordinates, using SVD and setting the smallest singular value to 0
• remove the normalization: F = H2^T Fn H1, where Fn is the matrix estimated from the normalized coordinates
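A sketch of the normalizing transform (Hartley's convention of zero centroid and average distance sqrt(2) is assumed; the function name is mine):

```python
import numpy as np

def normalization_transform(x):
    """3x3 transform T mapping (N, 2) points to zero centroid and
    average distance sqrt(2) from the origin; returns (T, x_normalized)."""
    c = x.mean(axis=0)
    d = np.linalg.norm(x - c, axis=1).mean()
    s = np.sqrt(2) / d
    T = np.array([[s, 0.0, -s * c[0]],
                  [0.0, s, -s * c[1]],
                  [0.0, 0.0, 1.0]])
    xh = np.hstack([x, np.ones((x.shape[0], 1))])
    return T, (xh @ T.T)[:, :2]
```

With T1, T2 computed for the two images, estimate Fn from the normalized points and undo the normalization as F = T2^T Fn T1.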

For detailed information see Hartley, Zisserman: Multiple View Geometry in Computer Vision (online)
• calibration
Note that once you have at least two cameras, you can calculate the 3D position of every corresponding pair of 2D points. For this purpose, rearrange the equation a xi = Pi X, where a is an unknown scale and the only real unknown is X.
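Eliminating the unknown scale a turns each view into two linear equations in X; stacking them gives the standard DLT triangulation, sketched here for two views (names are illustrative):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """DLT triangulation of one 3D point from two views.

    From a x = P X, each image point (u, v) contributes the equations
    u * (P[2] @ X) - P[0] @ X = 0 and v * (P[2] @ X) - P[1] @ X = 0.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # X is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]          # homogeneous point scaled to (X, Y, Z, 1)
```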

A simple observation: a xij = Pi Xj = Pi H H^-1 Xj = (Pi H)(H^-1 Xj).
It is now easy to see that there are infinitely many projective reconstructions, one for every invertible 4x4 homography H.
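This is easy to verify numerically: replacing the camera P by P H and the point X by H^-1 X reproduces exactly the same image point. A small NumPy check:

```python
import numpy as np

rng = np.random.default_rng(3)
P = np.hstack([rng.normal(size=(3, 3)), rng.normal(size=(3, 1))])  # a camera
X = np.array([0.5, 1.0, 4.0, 1.0])       # a homogeneous 3D point
H = rng.normal(size=(4, 4))              # a generic invertible 4x4 homography

x = P @ X                                # original projection
x2 = (P @ H) @ (np.linalg.inv(H) @ X)    # transformed camera and point
# Identical image point, up to the usual homogeneous scale:
assert np.allclose(x / x[2], x2 / x2[2])
```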

Calibration means transforming the reconstructed projective space into the correct metric or Euclidean space. This is done by requiring that the camera has realistic physical properties: for example, the focal length may vary, but the CCD chip is not skewed and the principal point is constant.
Again, there are many possible ways to do this. It is possible to compute the calibration directly from the images (3 or more, depending on the camera model).

For your purpose it is sufficient to calibrate the cameras from partial knowledge of the observed space.

Two possibilities:
• Calibration from knowledge of part of your scene.
If you are able to measure distances or angles between points, you can calculate H directly. Let Xi and Xj be two points whose real-world distance d we have measured. This gives the constraint ||H(Xi - Xj)|| = d. Similarly, if we know that the two lines defined by the points Xi, Xj and Xk, Xl are perpendicular, we have the constraint (H(Xi - Xj), H(Xk - Xl)) = 0, where (,) denotes the scalar product.
Calculate H and use it to align the reconstructed space with the real world.
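In practice such constraints are solved numerically: collect each known distance and each perpendicularity as a residual of the unknown H and hand them to a nonlinear least-squares solver (e.g. scipy.optimize.least_squares over a 15-parameter encoding of H). The sketch below only builds the residual vector, and it dehomogenizes after applying H, a step the compact notation above glosses over; all names are illustrative:

```python
import numpy as np

def metric_upgrade_residuals(H, Xp, dist_pairs, perp_quads):
    """Residuals of metric constraints for a candidate 4x4 homography H.

    Xp         : (N, 4) homogeneous points of the projective reconstruction.
    dist_pairs : (i, j, d) triples -- measured distance d between points i, j.
    perp_quads : (i, j, k, l) -- line (Xi, Xj) is perpendicular to (Xk, Xl).
    """
    Y = Xp @ H.T
    Y = Y[:, :3] / Y[:, 3:]            # dehomogenize the upgraded points
    res = []
    for i, j, d in dist_pairs:
        res.append(np.linalg.norm(Y[i] - Y[j]) - d)
    for i, j, k, l in perp_quads:
        res.append(np.dot(Y[i] - Y[j], Y[k] - Y[l]))
    return np.array(res)
```

At the true H all residuals vanish; a solver drives them toward zero from some initial guess.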
• Calibration by an independent object.
A camera P can be decomposed into a 3x3 calibration matrix K, a 3x3 rotation matrix R and a translation vector t such that P = K[R | -Rt], where K is upper triangular. The basic idea: calculate the calibration matrix K from pictures of a known object; then, without changing the internal camera parameters (zoom, ...), move the camera (R, t) to the desired place and take pictures of the unknown object.
Such a calibration can be divided into the following steps:
• take pictures of known object,
• create projective reconstruction: cameras Pi and 3D points Xi,
• calculate the 4x4 homography H such that a Xi-known = H Xi (a is an unknown scale). For more details see Marc Pollefeys' online tutorial,
• transform cameras with H-1 and 3D structure with H,
• decompose cameras using RQ decomposition to obtain calibration matrix K,

• calculate the homography H which changes the calibration matrix of the other reconstruction to the desired form K (obtained from the pictures of the known object).
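The decomposition step (obtaining K from a camera matrix) can be sketched in plain NumPy; np.linalg has no rq, so the usual flip trick around np.linalg.qr is used, and signs are fixed so that K has a positive diagonal. The function names are mine:

```python
import numpy as np

def rq(M):
    """RQ decomposition of a 3x3 matrix via QR and row/column flips."""
    Q, R = np.linalg.qr(np.flipud(M).T)
    R = np.flipud(R.T)[:, ::-1]        # upper triangular factor
    Q = Q.T[::-1, :]                   # orthogonal factor, M = R @ Q
    return R, Q

def decompose_camera(P):
    """Split P = K [R | -R t] into K (upper triangular, positive
    diagonal), rotation R and camera centre t."""
    K, R = rq(P[:, :3])
    # Fix signs so that K has a positive diagonal (D @ D = I).
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R
    # Camera centre from K R t = -P[:, 3].
    t = -np.linalg.solve(K @ R, P[:, 3])
    K = K / K[2, 2]                    # fix the usual overall scale
    return K, R, t
```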

Note: if the camera calibration matrix K were known, you could "undistort" the measured 2D points, x_n = K^-1 x. The epipolar geometry of calibrated (undistorted) data is described by the essential matrix E, with x_n'^T E x_n = 0. Unlike F, the matrix E can be decomposed directly into a rotation matrix and a translation vector using SVD; see Hartley, Zisserman: Multiple View Geometry in Computer Vision (online).
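A sketch of that SVD-based decomposition; it returns all four (R, t) candidates, and the cheirality check (triangulated points must lie in front of both cameras) that selects the correct one is omitted:

```python
import numpy as np

def decompose_essential(E):
    """Decompose E into its four (R, t) candidates via SVD.

    With E = U diag(1, 1, 0) V^T and W = [[0,-1,0],[1,0,0],[0,0,1]],
    the candidate rotations are U W V^T and U W^T V^T, and the
    translation is +-u3 (the last column of U), known only up to scale.
    """
    U, _, Vt = np.linalg.svd(E)
    # Make sure the candidates are proper rotations (det = +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```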
• result - measure the distance between two specified points.