This document provides additional detailed information and guidelines for your semestral work.
Note that you are not limited to this document and you can follow your own steps.

Recommended scenario:

**point correspondences**

Extracting and matching corresponding image points across different images is a complicated problem in itself. In your work you should define the point correspondences manually with the mouse.

**geometry**

**From epipolar geometry to projective reconstruction**

Start with epipolar geometry, which is expressed by the fundamental matrix *F*. *F* constrains two corresponding image points *x'* and *x* through the equation *x'^T F x = 0* (i).

Rearrange the epipolar constraint (i) into a system of linear equations *Af = 0*, where *A* is an *Nx9* matrix built from the *N* corresponding pairs and *f* is the fundamental matrix written as a *9x1* vector. Find *f* using the SVD of *A*: it is the right singular vector corresponding to the smallest singular value.
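This linear step can be sketched in Python with NumPy as follows; the function name and the row-wise stacking of *f* are illustrative choices, not prescribed by the assignment:

```python
import numpy as np

def linear_fundamental(x1, x2):
    """Estimate F from N >= 8 correspondences satisfying x'^T F x = 0.

    x1, x2: arrays of shape (N, 2) with point coordinates in the first
    image (x) and the second image (x').
    """
    N = x1.shape[0]
    A = np.zeros((N, 9))
    for k in range(N):
        u, v = x1[k]      # point x in the first image
        u2, v2 = x2[k]    # corresponding point x' in the second image
        # one row of A, obtained by expanding x'^T F x = 0
        # with f = (F11, F12, F13, F21, ..., F33) stacked row-wise
        A[k] = [u2*u, u2*v, u2, v2*u, v2*v, v2, u, v, 1]
    # f is the right singular vector for the smallest singular value of A
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)
```

With exact (noise-free) correspondences this recovers *F* up to scale; with real clicked points the normalization described below is essential.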

The matrix *F* has rank 2. Force the estimated matrix to rank 2 by computing its SVD and zeroing the smallest singular value.
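The rank-2 projection is short with NumPy (a sketch; the helper name is illustrative):

```python
import numpy as np

def enforce_rank2(F):
    """Project F onto the nearest rank-2 matrix (Frobenius norm) via SVD."""
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                     # zero the smallest singular value
    return U @ np.diag(S) @ Vt
```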

Calculate a canonical camera pair from the fundamental matrix.

This can be done by setting the first camera to *P1 = [I | 0]*, where *I* is the 3x3 identity matrix.

The second camera is then *P2 = [[e']_x F | e']*, where *e'* is the epipole in the second image (*e'^T F = 0*). *[v]_x* denotes the skew-symmetric cross-product matrix of a vector *v = (x, y, z)*:

    [v]_x = [  0  -z   y ]
            [  z   0  -x ]
            [ -y   x   0 ]
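A possible NumPy sketch of the skew-symmetric matrix and the canonical camera pair (helper names are illustrative):

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[ 0, -z,  y],
                     [ z,  0, -x],
                     [-y,  x,  0]])

def canonical_cameras(F):
    """Canonical pair P1 = [I | 0], P2 = [[e']_x F | e'] from a rank-2 F."""
    # e' is the left null vector of F (e'^T F = 0): the left singular
    # vector of F for the smallest (zero) singular value
    U, _, _ = np.linalg.svd(F)
    e2 = U[:, -1]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(e2) @ F, e2.reshape(3, 1)])
    return P1, P2
```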

Warning! Normalization is required.

As pointed out in *R. I. Hartley, "In Defence of the 8-point Algorithm", ICCV 1995, pp. 1064-1070*, unnormalized data does not lead to precise results. In order to get good results from the linear algorithm for estimating the *F*-matrix (described above) you have to perform data normalization as follows:

- Rescale the image points so that they have zero mean and an average distance of √2 from the origin: *x_normalized = H1 x* and *x'_normalized = H2 x'*.
- Calculate the least-squares fit of the fundamental matrix from the normalized coordinates using SVD, and set the smallest singular value to 0.
- Remove the normalization: *F = H2^T F̂ H1*, where *F̂* is the matrix estimated from the normalized points.

*For detailed information see Hartley, Zisserman: Multiple View Geometry in Computer Vision (online).*
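The normalization matrix for one image can be sketched as follows, assuming the usual convention (translate to the centroid, then scale so the mean distance from the origin is √2); the function name is illustrative:

```python
import numpy as np

def normalization_matrix(pts):
    """3x3 similarity H: points get zero mean and mean distance sqrt(2).

    pts: array of shape (N, 2) with image coordinates.
    """
    c = pts.mean(axis=0)                            # centroid
    d = np.linalg.norm(pts - c, axis=1).mean()      # mean distance to centroid
    s = np.sqrt(2) / d                              # isotropic scale factor
    return np.array([[s, 0, -s * c[0]],
                     [0, s, -s * c[1]],
                     [0, 0, 1]])
```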
**calibration**

Note that once you have at least two cameras you can calculate the 3D position of every corresponding pair of 2D points. For this purpose rearrange the equation *x_i = P_i X*, where the only unknown is *X*.
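The rearranged system can be solved by linear (DLT) triangulation; a sketch with the usual trick of eliminating the unknown projective scale:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear triangulation: solve x_i = P_i X for the homogeneous X.

    Each image point x = (u, v) contributes two equations obtained by
    eliminating the projective scale: u*(P_row3 X) - (P_row1 X) = 0 and
    v*(P_row3 X) - (P_row2 X) = 0.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # X is the null vector of A (right singular vector, smallest value)
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]
```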

Simple observation: *x_ij = P_i X_j = P_i I X_j = (P_i H)(H^{-1} X_j)*.

It is easy to see now that there are infinitely many projective reconstructions, one for each 4x4 homography *H*.

Calibration means transforming the reconstructed projective space into the correct metric or Euclidean space. This is done by requiring that the camera has real physical properties: for example, the focal length may vary, but the CCD chip is not skewed and the principal point is constant.

Again, there are many possible ways to do this. It is possible to calculate the calibration directly from the images (3 or more, depending on the camera model).

For your purposes it is sufficient to calibrate the cameras from partial knowledge of the observed space.

Two possibilities:

- Calibration from knowledge of part of your scene.

If you are able to measure distances or angles between points you can simply calculate *H*. Let *X_i* and *X_j* be two points for which we have measured the distance *d* in real space. We have the constraint *||H(X_i - X_j)|| = d*. Similarly, if we know that the two lines defined by the points *X_i, X_j, X_k, X_l* are perpendicular, we have the constraint *( H(X_i - X_j) , H(X_k - X_l) ) = 0*, where *( , )* denotes the scalar product.
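These constraints can be written as residual functions and handed to a nonlinear least-squares solver such as scipy.optimize.least_squares. The sketch below dehomogenizes points after applying *H*; all helper names are illustrative:

```python
import numpy as np

def euclid(Xh):
    """Dehomogenize a 4-vector to a Euclidean 3D point."""
    return Xh[:3] / Xh[3]

def distance_residual(H, Xi, Xj, d):
    """||H Xi - H Xj|| - d for a measured real-world distance d."""
    return np.linalg.norm(euclid(H @ Xi) - euclid(H @ Xj)) - d

def angle_residual(H, Xi, Xj, Xk, Xl):
    """Scalar product of two direction vectors known to be perpendicular."""
    a = euclid(H @ Xi) - euclid(H @ Xj)
    b = euclid(H @ Xk) - euclid(H @ Xl)
    return a @ b
```

Both residuals are zero exactly when the corresponding constraint is satisfied, so stacking them over all measured distances and right angles gives the objective for estimating the 16 entries of *H*.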

Calculate *H* and use it to align the reconstructed space with the real world.

- Calibration by an independent object. A camera *P* can be decomposed into a 3x3 calibration matrix *K*, a 3x3 rotation matrix *R*, and a translation vector *t* such that *P = K[R | -Rt]*, where *K* is upper triangular. The basic idea is: calculate the calibration matrix *K* from pictures of a known object; then, without changing the internal camera parameters (zoom, ...), move the camera (*R*, *t*) to the desired place and take pictures of the unknown object. This can be divided into the following steps:
  - take pictures of the known object,
  - create a projective reconstruction: cameras *P_i* and 3D points *X_i*,
  - calculate a 4x4 homography *H* such that *X_i,known = H X_i* (for more details see Marc Pollefeys' online tutorial),
  - transform the cameras with *H^{-1}* and the 3D structure with *H*,
  - decompose the cameras using RQ decomposition to obtain the calibration matrix *K*,
  - calculate the homography *H* which changes the calibration matrix of the other reconstruction to the desired form *K* obtained from the pictures of the known object.
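The RQ-decomposition step can be sketched with scipy.linalg.rq. The sign fix-up that makes the diagonal of *K* positive is a common convention, not mandated here:

```python
import numpy as np
from scipy.linalg import rq

def decompose_camera(P):
    """Decompose P = K [R | -R t] via RQ decomposition of the left 3x3 block."""
    K, R = rq(P[:, :3])                 # P[:, :3] == K @ R
    # flip signs so that K has a positive diagonal (convention)
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R
    # P[:, 3] = -K R t  =>  t = -(K R)^{-1} P[:, 3]
    t = -np.linalg.solve(R, np.linalg.solve(K, P[:, 3]))
    K /= K[2, 2]                        # conventional scale: K[2,2] = 1
    return K, R, t
```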

__Note:__ If the camera calibration matrix were known, you could "undistort" the measured 2D points: *x' = K^{-1} x*. Epipolar geometry for calibrated data is described by the essential matrix *E*, with *x'^T E x = 0*. Unlike *F*, the *E* matrix can be decomposed into a rotation matrix and a translation vector directly using SVD; see *Hartley, Zisserman: Multiple View Geometry in Computer Vision (online)*.
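A sketch of the standard SVD-based decomposition of *E* into its four candidate poses (choosing among them by the points-in-front-of-both-cameras test is left out):

```python
import numpy as np

def decompose_essential(E):
    """Return the four candidate (R, t) pairs for E = [t]_x R.

    t is recovered only up to scale (returned with unit length); the
    correct candidate is the one for which triangulated points lie in
    front of both cameras.
    """
    U, _, Vt = np.linalg.svd(E)
    # force proper rotations
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0, -1, 0],
                  [1,  0, 0],
                  [0,  0, 1]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```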

**result**

Measure the distance between two specified points in the reconstructed scene.