This document provides additional detailed information and guidelines for your semestral work.
Note that you are not limited to this document; you may follow your own approach.
- point correspondences
Extracting and matching corresponding image points across different images is a
complicated problem in itself. In your work you should define the point correspondences
manually with the mouse.
From epipolar geometry to projective reconstruction
Start with epipolar geometry, expressed by the fundamental matrix F.
F constrains two corresponding image points x' and x through the
equation x'^T F x = 0 (i).
Rearrange the epipolar constraint (i) into a system of linear equations Af = 0, where
A is an Nx9 matrix for N corresponding pairs and f is the fundamental matrix
written as a 9x1 vector. Find f using the SVD of A: f is the right singular
vector belonging to the smallest singular value.
A true F matrix has rank 2. Force the estimate to rank 2 using SVD by zeroing its smallest singular value.
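The linear estimate described above can be sketched as follows (a minimal sketch assuming numpy and inhomogeneous point coordinates in Nx2 arrays; the function name is illustrative):

```python
import numpy as np

def fundamental_from_points(x, xp):
    """Linear estimate of F from N >= 8 correspondences.

    x, xp: (N, 2) arrays of matching points in image 1 and image 2.
    Builds the Nx9 design matrix A from x'^T F x = 0 and takes the
    right singular vector of the smallest singular value as f.
    """
    N = x.shape[0]
    A = np.zeros((N, 9))
    for i in range(N):
        u, v = x[i]
        up, vp = xp[i]
        # one row per correspondence, coefficients ordered as vec(F) row-major
        A[i] = [up * u, up * v, up, vp * u, vp * v, vp, u, v, 1]
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # enforce rank 2 by zeroing the smallest singular value of F
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```

For exact synthetic correspondences the estimate satisfies the epipolar constraint to machine precision; for real clicked points, use the normalization described below.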
Calculate a canonical camera pair from the fundamental matrix.
This can be done by setting the first camera to P1 = [I | 0], where I is the 3x3 identity matrix.
The second camera is then P2 = [[e']x F | e'],
where e' is the right epipole (e'^T F = 0). [v]x denotes the skew-symmetric
cross-product matrix of the vector v = (x, y, z):

  [v]x = [  0  -z   y ]
         [  z   0  -x ]
         [ -y   x   0 ]
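The canonical pair can be sketched like this (numpy assumed; the epipole e' is taken as the left singular vector of F belonging to its zero singular value):

```python
import numpy as np

def skew(v):
    """Skew-symmetric cross-product matrix [v]x for a 3-vector v."""
    x, y, z = v
    return np.array([[0, -z, y],
                     [z, 0, -x],
                     [-y, x, 0]])

def canonical_cameras(F):
    """Canonical camera pair (P1, P2) from a rank-2 fundamental matrix F."""
    # right epipole e': left null vector of F, i.e. e'^T F = 0
    U, _, _ = np.linalg.svd(F)
    ep = U[:, 2]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(ep) @ F, ep.reshape(3, 1)])
    return P1, P2
```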
Warning! Normalization required
As pointed out in R. I. Hartley, "In Defence of the 8-point Algorithm", ICCV 1995, pp. 1064-1070,
unnormalized data does not lead to precise results. To get a good result from the linear algorithm
for solving the F matrix (described above), you have to perform data normalization as follows:
- rescale the image points so that they have zero mean and average distance sqrt(2) from the origin: x_normalized = H1 x and x'_normalized = H2 x'
- calculate the least-squares fundamental matrix from the normalized coordinates using SVD, setting the smallest singular value to 0
- remove the normalization: F = H2^T F_normalized H1
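The normalizing transforms can be built with a small helper such as this sketch (numpy assumed; the sqrt(2) target follows Hartley's recommendation):

```python
import numpy as np

def normalize_points(x):
    """Similarity transform for Hartley normalization.

    x: (N, 2) array of image points. Returns (xn, H) where xn are the
    points shifted to zero mean and scaled so their average (RMS)
    distance from the origin is sqrt(2), and H is the corresponding
    3x3 homography acting on homogeneous coordinates.
    """
    mean = x.mean(axis=0)
    rms = np.sqrt(((x - mean) ** 2).sum(axis=1).mean())
    s = np.sqrt(2) / rms
    H = np.array([[s, 0, -s * mean[0]],
                  [0, s, -s * mean[1]],
                  [0, 0, 1]])
    xn = (x - mean) * s
    return xn, H
```

Used with the linear solver: estimate F_normalized from (H1 x, H2 x') and undo the normalization with F = H2.T @ F_normalized @ H1.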
For detailed information see Hartley, Zisserman: Multiple View Geometry in Computer Vision (online)
Note that once you have at least two cameras, you can calculate the
3D position of every corresponding pair of 2D points. For this purpose rearrange the equation x_i = P_i X
into a homogeneous linear system in which the only unknown is X.
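The rearranged system can be solved by the standard DLT: each view contributes two equations obtained from the cross product x × (P X) = 0. A minimal two-view sketch, assuming numpy:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """DLT triangulation of one 3D point from two views.

    P1, P2: 3x4 camera matrices; x1, x2: inhomogeneous 2D points.
    Each view contributes two rows of the homogeneous system A X = 0
    derived from x cross (P X) = 0; X is the right singular vector
    of the smallest singular value.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]  # dehomogenize
```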
Simple observation: for any invertible 4x4 matrix H,
  a x_ij = P_i X_j = (P_i H^-1)(H X_j).
It is easy to see now that there are infinitely many projective reconstructions, each related to the others by a 4x4 homography H.
Calibration means putting the reconstructed projective space into the correct metric or Euclidean space. This
is done by requiring that the camera has real physical properties: for example, the focal length may vary, but
the CCD chip is not skewed and the principal point is constant.
Again, there are many possible ways to do it. It is possible to calculate the calibration directly
from the images (3 or more, depending on the camera model).
For your purposes it is sufficient to calibrate the cameras from partial knowledge of the observed space.
Two possibilities:
- Calibration from knowledge of part of your image.
If you are able to measure distances or angles between points, you can calculate H directly.
Let Xi and Xj be two points for which we have measured the
distance d in real space.
We have the constraint ||H(Xi - Xj)|| = d. Similarly, if we know that
two lines defined by the points Xi, Xj and Xk, Xl are perpendicular,
we have the constraint ( H(Xi - Xj) , H(Xk - Xl) ) = 0,
where ( , ) denotes the scalar product.
Calculate H from such constraints and use it to align the reconstructed space with the real world.
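If you solve for H with a nonlinear least-squares routine, the two constraint types can be written as residual functions such as these sketches (numpy assumed; the points are homogeneous 4-vectors that are dehomogenized after mapping through H, and an optimizer such as scipy.optimize.least_squares would minimize the residuals over the entries of H):

```python
import numpy as np

def apply_H(H, X):
    """Map a homogeneous 4-vector X through H and dehomogenize to 3D."""
    Y = H @ X
    return Y[:3] / Y[3]

def distance_residual(H, Xi, Xj, d):
    """Zero when the transformed points Xi, Xj lie at distance d."""
    return np.linalg.norm(apply_H(H, Xi) - apply_H(H, Xj)) - d

def perpendicularity_residual(H, Xi, Xj, Xk, Xl):
    """Zero when lines (Xi, Xj) and (Xk, Xl) are perpendicular after H."""
    return np.dot(apply_H(H, Xi) - apply_H(H, Xj),
                  apply_H(H, Xk) - apply_H(H, Xl))
```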
- Calibration by an independent object.
A camera P can be decomposed into a 3x3 calibration matrix K, a 3x3 rotation matrix R and a translation vector t
such that P = K[R | -Rt]. K is an upper triangular matrix.
The basic idea is: calculate the calibration matrix K from pictures of a known object. Then, without changing
the internal camera parameters (zoom, ...), move the camera (R, t) to the desired place and take pictures of the unknown object.
Such a calibration can be divided into the following steps:
- take pictures of the known object,
- create a projective reconstruction: cameras P_i and 3D points X_i,
- calculate the 4x4 homography H such that a X_i,known = H X_i
  (for more details see Marc Pollefeys' online tutorial),
- transform the cameras with H^-1 and the 3D structure with H,
- decompose the cameras using RQ decomposition to obtain the calibration matrix K,
- calculate a homography H which changes the calibration matrix of the other reconstruction
  to the desired form K (obtained from the pictures of the known object).
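The RQ decomposition step can be sketched with numpy's QR routine (a sketch: reversing rows and columns turns QR into RQ, and a sign fix keeps K's diagonal positive):

```python
import numpy as np

def decompose_camera(P):
    """Decompose P = K [R | -Rt] into calibration K (upper triangular,
    positive diagonal), rotation R and camera centre t.

    Uses an RQ decomposition built from numpy's QR: reversing rows and
    columns of M = P[:, :3] converts one factorization into the other.
    """
    M = P[:, :3]
    rev = np.flipud(np.eye(3))           # row-reversal permutation
    Q, R_ = np.linalg.qr((rev @ M).T)    # QR of the reversed, transposed M
    K = rev @ R_.T @ rev                 # upper triangular factor
    R = rev @ Q.T                        # orthogonal factor, K R = M
    # fix signs so that K has a positive diagonal
    D = np.diag(np.sign(np.diag(K)))
    K = K @ D
    R = D @ R
    p4 = P[:, 3]
    if np.linalg.det(R) < 0:
        # P is only defined up to scale; flip the overall sign
        R, p4 = -R, -p4
    t = -np.linalg.solve(R, np.linalg.inv(K) @ p4)
    return K / K[2, 2], R, t
```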
Note: if the camera calibration matrix K were known, you could "undistort" the measured 2D points as x = K^-1 x_measured.
Epipolar geometry for calibrated data is described by the essential matrix E, with x'^T E x = 0
(x and x' now in normalized coordinates). Unlike F, the E matrix can be decomposed directly into a rotation matrix and a translation vector using SVD;
see Hartley, Zisserman: Multiple View Geometry in Computer Vision (online).
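The SVD-based decomposition of E can be sketched as follows (numpy assumed; it returns the standard four (R, t) candidates, and the correct one is the pair that reconstructs the points in front of both cameras):

```python
import numpy as np

def decompose_essential(E):
    """Four (R, t) candidates from an essential matrix E.

    t is the translation direction (unit vector, up to sign); the
    twisted-pair ambiguity gives two rotations, so four candidates.
    """
    U, _, Vt = np.linalg.svd(E)
    # force proper rotations (det = +1)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0, -1, 0],
                  [1, 0, 0],
                  [0, 0, 1]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # translation direction, up to sign and scale
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```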
- Result: measure the distance between two specified points in the reconstructed scene.