Note that the similarity measure presented in the previous section is invariant only to translation and to offsets in the intensity values. If significant rotation or scaling takes place, the similarity measure is no longer appropriate. The same is true when the lighting conditions differ too much. The cross-correlation-based approach can therefore only be used between images whose camera poses are not too far apart.
In this section a more advanced matching procedure is presented that can deal with much larger changes in viewpoint and illumination [208]. As should be clear from the previous discussion, it is important that pixels corresponding to the same part of the surface are used for comparison during matching. By assuming that the surface is locally planar and that there is no perspective distortion, local transformations of pixels from one image to the other are described by 2D affine transformations. Such a transformation is determined by three non-collinear point correspondences. At this stage only one point is available, namely the corner under consideration, so two more are needed. The idea is to look for them along edges that pass through the point of interest. It is proposed to use only corners that have two edges connected to them, as in figure 4.4.
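To make the three-point argument concrete, the following minimal sketch recovers a 2D affine transformation from three point correspondences by solving a small linear system. It is an illustration only, not the procedure of [208]: the function names `affine_from_three_points` and `apply_affine` and the sample coordinates are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def affine_from_three_points(src, dst):
    """Return the 2x3 matrix A such that dst_i = A @ [x_i, y_i, 1].

    src, dst: three (x, y) points each (hypothetical inputs, e.g. the
    corner and two points found along the edges through it).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous source coordinates, one row per point: shape (3, 3).
    src_h = np.hstack([src, np.ones((3, 1))])
    # Collinear points make the system singular: no unique affine map.
    if abs(np.linalg.det(src_h)) < 1e-12:
        raise ValueError("the three source points must not be collinear")
    # Solve src_h @ A.T = dst for the 3x2 unknown A.T.
    return np.linalg.solve(src_h, dst).T

def apply_affine(A, pts):
    """Map an (N, 2) array of points through the 2x3 affine matrix A."""
    pts = np.asarray(pts, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]

# Illustrative values only: a corner at the origin and two edge points.
A = affine_from_three_points([(0, 0), (1, 0), (0, 1)],
                             [(2, 3), (4, 3), (2, 6)])
mapped = apply_affine(A, [(1, 1)])
```

With only the corner known, the system above has four remaining degrees of freedom, which is why two further points along the edges are required.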
Now that we have a method at hand for the automatic extraction of local, affinely invariant image regions, these can easily be described in an affinely invariant way using moment invariants [212]. As in the region finding steps, invariance both under affine geometric changes and linear photometric changes, with different offsets and different scale factors for each of the three color bands, is considered.
For each region, a feature vector of moment invariants is composed. These vectors can be compared quite efficiently with the invariant vectors computed for other regions, using a hashing technique. It can be useful to take the region type into account as well (curved or straight edges? extrema of which function?). Once the corresponding regions have been identified, the cross-correlation between them (after normalization to a square reference region) is computed as a final check to reject false matches.
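The two-stage comparison described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the scheme of the cited work: the hashing is assumed to be a simple quantization of each invariant vector into grid cells (the cell size `cell` is a hypothetical parameter), and the final check is assumed to be a zero-mean normalized cross-correlation on patches that have already been resampled to the same square size. All function names are illustrative.

```python
from collections import defaultdict
import numpy as np

def _cell_key(vector, cell):
    # Quantize each component so that nearby vectors share a key.
    return tuple(np.floor(np.asarray(vector, dtype=float) / cell).astype(int))

def build_hash_table(vectors, cell=0.1):
    """Index invariant vectors by their quantized cell (assumed scheme)."""
    table = defaultdict(list)
    for idx, v in enumerate(vectors):
        table[_cell_key(v, cell)].append(idx)
    return table

def candidates(table, query, cell=0.1):
    """Return indices of stored vectors that fall in the query's cell."""
    return table.get(_cell_key(query, cell), [])

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # constant patch: correlation is undefined, treat as no match
    return float(a @ b / denom)

# Illustrative data only: two stored invariant vectors and one query.
table = build_hash_table([[0.52, 1.31], [0.91, 0.12]])
hits = candidates(table, [0.55, 1.38])

# Final check on normalized square patches (synthetic values).
patch = np.arange(16, dtype=float).reshape(4, 4)
score = zncc(patch, 2.0 * patch + 5.0)
```

A single-cell lookup misses vectors that lie just across a cell boundary; a practical variant would also probe neighbouring cells. Candidate pairs whose correlation score falls below a chosen threshold would then be rejected as false matches.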