Figure 4.3 Image pairs with extracted patches below. Notice how some patches can be localized or matched
with higher accuracy than others.
4.1.1 Feature detectors
How can we find image locations where we can reliably find correspondences with other
images, i.e., what are good features to track (Shi and Tomasi 1994; Triggs 2004)? Look again
at the image pair shown in Figure 4.3 and at the three sample patches to see how well they
might be matched or tracked. As you may notice, textureless patches are nearly impossible
to localize. Patches with large contrast changes (gradients) are easier to localize, although
straight line segments at a single orientation suffer from the aperture problem (Horn and
Schunck 1981; Lucas and Kanade 1981; Anandan 1989), i.e., it is only possible to align
the patches along the direction normal to the edge direction (Figure 4.4b). Patches with
gradients in at least two (significantly) different orientations are the easiest to localize, as
shown schematically in Figure 4.4a.
These intuitions can be formalized by looking at the simplest possible matching criterion
for comparing two image patches, i.e., their (weighted) summed square difference,
$$E_{\mathrm{WSSD}}(\mathbf{u}) = \sum_i w(\mathbf{x}_i)\,[I_1(\mathbf{x}_i + \mathbf{u}) - I_0(\mathbf{x}_i)]^2, \qquad (4.1)$$
where $I_0$ and $I_1$ are the two images being compared, $\mathbf{u} = (u, v)$ is the displacement vector,
$w(\mathbf{x})$ is a spatially varying weighting (or window) function, and the summation over $i$ ranges over all
the pixels in the patch. Note that this is the same formulation we later use to estimate motion
between complete images (Section 8.1).
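The weighted SSD criterion (4.1) can be sketched directly in NumPy. The patch half-width, Gaussian window width, and function name below are illustrative choices, not part of the text:

```python
import numpy as np

def wssd(I0, I1, x, y, u, v, half=3, sigma=2.0):
    """Weighted summed squared difference (Eq. 4.1) between a patch of I0
    centered at (x, y) and the patch of I1 displaced by (u, v).

    half  -- patch half-width (patch is (2*half+1) x (2*half+1))
    sigma -- width of the Gaussian weighting function w(x_i)
    (Both parameters are illustrative assumptions.)
    """
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    w = np.exp(-(xs**2 + ys**2) / (2 * sigma**2))        # window w(x_i)
    p0 = I0[y - half:y + half + 1, x - half:x + half + 1]         # I0(x_i)
    p1 = I1[y + v - half:y + v + half + 1,
            x + u - half:x + u + half + 1]                        # I1(x_i + u)
    return np.sum(w * (p1 - p0) ** 2)
```

For a patch that actually appears in both images, the error drops to zero at the true displacement and grows away from it, which is exactly the stability behavior the next paragraph examines.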
When performing feature detection, we do not know which other image locations the
feature will end up being matched against. Therefore, we can only compute how stable this
metric is with respect to small variations in position Δu by comparing an image patch against