
4.1 Points and patches                                                                 185
               Figure 4.3 Image pairs with extracted patches below. Notice how some patches can be localized or matched
               with higher accuracy than others.


               4.1.1 Feature detectors


               How can we find image locations where we can reliably find correspondences with other
               images, i.e., what are good features to track (Shi and Tomasi 1994; Triggs 2004)? Look again
               at the image pair shown in Figure 4.3 and at the three sample patches to see how well they
               might be matched or tracked. As you may notice, textureless patches are nearly impossible
               to localize. Patches with large contrast changes (gradients) are easier to localize, although
               straight line segments at a single orientation suffer from the aperture problem (Horn and
               Schunck 1981; Lucas and Kanade 1981; Anandan 1989), i.e., it is only possible to align
               the patches along the direction normal to the edge direction (Figure 4.4b). Patches with
               gradients in at least two (significantly) different orientations are the easiest to localize, as
               shown schematically in Figure 4.4a.
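These three cases (textureless, single-orientation edge, multi-orientation corner) can be distinguished numerically by inspecting the gradient orientations within a patch. The sketch below (a hypothetical helper, not from the text) collects the orientations at pixels with significant gradient magnitude; an empty result indicates a textureless patch, a single orientation an edge subject to the aperture problem, and two well-separated orientations a patch that is easy to localize:

```python
import numpy as np

def gradient_orientations(patch, rel_thresh=0.1):
    """Return gradient orientations (radians) at pixels whose gradient
    magnitude exceeds rel_thresh times the maximum magnitude.

    - No strong gradients  -> textureless patch (hard to localize).
    - One orientation      -> straight edge (aperture problem).
    - Two+ orientations    -> corner-like patch (easy to localize).
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)
    # Small epsilon so a perfectly flat patch yields no orientations.
    return ang[mag > rel_thresh * mag.max() + 1e-12]
```

For a flat patch this returns an empty array; for a vertical step edge, all orientations cluster at a single value; for an L-shaped corner, orientations near both 0 and π/2 appear.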
                  These intuitions can be formalized by looking at the simplest possible matching criterion
               for comparing two image patches, i.e., their (weighted) summed square difference,



                                E_WSSD(u) = Σ_i w(x_i) [I_1(x_i + u) − I_0(x_i)]²,        (4.1)
               where I_0 and I_1 are the two images being compared, u = (u, v) is the displacement vector,
               w(x) is a spatially varying weighting (or window) function, and the summation over i ranges over all
               the pixels in the patch. Note that this is the same formulation we later use to estimate motion
               between complete images (Section 8.1).
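Equation (4.1) translates directly into a few lines of code. The sketch below (an illustrative helper; the function name and patch-extraction convention are assumptions, not from the text) evaluates the weighted SSD between a reference patch in I_0 and the patch displaced by u = (u, v) in I_1:

```python
import numpy as np

def wssd(I0, I1, u, v, window):
    """Weighted summed square difference, Eq. (4.1).

    `window` is the weighting function w(x), e.g. a Gaussian, with the
    same shape as the reference patch, which here is taken at the origin
    of I0. Assumes the displaced patch lies fully inside I1.
    """
    h, w = window.shape
    patch0 = I0[:h, :w]                # reference patch in I0
    patch1 = I1[v:v + h, u:u + w]      # patch displaced by (u, v) in I1
    return np.sum(window * (patch1 - patch0) ** 2)
```

With a box window (all ones) this reduces to the plain SSD; a Gaussian window downweights pixels far from the patch center, making the comparison less sensitive to what happens near the patch boundary.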
                  When performing feature detection, we do not know which other image locations the
               feature will end up being matched against. Therefore, we can only compute how stable this
               metric is with respect to small variations in position Δu by comparing an image patch against