
196                                                          4 Feature detection and matching














                Figure 4.16  Feature matching: how can we extract local descriptors that are invariant to inter-image variations
                and yet still discriminative enough to establish correct correspondences?



                                and motion. Tuytelaars and Van Gool (2004) use affine invariant regions to detect corre-
                                spondences for wide baseline stereo matching, whereas Kadir, Zisserman, and Brady (2004)
                                detect salient regions where patch entropy and its rate of change with scale are locally max-
                                imal. Corso and Hager (2005) use a related technique to fit 2D oriented Gaussian kernels
                                to homogeneous regions. More details on techniques for finding and matching curves, lines,
                                and regions can be found later in this chapter.


                                4.1.2 Feature descriptors

                                After detecting features (keypoints), we must match them, i.e., we must determine which
                                features come from corresponding locations in different images. In some situations, e.g., for
                                video sequences (Shi and Tomasi 1994) or for stereo pairs that have been rectified (Zhang,
                                Deriche, Faugeras et al. 1995; Loop and Zhang 1999; Scharstein and Szeliski 2002), the lo-
                                cal motion around each feature point may be mostly translational. In this case, simple error
                                metrics, such as the sum of squared differences or normalized cross-correlation (described
                                in Section 8.1), can be used to directly compare the intensities in small patches around each
                                feature point. (The comparative study by Mikolajczyk and Schmid (2005), discussed below,
                                uses cross-correlation.) Because feature points may not be exactly located, a more accurate
                                matching score can be computed by performing incremental motion refinement as described
                                in Section 8.1.3, but this can be time-consuming and can sometimes even decrease
                                performance (Brown, Szeliski, and Winder 2005).
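
The two error metrics named above can be sketched as follows. This is a minimal illustration, assuming grayscale patches of equal size stored as NumPy arrays; the function names `ssd` and `ncc` and the `eps` guard are choices made for this example and are not taken from the text.

```python
import numpy as np


def ssd(patch_a, patch_b):
    """Sum of squared differences: 0 for identical patches, larger is worse."""
    a = np.asarray(patch_a, dtype=np.float64)
    b = np.asarray(patch_b, dtype=np.float64)
    diff = a - b
    return float(np.sum(diff * diff))


def ncc(patch_a, patch_b, eps=1e-12):
    """Normalized cross-correlation in [-1, 1]; 1 means a perfect match.

    Subtracting the mean and dividing by the norm makes the score
    insensitive to additive (bias) and multiplicative (gain) changes.
    `eps` is an assumed guard against textureless (constant) patches.
    """
    a = np.asarray(patch_a, dtype=np.float64)
    b = np.asarray(patch_b, dtype=np.float64)
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt(np.sum(a0 * a0) * np.sum(b0 * b0))
    if denom < eps:
        return 0.0  # correlation is undefined for a constant patch
    return float(np.sum(a0 * b0) / denom)
```

SSD is sensitive to brightness and contrast differences between the two images, whereas NCC gives the same score when one patch is a brightened or contrast-stretched copy of the other.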
                                   In most cases, however, the local appearance of features will change in orientation and
                                scale, and sometimes even undergo affine deformations. Extracting a local scale, orientation,
                                or affine frame estimate and then using this to resample the patch before forming the feature
                                descriptor is thus usually preferable (Figure 4.17).
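
As a rough sketch of the resampling step, the function below samples a square patch around a keypoint on a grid that has been scaled and rotated by the keypoint's estimated scale and orientation. It assumes a single-channel image, a similarity (not full affine) frame, bilinear interpolation, and border clamping; the name `extract_patch` and all of its parameters are hypothetical and chosen only for this example.

```python
import numpy as np


def extract_patch(image, x, y, scale=1.0, theta=0.0, size=8):
    """Resample a size-by-size patch centered at (x, y).

    The sampling grid is spaced `scale` pixels apart and rotated by
    `theta` radians, so the returned patch is expressed in the
    keypoint's own scale/orientation frame.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape

    # Patch-frame coordinates, centered on zero.
    r = np.arange(size) - (size - 1) / 2.0
    u, v = np.meshgrid(r, r)  # u varies along columns, v along rows

    # Map patch coordinates into image coordinates (rotate, scale, shift).
    c, s = np.cos(theta), np.sin(theta)
    xs = np.clip(x + scale * (c * u - s * v), 0, w - 1)
    ys = np.clip(y + scale * (s * u + c * v), 0, h - 1)

    # Bilinear interpolation between the four neighboring pixels.
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = xs - x0
    fy = ys - y0
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bottom
```

A descriptor computed from the returned patch is then compared across images in a common frame. In practice the image would typically be smoothed before sampling at coarse scales to avoid aliasing, a step this sketch omits.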
                                   Even after compensating for these changes, the local appearance of image patches will
                                usually still vary from image to image. How can we make image descriptors more invariant to
                                such changes, while still preserving discriminability between different (non-corresponding)
                                patches (Figure 4.16)? Mikolajczyk and Schmid (2005) review some recently developed
                                view-invariant local image descriptors and experimentally compare their performance. Be-
                                low, we describe a few of these descriptors in more detail.

                                Bias and gain normalization (MOPS).  For tasks that do not exhibit large amounts of
                                foreshortening, such as image stitching, simple normalized intensity patches perform reason-
                                ably well and are simple to implement (Brown, Szeliski, and Winder 2005) (Figure 4.17). In