184 4 Feature detection and matching
Figure 4.2 Two pairs of images to be matched. What kinds of feature might one use to establish a set of
correspondences between these images?
detect features in all the images under consideration and then match features based on their
local appearance (Section 4.1.3). The former approach is more suitable when images are
taken from nearby viewpoints or in rapid succession (e.g., video sequences), while the latter
is more suitable when a large amount of motion or appearance change is expected, e.g.,
in stitching together panoramas (Brown and Lowe 2007), establishing correspondences in
wide baseline stereo (Schaffalitzky and Zisserman 2002), or performing object recognition
(Fergus, Perona, and Zisserman 2007).
In this section, we split the keypoint detection and matching pipeline into four separate
stages. During the feature detection (extraction) stage (Section 4.1.1), each image is searched
for locations that are likely to match well in other images. At the feature description stage
(Section 4.1.2), each region around detected keypoint locations is converted into a more compact
and stable (invariant) descriptor that can be matched against other descriptors. The
feature matching stage (Section 4.1.3) efficiently searches for likely matching candidates in
other images. The feature tracking stage (Section 4.1.4) is an alternative to the third stage
that only searches a small neighborhood around each detected feature and is therefore more
suitable for video processing.
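
The four stages can be illustrated with a deliberately tiny sketch on synthetic images. Everything in it is an assumption made for illustration rather than a method prescribed by the text: the detector is a minimum-eigenvalue corner score computed over a 3x3 window, the descriptor is a mean-subtracted 5x5 patch, matching is a brute-force nearest-neighbor search under the sum of squared differences, and tracking repeats the same comparison within a small search radius. The function names (detect, describe, match, track, square_image) and all thresholds are hypothetical.

```python
import math

R = 2  # patch radius: descriptors are (2R+1) x (2R+1) patches (illustrative choice)


def min_eig_score(img, y, x):
    """Smaller eigenvalue of the gradient structure tensor summed over a 3x3 window."""
    a = b = c = 0.0
    for v in range(y - 1, y + 2):
        for u in range(x - 1, x + 2):
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0  # central differences
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0
            a += ix * ix
            b += ix * iy
            c += iy * iy
    return (a + c) / 2.0 - math.sqrt(((a - c) / 2.0) ** 2 + b * b)


def detect(img, thresh=0.5):
    """Stage 1: keep strict local maxima of the corner score above a threshold."""
    h, w = len(img), len(img[0])
    score = [[0.0] * w for _ in range(h)]
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            score[y][x] = min_eig_score(img, y, x)
    keypoints = []
    for y in range(3, h - 3):  # margin leaves room for the descriptor patch
        for x in range(3, w - 3):
            neighbors = [score[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dy, dx) != (0, 0)]
            if score[y][x] > thresh and score[y][x] > max(neighbors):
                keypoints.append((y, x))
    return keypoints


def describe(img, y, x):
    """Stage 2: mean-subtracted patch, invariant to additive brightness changes only."""
    patch = [img[v][u]
             for v in range(y - R, y + R + 1) for u in range(x - R, x + R + 1)]
    mean = sum(patch) / len(patch)
    return [p - mean for p in patch]


def ssd(d1, d2):
    """Sum of squared differences between two descriptors."""
    return sum((p - q) ** 2 for p, q in zip(d1, d2))


def match(kps1, descs1, kps2, descs2):
    """Stage 3: nearest neighbor in descriptor space over all features of image 2."""
    if not kps2:
        return []
    matches = []
    for k1, d1 in zip(kps1, descs1):
        j = min(range(len(kps2)), key=lambda j: ssd(d1, descs2[j]))
        matches.append((k1, kps2[j]))
    return matches


def track(img1, img2, kps1, radius=2):
    """Stage 4: search only a small neighborhood of each feature in the next image."""
    h, w = len(img2), len(img2[0])
    tracks = []
    for (y, x) in kps1:
        d1 = describe(img1, y, x)
        candidates = [(y + dy, x + dx)
                      for dy in range(-radius, radius + 1)
                      for dx in range(-radius, radius + 1)
                      if R <= y + dy < h - R and R <= x + dx < w - R]
        best = min(candidates, key=lambda p: ssd(d1, describe(img2, *p)))
        tracks.append(((y, x), best))
    return tracks


def square_image(top, left, size=6, n=20):
    """Synthetic n x n image: a bright square on a dark background."""
    return [[1.0 if top <= y < top + size and left <= x < left + size else 0.0
             for x in range(n)] for y in range(n)]


img1 = square_image(6, 6)
img2 = square_image(7, 8)  # the same square shifted by (dy, dx) = (1, 2)

kps1, kps2 = detect(img1), detect(img2)
descs1 = [describe(img1, y, x) for y, x in kps1]
descs2 = [describe(img2, y, x) for y, x in kps2]
matches = match(kps1, descs1, kps2, descs2)
tracks = track(img1, img2, kps1)
```

On this toy input, the four corners of the square are detected in each image, and both the global matcher and the local tracker recover the same displacement of one row and two columns. The difference between them is the search cost: match compares each descriptor against every feature in the second image, whereas track examines only the positions within the given radius, which is why it suits small inter-frame motion.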
A wonderful example of all of these stages can be found in David Lowe’s (2004) paper,
which describes the development and refinement of his Scale Invariant Feature Transform
(SIFT). Comprehensive descriptions of alternative techniques can be found in a series of
survey and evaluation papers covering both feature detection (Schmid, Mohr, and Bauckhage
2000; Mikolajczyk, Tuytelaars, Schmid et al. 2005; Tuytelaars and Mikolajczyk 2007)
and feature descriptors (Mikolajczyk and Schmid 2005). Shi and Tomasi (1994) and Triggs
(2004) also provide nice reviews of feature detection techniques.