Feature detection and matching are an essential component of many computer vision appli-
cations. Consider the two pairs of images shown in Figure 4.2. For the first pair, we may
wish to align the two images so that they can be seamlessly stitched into a composite mosaic
(Chapter 9). For the second pair, we may wish to establish a dense set of correspondences so
that a 3D model can be constructed or an in-between view can be generated (Chapter 11). In
either case, what kinds of features should you detect and then match in order to establish such
an alignment or set of correspondences? Think about this for a few moments before reading
on.
The first kind of feature that you may notice is a specific location in the images, such as
a mountain peak, building corner, doorway, or interestingly shaped patch of snow. These
kinds of localized features are often called keypoint features or interest points (or even corners)
and are often described by the appearance of patches of pixels surrounding the point location
(Section 4.1). Another important class of features is edges, e.g., the profile of mountains
against the sky (Section 4.2). These kinds of features can be matched based on their
orientation and local appearance (edge profiles) and can also be good indicators of object
boundaries and occlusion events in image sequences. Edges can be grouped into longer curves and
straight line segments, which can be directly matched or analyzed to find vanishing points
and hence internal and external camera parameters (Section 4.3).
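As a rough illustration of what "orientation and local appearance" can mean for an edge, the sketch below computes a per-pixel gradient magnitude and orientation with NumPy. Using the raw image gradient as the edge measure is an assumption made here for illustration, not necessarily the detector developed in Section 4.2, and the function name gradient_magnitude_orientation is hypothetical.

```python
import numpy as np

def gradient_magnitude_orientation(image):
    """Return per-pixel gradient magnitude and orientation (in radians).

    Assumption: finite-difference gradients stand in for a real edge detector.
    """
    image = np.asarray(image, dtype=float)
    # np.gradient returns the derivative along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Direction of steepest intensity increase, measured from the +x axis.
    orientation = np.arctan2(gy, gx)
    return magnitude, orientation

# Synthetic example: dark left half, bright right half (a vertical step edge).
step = np.zeros((8, 8))
step[:, 4:] = 1.0
mag, ori = gradient_magnitude_orientation(step)
```

On this synthetic step, the magnitude is non-zero only in the two columns straddling the intensity jump, and the gradient there points along the positive x direction, i.e., perpendicular to the edge.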
In this chapter, we describe some practical approaches to detecting such features and
also discuss how feature correspondences can be established across different images. Point
features are now used in such a wide variety of applications that it is good practice to read and
implement some of the algorithms from Section 4.1. Edges and lines provide information
that is complementary to both keypoint and region-based descriptors and are well-suited to
describing object boundaries and man-made objects. These alternative descriptors, while
extremely useful, can be skipped in a short introductory course.
4.1 Points and patches
Point features can be used to find a sparse set of corresponding locations in different images,
often as a precursor to computing camera pose (Chapter 7), which is a prerequisite for
computing a denser set of correspondences using stereo matching (Chapter 11). Such corre-
spondences can also be used to align different images, e.g., when stitching image mosaics or
performing video stabilization (Chapter 9). They are also used extensively to perform object
instance and category recognition (Sections 14.3 and 14.4). A key advantage of keypoints
is that they permit matching even in the presence of clutter (occlusion) and large scale and
orientation changes.
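To make the idea of a sparse set of correspondences concrete, here is a minimal sketch that describes each keypoint by the patch of pixels around it and pairs keypoints across two images by normalized cross-correlation. The helper names (extract_patch, ncc, match_keypoints), the patch half-width, and the toy data are all assumptions for illustration. The keypoint locations are taken as given, points are assumed to lie at least half pixels from the image border, and this simple scheme does not handle the scale and orientation changes mentioned above.

```python
import numpy as np

def extract_patch(image, row, col, half=3):
    """Return the (2*half+1) x (2*half+1) patch centered at (row, col), flattened.

    Assumes the point lies at least `half` pixels from every image border.
    """
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    return patch.astype(float).ravel()

def ncc(a, b, eps=1e-12):
    """Normalized cross-correlation between two flattened patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def match_keypoints(img0, pts0, img1, pts1, half=3):
    """For each point in pts0, return the index of the best-scoring point in pts1."""
    desc1 = [extract_patch(img1, r, c, half) for r, c in pts1]
    matches = []
    for r, c in pts0:
        d0 = extract_patch(img0, r, c, half)
        scores = [ncc(d0, d1) for d1 in desc1]
        matches.append(int(np.argmax(scores)))
    return matches

# Toy data: the second image is the first one translated by (2, 3) pixels.
rng = np.random.default_rng(0)
img0 = rng.random((40, 40))
shift = (2, 3)
img1 = np.roll(img0, shift, axis=(0, 1))

pts0 = [(10, 10), (20, 25), (30, 15)]
# Same physical points in the second image, listed in reverse order.
pts1 = [(r + shift[0], c + shift[1]) for r, c in pts0][::-1]

matches = match_keypoints(img0, pts0, img1, pts1)
```

Because the second image is an exact translation of the first, each keypoint's true counterpart scores a correlation of essentially 1, so the reversed ordering of pts1 is recovered. Real images would need a detector to supply the point locations and a more robust descriptor.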
Feature-based correspondence techniques have been used since the early days of stereo
matching (Hannah 1974; Moravec 1983; Hannah 1988) and have more recently gained pop-
ularity for image-stitching applications (Zoghlami, Faugeras, and Deriche 1997; Brown and
Lowe 2007) as well as fully automated 3D modeling (Beardsley, Torr, and Zisserman 1996;
Schaffalitzky and Zisserman 2002; Brown and Lowe 2003; Snavely, Seitz, and Szeliski 2006).
There are two main approaches to finding feature points and their correspondences. The
first is to find features in one image that can be accurately tracked using a local search tech-
nique, such as correlation or least squares (Section 4.1.4). The second is to independently