4.1 Points and patches 195

[Figure 4.14 diagram labels: $x_0 \to A_0^{-1/2} x_0'$; $x_0' \to R x_1'$; $A_1^{-1/2} x_1' \leftarrow x_1$]

Figure 4.14 Affine normalization using the second moment matrices, as described by Mikolajczyk, Tuytelaars,
Schmid et al. (2005) © 2005 Springer. After image coordinates are transformed using the matrices $A_0^{-1/2}$ and
$A_1^{-1/2}$, they are related by a pure rotation $R$, which can be estimated using a dominant orientation technique.

Figure 4.15 Maximally stable extremal regions (MSERs) extracted and matched from a number of images
(Matas, Chum, Urban et al. 2004) © 2004 Elsevier.

2000; Mikolajczyk and Schmid 2004; Mikolajczyk, Tuytelaars, Schmid et al. 2005; Tuytelaars
and Mikolajczyk 2007). Figure 4.14 shows how the square root of the moment matrix
can be used to transform local patches into a frame which is similar up to rotation.
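
The normalization in Figure 4.14 can be checked numerically. The following is a minimal NumPy sketch, not the cited authors' implementation. It assumes that two patches are related by a known affine map x1 = M x0, that the second moment matrix therefore transforms as A1 = M^{-T} A0 M^{-1}, and that normalized coordinates are obtained as x' = A^{1/2} x (the inverse of the x = A^{-1/2} x' labeling in the figure). The helper names sqrtm_spd, inv_sqrtm_spd, and residual_transform are hypothetical, and the matrices A0 and M are made-up illustrative values.

```python
import numpy as np

def sqrtm_spd(A):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(A)            # A = V diag(w) V^T
    return (V * np.sqrt(w)) @ V.T       # V diag(sqrt(w)) V^T

def inv_sqrtm_spd(A):
    """Inverse symmetric square root, A^{-1/2}."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.T       # V diag(1/sqrt(w)) V^T

def residual_transform(A0, A1, M):
    """Map between the two normalized frames, assuming x1 = M x0."""
    return sqrtm_spd(A1) @ M @ inv_sqrtm_spd(A0)

# Illustrative values (assumptions, not taken from the text).
A0 = np.array([[4.0, 1.0],
               [1.0, 2.0]])             # moment matrix of the first patch
M = np.array([[1.5, 0.4],
              [-0.3, 0.8]])             # affine map between the patches
Minv = np.linalg.inv(M)
A1 = Minv.T @ A0 @ Minv                 # moment matrix of the second patch

Q = residual_transform(A0, A1, M)
print(np.round(Q.T @ Q, 6))             # close to the identity matrix
```

Under these assumptions Q is orthogonal, so the only ambiguity left between the two normalized patches is a rotation, which is what a dominant orientation estimate then resolves.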
Another important affine invariant region detector is the maximally stable extremal region
(MSER) detector developed by Matas, Chum, Urban et al. (2004). To detect MSERs, binary
regions are computed by thresholding the image at all possible gray levels (the technique
therefore only works for grayscale images). This operation can be performed efficiently by
first sorting all pixels by gray value and then incrementally adding pixels to each connected
component as the threshold is changed (Nistér and Stewénius 2008). As the threshold is
changed, the area of each component (region) is monitored; regions whose rate of change of
area with respect to the threshold is minimal are defined as maximally stable. Such regions
are therefore invariant to both affine geometric and photometric (linear bias-gain or smooth
monotonic) transformations (Figure 4.15). If desired, an affine coordinate frame can be fit to
each detected region using its moment matrix.
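
The sort-and-merge procedure described above can be sketched with a union-find structure. This is a simplified illustration, not the algorithm of the cited papers: it tracks only the single dark component that contains a chosen seed pixel, rather than building the full component tree, and it reports the threshold with the globally smallest relative area change instead of all local minima. The function names, the `delta` parameter, and the synthetic test image are assumptions made for the example.

```python
import numpy as np

def component_area_history(img, seed):
    """Area of the component of (img <= t) containing `seed`, for t = 0..255.

    Pixels are sorted by gray value and added one at a time, merging with
    4-connected neighbors that are already present (union-find).
    """
    h, w = img.shape
    flat = img.ravel()
    order = np.argsort(flat, kind="stable")        # pixels by gray value
    parent = np.full(h * w, -1, dtype=np.int64)    # -1 means "not added yet"
    size = np.zeros(h * w, dtype=np.int64)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]          # path halving
            i = parent[i]
        return i

    seed_idx = seed[0] * w + seed[1]
    areas = np.zeros(256, dtype=np.int64)
    k = 0
    for t in range(256):
        while k < order.size and flat[order[k]] <= t:
            p = int(order[k])
            k += 1
            parent[p], size[p] = p, 1
            y, x = divmod(p, w)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and parent[ny * w + nx] != -1:
                    ra, rb = find(p), find(ny * w + nx)
                    if ra != rb:
                        if size[ra] < size[rb]:
                            ra, rb = rb, ra
                        parent[rb] = ra            # union by size
                        size[ra] += size[rb]
        if parent[seed_idx] != -1:
            areas[t] = size[find(seed_idx)]
    return areas

def most_stable_threshold(areas, delta=5):
    """Threshold minimizing (area(t+delta) - area(t-delta)) / area(t)."""
    best_t, best_q = None, np.inf
    for t in range(delta, 256 - delta):
        if areas[t - delta] == 0:                  # seed not present yet
            continue
        q = (areas[t + delta] - areas[t - delta]) / areas[t]
        if q < best_q:
            best_t, best_q = t, q
    return best_t, best_q

# Synthetic image: a dark 6x6 square (value 50) on a bright background (200).
img = np.full((20, 20), 200, dtype=np.uint8)
img[5:11, 5:11] = 50
areas = component_area_history(img, seed=(7, 7))
best_t, best_q = most_stable_threshold(areas)
```

In this toy image the seed's component keeps an area of 36 pixels over a wide range of thresholds, so its relative area change is zero there and the square is reported as stable. Bright regions could be handled the same way by running the sketch on the inverted image, `255 - img`.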
The area of feature point detectors continues to be very active, with papers appearing every
year at major computer vision conferences (Xiao and Shah 2003; Koethe 2003; Carneiro
and Jepson 2005; Kenney, Zuliani, and Manjunath 2005; Bay, Tuytelaars, and Van Gool 2006;
Platel, Balmachnova, Florack et al. 2006; Rosten and Drummond 2006). Mikolajczyk, Tuytelaars,
Schmid et al. (2005) survey a number of popular affine region detectors and provide
experimental comparisons of their invariance to common image transformations such as scaling,
rotations, noise, and blur. These experimental results, code, and pointers to the surveyed
papers can be found on their Web site at http://www.robots.ox.ac.uk/~vgg/research/affine/.
Of course, keypoints are not the only features that can be used for registering images.
Zoghlami, Faugeras, and Deriche (1997) use line segments as well as point-like features to
estimate homographies between pairs of images, whereas Bartoli, Coquerelle, and Sturm
(2004) use line segments with local correspondences along the edges to extract 3D structure
