198 4 Feature detection and matching
(a) image gradients (b) keypoint descriptor
Figure 4.18 A schematic representation of Lowe’s (2004) scale invariant feature transform (SIFT): (a) Gradient
orientations and magnitudes are computed at each pixel and weighted by a Gaussian fall-off function (blue circle).
(b) A weighted gradient orientation histogram is then computed in each subregion, using trilinear interpolation.
While this figure shows an 8 × 8 pixel patch and a 2 × 2 descriptor array, Lowe’s actual implementation uses
16 × 16 patches and a 4 × 4 array of eight-bin histograms.
integrals used in SIFT.
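As a rough illustration of the layout described in the Figure 4.18 caption, the following Python sketch builds a 4 × 4 array of eight-bin orientation histograms from a 16 × 16 patch. It is a simplification and not Lowe's implementation: the function name sift_like_descriptor is hypothetical, each sample is assigned to a single bin instead of being spread by trilinear interpolation, the Gaussian fall-off width (half the patch width) is an assumption, and the final unit-length normalization is included only so that the output does not depend on contrast.

```python
import math

def sift_like_descriptor(patch, cells=4, bins=8):
    """Simplified SIFT-style descriptor for a square patch (list of rows of floats).

    Returns cells * cells * bins values (128 for the defaults).
    """
    n = len(patch)
    cell_size = n // cells
    sigma = n / 2.0            # assumed width of the Gaussian fall-off
    center = (n - 1) / 2.0
    hist = [0.0] * (cells * cells * bins)
    for y in range(n):
        for x in range(n):
            # Finite-difference gradient, with neighbor indices clamped at the border
            gx = patch[y][min(x + 1, n - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, n - 1)][x] - patch[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            theta = math.atan2(gy, gx) % (2 * math.pi)
            # Gaussian weight centered on the patch
            w = math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2 * sigma ** 2))
            # Hard assignment to one orientation bin and one spatial cell
            o = int(theta / (2 * math.pi) * bins) % bins
            cell = (y // cell_size) * cells + (x // cell_size)
            hist[cell * bins + o] += w * mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

For a patch whose intensity increases from left to right, every gradient points along the positive x axis, so all of the energy lands in the first orientation bin of each of the 16 cells.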
Gradient location-orientation histogram (GLOH). This descriptor, developed by Mikolajczyk
and Schmid (2005), is a variant on SIFT that uses a log-polar binning structure instead of the
square grid of subregions used by Lowe (2004) (Figure 4.19). The spatial bins have radii 6, 11,
and 15, and the two outer rings are each split into eight angular bins (the central region is
not), for a total of 1 + 8 + 8 = 17 spatial bins. Gradient orientations are quantized into 16
bins, giving a 17 × 16 = 272-dimensional histogram, which is then projected onto a
128-dimensional descriptor using PCA trained on a large database. In their evaluation,
Mikolajczyk and Schmid (2005) found that GLOH, which has the best performance overall,
outperforms SIFT by a small margin.
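The log-polar binning can be sketched in a few lines of Python. This is only an illustration of the bin layout given above (radii 6, 11, and 15, eight angular sectors in the outer rings, 16 orientation bins); the function names, the hard (non-interpolated) bin assignment, and the choice to discard samples at radius 15 or beyond are assumptions. The PCA projection to 128 dimensions is omitted, since it requires a basis trained on a separate image database.

```python
import math

RADII = (6.0, 11.0, 15.0)  # ring boundaries, as given in the text
N_ANGULAR = 8              # angular sectors in each of the two outer rings
N_ORIENT = 16              # gradient orientation bins

def gloh_spatial_bin(dx, dy):
    """Spatial bin index (0..16) for an offset from the keypoint, or None if outside."""
    r = math.hypot(dx, dy)
    if r >= RADII[2]:
        return None            # assumed: samples beyond the outer radius are ignored
    if r < RADII[0]:
        return 0               # the central region is not split into sectors
    ring = 0 if r < RADII[1] else 1
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = int(angle / (2 * math.pi) * N_ANGULAR) % N_ANGULAR
    return 1 + ring * N_ANGULAR + sector

def gloh_histogram(samples):
    """Accumulate (dx, dy, magnitude, orientation) samples into a 272-bin histogram."""
    n_spatial = 1 + 2 * N_ANGULAR          # 17 spatial bins
    hist = [0.0] * (n_spatial * N_ORIENT)  # 17 * 16 = 272
    for dx, dy, mag, ori in samples:
        s = gloh_spatial_bin(dx, dy)
        if s is None:
            continue
        o = int((ori % (2 * math.pi)) / (2 * math.pi) * N_ORIENT) % N_ORIENT
        hist[s * N_ORIENT + o] += mag
    return hist
```

Enumerating all integer offsets within the outer radius and collecting the values returned by gloh_spatial_bin is a quick way to check that exactly 17 distinct spatial bins occur.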
Steerable filters. Steerable filters (Section 3.2.3) are combinations of derivative of Gaussian
filters that permit the rapid computation of even and odd (symmetric and anti-symmetric)
edge-like and corner-like features at all possible orientations (Freeman and Adelson 1991).
Because they use reasonably broad Gaussians, they too are somewhat insensitive to localization
and orientation errors.
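The simplest instance of steering is the first derivative of a Gaussian, where the filter at angle θ is the combination cos θ · Gx + sin θ · Gy of the two axis-aligned derivative filters. The Python sketch below checks this identity numerically. It covers only this first-order (odd) case, not the higher-order even filters; the function names, the values of sigma and radius, and the use of an unnormalized Gaussian are assumptions made for illustration.

```python
import math

def gaussian_derivative_kernels(sigma=1.5, radius=4):
    """Return (Gx, Gy): sampled x- and y-derivatives of an unnormalized 2D Gaussian."""
    gx, gy = [], []
    for y in range(-radius, radius + 1):
        row_x, row_y = [], []
        for x in range(-radius, radius + 1):
            g = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            row_x.append(-x / sigma ** 2 * g)
            row_y.append(-y / sigma ** 2 * g)
        gx.append(row_x)
        gy.append(row_y)
    return gx, gy

def steer(gx, gy, theta):
    """Synthesize the derivative kernel at angle theta from the two basis kernels."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]

def directional_kernel(theta, sigma=1.5, radius=4):
    """Directly sample the Gaussian derivative along direction theta, for comparison."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-(x * c + y * s) / sigma ** 2
             * math.exp(-(x * x + y * y) / (2 * sigma ** 2))
             for x in range(-radius, radius + 1)]
            for y in range(-radius, radius + 1)]
```

Because convolution is linear, the same combination can be applied to the two filtered images instead of to the kernels, so responses at any orientation follow from just two convolutions.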
Performance of local descriptors. Among the local descriptors that Mikolajczyk and
Schmid (2005) compared, they found that GLOH performed best, followed closely by SIFT
(see Figure 4.25). They also present results for many other descriptors not covered in this
book.
The field of feature descriptors continues to evolve rapidly, with some of the newer techniques
looking at local color information (van de Weijer and Schmid 2006; Abdel-Hakim
and Farag 2006). Winder and Brown (2007) develop a multi-stage framework for feature
descriptor computation that subsumes both SIFT and GLOH (Figure 4.20a) and also allows