3.6 Geometric transformations 147
windowed sinc functions. While bilinear is often used for speed (e.g., inside the inner loop
of a patch-tracking algorithm, see Section 8.1.3), bicubic and windowed sinc are preferable
where visual quality is important.
To compute the value of f(x) at a non-integer location x, we simply apply our usual FIR
resampling filter,
$$g(x, y) = \sum_{k,l} f(k, l)\, h(x - k, y - l), \tag{3.89}$$
where (x, y) are the sub-pixel coordinate values and h(x, y) is some interpolating or smooth-
ing kernel. Recall from Section 3.5.2 that when decimation is being performed, the smoothing
kernel is stretched and re-scaled according to the downsampling rate r.
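
As a minimal sketch of the resampling sum in (3.89), the following evaluates g(x, y) at a sub-pixel location. It assumes a separable kernel, h(x, y) = h(x)h(y), with a tent (linear) profile, which yields bilinear interpolation. The names `tent` and `resample`, the row-major indexing `f[l][k]`, and the zero treatment of out-of-range samples are illustrative choices, not part of the original text.

```python
import math

def tent(t):
    """Linear (tent) kernel; its separable product gives bilinear interpolation."""
    return max(0.0, 1.0 - abs(t))

def resample(f, x, y, h=tent, radius=1):
    """Evaluate g(x, y) = sum_{k,l} f(k, l) h(x - k) h(y - l).

    f is a list of rows indexed as f[l][k] (row l, column k).
    radius is the half-width of the kernel support in pixels.
    Samples outside the image contribute zero (an assumption of this sketch).
    """
    k0, l0 = math.floor(x), math.floor(y)
    total = 0.0
    for l in range(l0 - radius + 1, l0 + radius + 1):
        for k in range(k0 - radius + 1, k0 + radius + 1):
            if 0 <= l < len(f) and 0 <= k < len(f[0]):
                total += f[l][k] * h(x - k) * h(y - l)
    return total

image = [[0.0, 10.0],
         [20.0, 30.0]]
center = resample(image, 0.5, 0.5)  # average of the four neighbors
```

A wider kernel (for example a bicubic profile with `radius=2`) could be passed in place of `tent` without changing the loop structure.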
Unfortunately, for a general (non-zoom) image transformation, the resampling rate r is
not well defined. Consider a transformation that stretches the x dimensions while squashing
the y dimensions. The resampling kernel should be performing regular interpolation along
the x dimension and smoothing (to anti-alias the blurred image) in the y direction. This gets
even more complicated for the case of general affine or perspective transforms.
What can we do? Fortunately, Fourier analysis can help. The two-dimensional general-
ization of the one-dimensional domain scaling law given in Table 3.1 is
$$g(\mathbf{A}\mathbf{x}) \Leftrightarrow |\mathbf{A}|^{-1}\, G(\mathbf{A}^{-T}\mathbf{f}). \tag{3.90}$$
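
As a worked instance of this law, consider a diagonal transform that scales the two axes by different amounts. The particular matrix and the component names $f_x, f_y$ for the frequency vector are chosen here purely for illustration.

```latex
\mathbf{A} = \begin{bmatrix} 2 & 0 \\ 0 & \tfrac{1}{2} \end{bmatrix},
\qquad |\mathbf{A}| = 1,
\qquad \mathbf{A}^{-T} = \begin{bmatrix} \tfrac{1}{2} & 0 \\ 0 & 2 \end{bmatrix},
\qquad\text{so}\qquad
g\!\left(2x, \tfrac{y}{2}\right) \;\Leftrightarrow\; G\!\left(\tfrac{f_x}{2},\, 2 f_y\right).
```

The signal is compressed by a factor of two along x, so its spectrum spreads to twice the original bandwidth along $f_x$ and may alias unless it is pre-filtered. Along y the signal is stretched, its spectrum shrinks, and plain interpolation suffices.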
For all of the transforms in Table 3.5 except perspective, the matrix A is already defined.
For perspective transformations, the matrix A is the linearized derivative of the perspective
transformation (Figure 3.48a), i.e., the local affine approximation to the stretching induced
by the projection (Heckbert 1986; Wolberg 1990; Gomes, Darsa, Costa et al. 1999; Akenine-Möller
and Haines 2002).
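
The local affine approximation can be sketched as the 2x2 Jacobian of the perspective map evaluated at a given pixel. The function name `perspective_jacobian` and the 3x3 nested-list layout of the homography `H` are assumptions of this sketch. Whether `H` maps source to destination or the reverse depends on the warping convention in use, which the passage above does not fix.

```python
def perspective_jacobian(H, x, y):
    """Jacobian of (x, y) -> (x', y') for a 3x3 homography H at the point (x, y).

    x' = (h00 x + h01 y + h02) / w,  y' = (h10 x + h11 y + h12) / w,
    with w = h20 x + h21 y + h22.  The partial derivatives follow from the
    quotient rule.
    """
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    xp = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    yp = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return [
        [(H[0][0] - xp * H[2][0]) / w, (H[0][1] - xp * H[2][1]) / w],
        [(H[1][0] - yp * H[2][0]) / w, (H[1][1] - yp * H[2][1]) / w],
    ]

# For an affine map the Jacobian is the same everywhere: the upper-left 2x2 block.
H_affine = [[2.0, 0.0, 5.0], [0.0, 0.5, 1.0], [0.0, 0.0, 1.0]]
A_affine = perspective_jacobian(H_affine, 3.0, 7.0)

# For a true perspective map the Jacobian varies with position.
H_persp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
A_local = perspective_jacobian(H_persp, 2.0, 0.0)
```

In the second case the local stretch differs between the two axes, which is exactly the situation where a single scalar resampling rate is ill-defined.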
To prevent aliasing, we need to pre-filter the image f(x) with a filter whose frequency
response is the projection of the final desired spectrum through the $\mathbf{A}^{-T}$ transform (Szeliski,
Winder, and Uyttendaele 2010). In general (for non-zoom transforms), this filter is
non-separable and hence is very slow to compute. Therefore, a number of approximations to this
filter are used in practice, including MIP-mapping, elliptically weighted Gaussian averaging,
and anisotropic filtering (Akenine-Möller and Haines 2002).
MIP-mapping
MIP-mapping was first proposed by Williams (1983) as a means to rapidly pre-filter images
being used for texture mapping in computer graphics. A MIP-map¹⁸ is a standard image
pyramid (Figure 3.32), where each level is pre-filtered with a high-quality filter rather than
a poorer quality approximation, such as Burt and Adelson’s (1983b) five-tap binomial. To
resample an image from a MIP-map, a scalar estimate of the resampling rate r is first com-
puted. For example, r can be the maximum of the absolute values in A (which suppresses
aliasing) or it can be the minimum (which reduces blurring). Akenine-Möller and Haines
(2002) discuss these issues in more detail.
Once a resampling rate has been specified, a fractional pyramid level is computed using
the base 2 logarithm,
$$l = \log_2 r. \tag{3.91}$$
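
The level computation in (3.91) can be sketched as follows. The helper names `mip_level` and `split_level`, the clamping to the available pyramid levels, and the treatment of r below 1 as "no decimation" are assumptions added for illustration. The passage above specifies only the logarithm itself.

```python
import math

def mip_level(r, num_levels):
    """Fractional pyramid level l = log2(r), clamped to [0, num_levels - 1].

    Rates r <= 1 (no decimation) map to the finest level, 0.
    """
    l = math.log2(max(r, 1.0))
    return min(l, float(num_levels - 1))

def split_level(l):
    """Split a fractional level into its integer part and fractional remainder."""
    lower = math.floor(l)
    return lower, l - lower

level = mip_level(3.0, num_levels=8)   # between levels 1 and 2
lower, frac = split_level(level)
```

A rate of r = 4 lands exactly on level 2, while r = 3 falls between levels 1 and 2, with `frac` indicating how far it lies toward the coarser level.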
¹⁸ The term ‘MIP’ stands for multum in parvo, meaning ‘many things in a small place’.