Section 12.1 Registering Rigid Objects 374
FIGURE 12.4: The two mountain images of Figure 12.3, now rectified with a homography.
Notice how well all features line up; this transformation involves more than just rotation
and translation, as you can see from the fact that the corner of the second image (which can
be seen in the middle, near the top) is no longer a right angle. Notice also that intensity
effects in the camera far field mean that the boundary where the two images overlap is
unpleasantly obvious. This figure was originally published as Figure 1 of M. Brown and D.
Lowe, “Recognizing Panoramas,” Proc. ICCV 2003, © IEEE, 2003.
Registering images into mosaics gets more interesting when there are more
than two images. Imagine we have three images, $I_1$, $I_2$, and $I_3$. We could register
image one to image two, then image two to image three. But, if image three has
some features that match to features in image one, this might not be wise. Write
$T_{2\to1}$ for the estimated transformation that takes image two into image one’s frame
(and so on). The problem is that $T_{2\to1} \circ T_{3\to2}$ might not be a good estimate of
$T_{3\to1}$, the transformation from image three’s frame to image one’s frame. The error
might not be all that large in the case of just three images, but it can accumulate.
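
As a rough illustration of how error in a chain of pairwise registrations can accumulate, the sketch below composes estimated transformations as 3x3 homogeneous matrices and compares the result with the composition of the true ones. The setup is an assumption made for illustration only: every true pairwise map is a 100-pixel shift in x, and every estimate is off by the same small bias. The helper names (translation, apply, chain_error) are hypothetical and are not part of any library. Real registration errors are not a constant bias, so the linear growth shown here should be read as a simple case rather than a general law.

```python
import numpy as np

def translation(tx, ty):
    """3x3 homogeneous matrix for a 2D translation."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def apply(T, p):
    """Apply a 3x3 homogeneous transform to a 2D point."""
    q = T @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

def chain_error(n_images, bias):
    """Distance between where the chained estimate and the chained truth
    place the origin of the last image in the first image's frame.

    Assumed setup: each true pairwise map shifts by 100 pixels in x, and
    each estimated map is wrong by `bias` pixels in x.
    """
    true_T = np.eye(3)
    est_T = np.eye(3)
    for _ in range(n_images - 1):
        # Composing maps corresponds to multiplying their matrices.
        true_T = true_T @ translation(100.0, 0.0)
        est_T = est_T @ translation(100.0 + bias, 0.0)
    p = (0.0, 0.0)
    return np.linalg.norm(apply(est_T, p) - apply(true_T, p))

err3 = chain_error(3, 0.5)    # two pairwise maps chained
err20 = chain_error(20, 0.5)  # nineteen pairwise maps chained
print(err3, err20)
```

With three images the chained estimate is off by one pixel under these assumptions; with twenty images the same per-pair bias yields an error of several pixels, which is the kind of drift that motivates estimating all registrations jointly.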
To deal with this accumulation, we need some method to estimate all registrations
in one go, using all error terms. Doing so is often called bundle adjustment, by
analogy with the relevant term in structure from motion (Section 8.3.3). A natural
method is to choose a coordinate frame within which to work (for example, the
frame of the first image), then search for a set of maps that take each other image
into that frame and minimize the sum of squared errors between all matching pairs
of points. For our example, write $(x_j^{(i)}, x_j^{(k)})$ for the $j$th tuple consisting of a point
$x_j^{(i)}$ in image $i$ that matches a point $x_j^{(k)}$ in image $k$. We would estimate $T_{2\to1}$ and

