Section 12.1 Registering Rigid Objects 372
FIGURE 12.2: On the left, frames from a video of an aircraft in the air. These frames are
rectified to one another to form a mosaic on the right, which reveals (a) the flight path
of the aircraft in the video and (b) the flight path of the observer. Notice that the mosaic
reveals the speed at which the aircraft is moving (see how far apart the instances of the
aircraft are in the mosaic; when they are far apart, it is moving quickly). This figure was
originally published as Figure 4 of “Video Indexing Based on Mosaic Representations,” by
M. Irani and P. Anandan, Proc. IEEE, v86 n5, 1998, © IEEE, 1998.
the squared matching error. Brown and Lowe (2003) show one strategy for finding
tokens; they find the interest points of Section 5.3, then compute SIFT features for
the neighborhoods (as in Section 5.4.1), and then use approximate nearest neighbors
methods to find matching pairs (as in Section 21.2.3). A small set of matches is
sufficient to fit a transformation.
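
As a rough illustration of the matching step, the sketch below pairs each token in one image with the token in the other image whose feature vector is closest. It is an assumption-laden simplification: it uses exact brute-force nearest neighbors rather than the approximate methods the text refers to, the descriptors are plain lists of numbers rather than real SIFT features, and the function name `match_tokens` is hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def match_tokens(desc1, desc2):
    """Pair each descriptor in desc1 with its nearest descriptor in desc2.

    Returns a list of (index_in_desc1, index_in_desc2) pairs.
    Brute force: cost is len(desc1) * len(desc2) distance evaluations.
    """
    matches = []
    for i, d1 in enumerate(desc1):
        best_j = min(range(len(desc2)),
                     key=lambda j: squared_distance(d1, desc2[j]))
        matches.append((i, best_j))
    return matches


# Toy two-dimensional descriptors (illustrative values only).
desc1 = [[0.0, 0.0], [5.0, 5.0]]
desc2 = [[5.0, 4.0], [0.0, 1.0]]
matches = match_tokens(desc1, desc2)
```

In practice some of these pairs will be wrong, which is one reason the transformation is fit to a set of matches rather than to a single pair.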
There are two types of transformation that are useful in this context. In the
simplest case, the camera is an orthographic camera, and it translates. In turn,
this means that image tokens translate, so we need only estimate a translation
that places matching tokens on top of one another. In a more complex case, the
camera is a perspective camera that rotates about its focal point. If we know
nothing about the camera, the map between the relevant portions of $I_1$ and $I_2$ is a
plane projective transformation, sometimes known as a homography. Knowing more
about the camera and the circumstances might result in a more tightly constrained
transformation.
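
For the translation case, a minimal sketch of the estimate follows. It assumes the matches are already known and correct, and it uses the fact that the translation minimizing the summed squared distance between matched tokens is the mean displacement. The function name `estimate_translation` and the sample coordinates are hypothetical.

```python
def estimate_translation(points1, points2):
    """Least-squares translation taking points1 onto points2.

    points1[i] and points2[i] are matched (x, y) token positions.
    The minimizer of sum ||p1 + t - p2||^2 over t is the mean of (p2 - p1).
    """
    n = len(points1)
    if n == 0 or n != len(points2):
        raise ValueError("need equal, nonzero numbers of matched points")
    tx = sum(q[0] - p[0] for p, q in zip(points1, points2)) / n
    ty = sum(q[1] - p[1] for p, q in zip(points1, points2)) / n
    return tx, ty


# Three matched tokens, all displaced by the same amount.
points1 = [(0, 0), (1, 2), (3, 1)]
points2 = [(2, -1), (3, 1), (5, 0)]
t = estimate_translation(points1, points2)
```

A plain mean is sensitive to incorrect matches, so this sketch is only appropriate once bad pairs have been discarded by some other means.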
In homogeneous coordinates, the transformation that takes the point $\mathbf{x}_1 =
(x_1, y_1, 1)$ in $I_1$ to its corresponding point in $I_2$, $\mathbf{x}_2 = (x_2, y_2, 1)$, has the form
of a generic $3 \times 3$ matrix with nonzero determinant. Write $H$ for this matrix.
We can estimate its elements using four corresponding points on the plane. Write
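$\mathbf{x}_1^{(i)} = (x_1^{(i)}, y_1^{(i)}, 1)$ for the $i$th point in $I_1$, which corresponds to $\mathbf{x}_2^{(i)} = (x_2^{(i)}, y_2^{(i)}, 1)$.
Now we have
\[
\begin{pmatrix} x_2^{(i)} \\[4pt] y_2^{(i)} \end{pmatrix}
=
\begin{pmatrix}
\dfrac{h_{11} x_1^{(i)} + h_{12} y_1^{(i)} + h_{13}}{h_{31} x_1^{(i)} + h_{32} y_1^{(i)} + h_{33}} \\[12pt]
\dfrac{h_{21} x_1^{(i)} + h_{22} y_1^{(i)} + h_{23}}{h_{31} x_1^{(i)} + h_{32} y_1^{(i)} + h_{33}}
\end{pmatrix},
\]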
so that if we cross-multiply and subtract, we get two homogeneous linear equations

