
Section 12.1  Registering Rigid Objects  372



















                            FIGURE 12.2: On the left, frames from a video of an aircraft in the air. These frames are
                            rectified to one another to form a mosaic on the right, which reveals (a) the flight path
                            of the aircraft in the video and (b) the flight path of the observer. Notice that the mosaic
                            reveals the speed with which the aircraft is moving (see how far apart each instance of the
                            aircraft is in the mosaic; when they are far apart, it is moving quickly). This figure was
                            originally published as Figure 4 of “Video Indexing Based on Mosaic Representations,” by
                            M. Irani and P. Anandan, Proc. IEEE, v86 n5, 1998, © IEEE, 1998.


                            the squared matching error. Brown and Lowe (2003) show one strategy for finding
                            tokens; they find the interest points of Section 5.3, then compute SIFT features for
                            the neighborhoods (as in Section 5.4.1), and then use approximate nearest neighbors
                            methods to find matching pairs (as in Section 21.2.3). A small set of matches is
                            sufficient to fit a transformation.
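
A minimal sketch of the matching step, assuming each token is already described by a fixed-length feature vector. It swaps in an exact brute-force nearest-neighbor search for the approximate methods mentioned above, and adds a distance-ratio check to discard ambiguous matches. The function name `match_descriptors`, the `ratio` parameter, and the toy descriptors are illustrative assumptions, not part of the original method.

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Pair each descriptor in desc_a with its nearest neighbor in desc_b.

    Hypothetical helper: exact brute-force search stands in for an
    approximate nearest-neighbor method; `ratio` is an assumed threshold.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Euclidean distance from this descriptor to every candidate.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if not dists:
            continue
        best_d, best_j = dists[0]
        # Keep the match only if it clearly beats the second-best candidate.
        if len(dists) == 1 or best_d < ratio * dists[1][0]:
            matches.append((i, best_j))
    return matches

# Toy 3-dimensional descriptors (real SIFT descriptors have 128 dimensions).
desc_a = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
desc_b = [(1.0, 0.1, 0.0), (0.0, 0.9, 0.1), (5.0, 5.0, 5.0)]
matches = match_descriptors(desc_a, desc_b)
print(matches)
```

Brute-force search costs time proportional to the product of the two descriptor counts, which is why approximate methods are preferred for large images.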
                                 There are two types of transformation that are useful in this context. In the
                            simplest case, the camera is an orthographic camera, and it translates. In turn,
                            this means that image tokens translate, so we need only estimate a translation
                            that places matching tokens on top of one another. In a more complex case, the
                            camera is a perspective camera that rotates about its focal point. If we know
                            nothing about the camera, the map between the relevant portions of $I_1$ and $I_2$ is a
                            plane projective transformation, sometimes known as a homography. Knowing more
                            about the camera and the circumstances might result in a more tightly constrained
                            transformation.
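
For the translation-only case, the translation that minimizes the summed squared distance between matched tokens is the mean of the per-match displacements. A minimal sketch, assuming matches arrive as two equal-length lists of (x, y) pairs; the name `estimate_translation` and the sample coordinates are hypothetical.

```python
def estimate_translation(points_1, points_2):
    """Least-squares translation taking points_1 onto points_2.

    Hypothetical helper: minimizing sum ||p1 + t - p2||^2 over t
    gives t = mean(p2 - p1).
    """
    n = len(points_1)
    if n == 0 or n != len(points_2):
        raise ValueError("need the same nonzero number of points in each list")
    tx = sum(p2[0] - p1[0] for p1, p2 in zip(points_1, points_2)) / n
    ty = sum(p2[1] - p1[1] for p1, p2 in zip(points_1, points_2)) / n
    return tx, ty

# Illustrative matched token positions in the first and second image.
pts_1 = [(10.0, 20.0), (30.0, 5.0), (50.0, 40.0)]
pts_2 = [(13.0, 18.0), (33.0, 3.0), (53.0, 38.0)]
t = estimate_translation(pts_1, pts_2)
print(t)
```

The mean is sensitive to bad matches, so in practice this estimate would usually sit inside a robust fitting procedure.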
                                 In homogeneous coordinates, the transformation that takes the point
                            $\mathbf{x}_1 = (x_1, y_1, 1)$ in $I_1$ to its corresponding point in $I_2$,
                            $\mathbf{x}_2 = (x_2, y_2, 1)$, has the form
                            of a generic 3 × 3 matrix with nonzero determinant. Write H for this matrix.
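
To make the mapping concrete, the sketch below applies a 3 × 3 matrix to an image point in homogeneous coordinates and divides by the third coordinate to recover image coordinates. The function name `apply_homography` and the example matrices are assumptions chosen for illustration. The example also shows that scaling the matrix by a nonzero constant leaves the mapped point unchanged.

```python
def apply_homography(H, point):
    """Map an image point (x, y) through a 3x3 matrix H given as nested lists.

    Hypothetical helper: forms H * (x, y, 1)^T, then divides by the third
    coordinate to return to ordinary image coordinates.
    """
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this transformation")
    return u / w, v / w

# A pure translation by (2, 3), written as a 3x3 matrix.
H_shift = [[1.0, 0.0, 2.0],
           [0.0, 1.0, 3.0],
           [0.0, 0.0, 1.0]]
# The same matrix multiplied by 2 represents the same transformation.
H_shift_scaled = [[2.0 * h for h in row] for row in H_shift]
# A matrix with a nonzero entry in the third row, so the divisor varies with x.
H_proj = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.5, 0.0, 1.0]]

p_shift = apply_homography(H_shift, (1.0, 1.0))
p_scaled = apply_homography(H_shift_scaled, (1.0, 1.0))
p_proj = apply_homography(H_proj, (2.0, 4.0))
print(p_shift, p_scaled, p_proj)
```

Because the matrix is defined only up to scale, it has eight degrees of freedom rather than nine.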
                            We can estimate its elements using four corresponding points on the plane. Write
                            $\mathbf{x}_1^{(i)} = (x_1^{(i)}, y_1^{(i)}, 1)$ for the ith point in $I_1$, which corresponds to
                            $\mathbf{x}_2^{(i)} = (x_2^{(i)}, y_2^{(i)}, 1)$. Now we have

\[
\begin{pmatrix} x_2^{(i)} \\[4pt] y_2^{(i)} \end{pmatrix}
=
\begin{pmatrix}
\dfrac{h_{11} x_1^{(i)} + h_{12} y_1^{(i)} + h_{13}}{h_{31} x_1^{(i)} + h_{32} y_1^{(i)} + h_{33}} \\[10pt]
\dfrac{h_{21} x_1^{(i)} + h_{22} y_1^{(i)} + h_{23}}{h_{31} x_1^{(i)} + h_{32} y_1^{(i)} + h_{33}}
\end{pmatrix},
\]

                            so that if we cross-multiply and subtract, we get two homogeneous linear equations