However, this is not optimal from a statistical point of view, since the denominator D, which was used to multiply each equation, can vary quite a bit from point to point.⁶

One way to compensate for this is to reweight each equation by the inverse of the current estimate of the denominator, D,

$$
\frac{1}{D}
\begin{bmatrix} \hat{x} - x \\ \hat{y} - y \end{bmatrix}
=
\frac{1}{D}
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -\hat{x}x & -\hat{x}y \\
0 & 0 & 0 & x & y & 1 & -\hat{y}x & -\hat{y}y
\end{bmatrix}
\begin{bmatrix} h_{00} \\ \vdots \\ h_{21} \end{bmatrix} .
\tag{6.22}
$$
While this may at first seem to be exactly the same set of equations as (6.21), the weightings do matter when least squares is used to solve the over-determined system, and they produce a different set of normal equations that performs better in practice.
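To make the reweighting concrete, the following is a minimal NumPy sketch of the iteration implied by (6.22); the function name and array layout are illustrative, and the parameter ordering h = (h00, ..., h21) with h22 fixed follows the parameterization described in the footnote below. On the first pass, h = 0 and hence D = 1, so the solve reduces to the unweighted equations (6.21).

```python
import numpy as np

def estimate_homography_irls(xy, xy_hat, num_iters=10):
    """Iteratively reweighted least squares for the homography
    parameters h = (h00, h01, h02, h10, h11, h12, h20, h21),
    with h22 fixed so that H[2, 2] = 1.  (Illustrative sketch.)"""
    x, y = xy[:, 0], xy[:, 1]            # measured locations (x, y)
    xh, yh = xy_hat[:, 0], xy_hat[:, 1]  # sensed locations (x^, y^)
    N = len(x)
    zeros, ones = np.zeros(N), np.ones(N)
    h = np.zeros(8)                      # start at the identity
    for _ in range(num_iters):
        # weight = inverse of current denominator D = h20*x + h21*y + 1
        w = 1.0 / (h[6] * x + h[7] * y + 1.0)
        Ax = np.column_stack([x, y, ones, zeros, zeros, zeros, -xh * x, -xh * y])
        Ay = np.column_stack([zeros, zeros, zeros, x, y, ones, -yh * x, -yh * y])
        A = np.vstack([w[:, None] * Ax, w[:, None] * Ay])  # rows of (6.22)
        b = np.concatenate([w * (xh - x), w * (yh - y)])
        h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return h
```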
The most principled way to do the estimation, however, is to directly minimize the squared residual equations (6.13) using the Gauss–Newton approximation, i.e., performing a first-order Taylor series expansion in p, as shown in (6.14), which yields the set of equations
$$
\begin{bmatrix} \hat{x} - \tilde{x} \\ \hat{y} - \tilde{y} \end{bmatrix}
=
\frac{1}{D}
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -\tilde{x}x & -\tilde{x}y \\
0 & 0 & 0 & x & y & 1 & -\tilde{y}x & -\tilde{y}y
\end{bmatrix}
\begin{bmatrix} \Delta h_{00} \\ \vdots \\ \Delta h_{21} \end{bmatrix} .
\tag{6.23}
$$
While these look similar to (6.22), they differ in two important respects. First, the left-hand side consists of unweighted prediction errors rather than point displacements, and the solution vector is a perturbation to the parameter vector p. Second, the quantities inside J involve predicted feature locations $(\tilde{x}, \tilde{y})$ instead of sensed feature locations $(\hat{x}, \hat{y})$. Both of these differences are subtle, and yet they lead to an algorithm that, when combined with proper checking for downhill steps (as in the Levenberg–Marquardt algorithm), will converge to a local minimum. Note that iterating Equations (6.22) is not guaranteed to converge, since it is not minimizing a well-defined energy function.
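In the same spirit, a minimal sketch of the Gauss–Newton iteration of (6.23) might look as follows; the function name is hypothetical, and a production version would add the Levenberg–Marquardt damping and downhill checks mentioned above rather than blindly accepting every step.

```python
import numpy as np

def refine_homography_gn(xy, xy_hat, h, num_iters=10):
    """Gauss-Newton refinement of h = (h00, ..., h21) via (6.23).
    Illustrative sketch; a robust version would add Levenberg-
    Marquardt damping and only accept steps that reduce the residual."""
    x, y = xy[:, 0], xy[:, 1]
    xh, yh = xy_hat[:, 0], xy_hat[:, 1]
    zeros, ones = np.zeros(len(x)), np.ones(len(x))
    h = h.copy()
    for _ in range(num_iters):
        D = h[6] * x + h[7] * y + 1.0
        xt = ((1 + h[0]) * x + h[1] * y + h[2]) / D   # predicted x~
        yt = (h[3] * x + (1 + h[4]) * y + h[5]) / D   # predicted y~
        Jx = np.column_stack([x, y, ones, zeros, zeros, zeros, -xt * x, -xt * y])
        Jy = np.column_stack([zeros, zeros, zeros, x, y, ones, -yt * x, -yt * y])
        J = np.vstack([Jx / D[:, None], Jy / D[:, None]])  # rows of (6.23)
        r = np.concatenate([xh - xt, yh - yt])  # unweighted prediction errors
        dh, *_ = np.linalg.lstsq(J, r, rcond=None)
        h += dh                                  # perturb the parameter vector
    return h
```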
Equation (6.23) is analogous to the additive algorithm for direct intensity-based registration (Section 8.2), since the change to the full transformation is being computed. If we prepend an incremental homography to the current homography instead, i.e., we use a compositional algorithm (described in Section 8.2), we get D = 1 (since p = 0) and the above formula simplifies to
$$
\begin{bmatrix} \hat{x} - x \\ \hat{y} - y \end{bmatrix}
=
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x^2 & -xy \\
0 & 0 & 0 & x & y & 1 & -xy & -y^2
\end{bmatrix}
\begin{bmatrix} \Delta h_{00} \\ \vdots \\ \Delta h_{21} \end{bmatrix} ,
\tag{6.24}
$$
where we have replaced $(\tilde{x}, \tilde{y})$ with $(x, y)$ for conciseness. (Notice how this results in the same Jacobian as (8.63).)
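A single compositional step can then be sketched as below (again with hypothetical names): the points are first warped by the current homography H, the D = 1 system (6.24) is solved for the increment, and the incremental homography is prepended to H, i.e., composed so that it acts on the already-warped points.

```python
import numpy as np

def compositional_step(H, xy, xy_hat):
    """One compositional update using (6.24): warp the points by the
    current 3x3 homography H, solve the D = 1 system for the increment,
    and prepend the incremental homography.  (Illustrative sketch.)"""
    p = np.column_stack([xy, np.ones(len(xy))]) @ H.T
    x, y = p[:, 0] / p[:, 2], p[:, 1] / p[:, 2]  # play the role of (x, y) in (6.24)
    xh, yh = xy_hat[:, 0], xy_hat[:, 1]
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    Jx = np.column_stack([x, y, ones, zeros, zeros, zeros, -x * x, -x * y])
    Jy = np.column_stack([zeros, zeros, zeros, x, y, ones, -x * y, -y * y])
    J = np.vstack([Jx, Jy])
    r = np.concatenate([xh - x, yh - y])
    dh, *_ = np.linalg.lstsq(J, r, rcond=None)
    dH = np.array([[1 + dh[0], dh[1],     dh[2]],
                   [dh[3],     1 + dh[4], dh[5]],
                   [dh[6],     dh[7],     1.0]])
    return dH @ H  # prepend the incremental homography to the current one
```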
⁶ Hartley and Zisserman (2004) call this strategy of forming linear equations from rational equations the direct linear transform, but that term is more commonly associated with pose estimation (Section 6.2). Note also that our definition of the $h_{ij}$ parameters differs from that used in their book, since we define $h_{ii}$ to be the difference from unity and we do not leave $h_{22}$ as a free parameter, which means that we cannot handle certain extreme homographies.