
286                                                                6 Feature-based alignment


                                $d_i^2$ can be computed as ratios of successive $d_i^{2n+2} / d_i^{2n}$ estimates and these can be averaged to
                                obtain a final estimate of $d_i^2$ (and hence $d_i$).
                                   Once the individual estimates of the $d_i$ distances have been computed, we can generate
                                a 3D structure consisting of the scaled point directions $d_i \hat{x}_i$, which can then be aligned with
                                the 3D point cloud $\{p_i\}$ using absolute orientation (Section 6.1.5) to obtain the desired
                                pose estimate. Quan and Lan (1999) give accuracy results for this and other techniques,
                                which use fewer points but require more complicated algebraic manipulations. The paper by
                                Moreno-Noguer, Lepetit, and Fua (2007) reviews more recent alternatives and also gives a
                                lower complexity algorithm that typically produces more accurate results.
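
The alignment step described above can be sketched in a few lines. This is a minimal illustration, assuming the SVD-based orthogonal Procrustes solution as the absolute orientation method (one common choice, not necessarily the one intended here). The function name `absolute_orientation` and the synthetic test data are hypothetical, and the distances $d_i$ are taken from ground truth rather than estimated.

```python
import numpy as np

def absolute_orientation(src, dst):
    """Rigid (R, t) minimizing sum_i ||R @ src_i + t - dst_i||^2 (SVD / Procrustes)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_mean - R @ src_mean
    return R, t

# Synthetic example (assumed data): world points p_i seen by a camera at (R_true, t_true).
rng = np.random.default_rng(0)
p = rng.uniform(-1.0, 1.0, size=(6, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 5.0])

cam = p @ R_true.T + t_true            # points in the camera frame
d = np.linalg.norm(cam, axis=1)        # distances d_i (here known exactly)
x_hat = cam / d[:, None]               # unit viewing directions

# Align the point cloud p_i with the scaled directions d_i * x_hat_i.
R_est, t_est = absolute_orientation(p, d[:, None] * x_hat)
```

With noise-free distances the recovered pose matches the true one up to rounding. With estimated distances the result is only a least-squares fit, which is one reason the text recommends refining it iteratively.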
                                   Unfortunately, because minimal PnP solutions can be quite noise sensitive and also suffer
                                from bas-relief ambiguities (e.g., depth reversals) (Section 7.4.3), it is often preferable to use
                                the linear six-point algorithm to guess an initial pose and then optimize this estimate using
                                the iterative technique described in Section 6.2.2.
                                   An alternative pose estimation algorithm involves starting with a scaled orthographic pro-
                                jection model and then iteratively refining this initial estimate using a more accurate perspec-
                                tive projection model (DeMenthon and Davis 1995). The attraction of this model, as stated
                                in the paper’s title, is that it can be implemented “in 25 lines of [Mathematica] code”.


                                6.2.2 Iterative algorithms

                                The most accurate (and flexible) way to estimate pose is to directly minimize the squared (or
                                robust) reprojection error for the 2D points as a function of the unknown pose parameters in
                                (R, t) and optionally K using non-linear least squares (Tsai 1987; Bogart 1991; Gleicher
                                and Witkin 1992). We can write the projection equations as

                                                            $x_i = f(p_i; R, t, K)$                     (6.42)
                                and iteratively minimize the robustified linearized reprojection errors

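                                $E_{\mathrm{NLP}} = \sum_i \rho\left( \frac{\partial f}{\partial R} \Delta R + \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial K} \Delta K - r_i \right)$,           (6.43)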
                                where $r_i = \tilde{x}_i - \hat{x}_i$ is the current residual vector (2D error in predicted position) and the
                                partial derivatives are with respect to the unknown pose parameters (rotation, translation, and
                                optionally calibration). Note that if full 2D covariance estimates are available for the 2D
                                feature locations, the above squared norm can be weighted by the inverse point covariance
                                matrix, as in Equation (6.11).
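
The iteration in (6.43) can be illustrated with a small Gauss-Newton loop. This is a sketch under several simplifying assumptions that are not from the text: the penalty $\rho$ is the plain squared norm (no robustification), the calibration is fixed to a single known focal length `f`, the derivatives are approximated by forward differences rather than computed analytically, and the rotation is updated by left-multiplying a small rotation built from a rotation vector. All function names are hypothetical.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix for the rotation vector w (axis times angle)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(p, R, t, f):
    """Pinhole projection f(p; R, t, K), assuming K = diag(f, f, 1)."""
    y = p @ R.T + t
    return f * y[:, :2] / y[:, 2:3]

def refine_pose(p, x_obs, R, t, f, iters=15, eps=1e-6):
    """Gauss-Newton on the squared reprojection error over (R, t)."""
    for _ in range(iters):
        x_pred = project(p, R, t, f)
        r = (x_obs - x_pred).ravel()          # stacked residuals r_i
        J = np.zeros((r.size, 6))             # d(prediction) / d(dR, dt)
        for k in range(6):
            delta = np.zeros(6)
            delta[k] = eps
            x_pert = project(p, rodrigues(delta[:3]) @ R, t + delta[3:], f)
            J[:, k] = (x_pert - x_pred).ravel() / eps
        # Solve J @ step = r in the least-squares sense.
        step = np.linalg.lstsq(J, r, rcond=None)[0]
        R = rodrigues(step[:3]) @ R
        t = t + step[3:]
    return R, t

# Synthetic example (assumed data): noise-free observations, rough initial pose.
rng = np.random.default_rng(1)
p = rng.uniform(-1.0, 1.0, size=(8, 3))
R_true = rodrigues(np.array([0.05, -0.1, 0.08]))
t_true = np.array([0.2, -0.1, 6.0])
f = 500.0
x_obs = project(p, R_true, t_true, f)

R_est, t_est = refine_pose(p, x_obs, np.eye(3), np.array([0.0, 0.0, 5.5]), f)
```

In practice the initial pose would come from a linear or minimal solver, and a robust $\rho$ would typically be handled by reweighting the rows of `J` and `r` at each iteration.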
                                   An easier to understand (and implement) version of the above non-linear regression prob-
                                lem can be constructed by re-writing the projection equations as a concatenation of simpler
                                steps, each of which transforms a 4D homogeneous coordinate $p_i$ by a simple transformation
                                such as translation, rotation, or perspective division (Figure 6.5). The resulting projection
                                equations can be written as
                                                     $y^{(1)} = f_T(p_i; c_j) = p_i - c_j$,               (6.44)
                                                     $y^{(2)} = f_R(y^{(1)}; q_j) = R(q_j) \, y^{(1)}$,         (6.45)
                                                     $y^{(3)} = f_P(y^{(2)}) = \frac{y^{(2)}}{z^{(2)}}$,                         (6.46)
                                                     $x_i = f_C(y^{(3)}; k) = K(k) \, y^{(3)}$.             (6.47)
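
The chain (6.44) to (6.47) maps directly onto four small functions. The sketch below makes several assumptions beyond the text: points are plain 3-vectors rather than 4D homogeneous coordinates, $q_j$ is a quaternion stored as (x, y, z, w), and the calibration vector $k$ holds a focal length and principal point (f, cx, cy). The helper names are hypothetical.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix R(q) for a quaternion q = (x, y, z, w); q is normalized here."""
    x, y, z, w = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])

def f_T(p, c):
    """(6.44): translate by the camera center c_j."""
    return p - c

def f_R(y1, q):
    """(6.45): rotate into the camera frame."""
    return quat_to_rot(q) @ y1

def f_P(y2):
    """(6.46): perspective division by the depth z."""
    return y2 / y2[2]

def f_C(y3, k):
    """(6.47): apply the calibration matrix K(k), assuming k = (f, cx, cy)."""
    f, cx, cy = k
    K = np.array([[f, 0.0, cx],
                  [0.0, f, cy],
                  [0.0, 0.0, 1.0]])
    return K @ y3

def project_chain(p, c, q, k):
    """Compose the four steps to get the homogeneous pixel coordinate x_i."""
    return f_C(f_P(f_R(f_T(p, c), q)), k)

p = np.array([1.0, 2.0, 10.0])
c = np.zeros(3)
k = (500.0, 320.0, 240.0)

x_identity = project_chain(p, c, [0.0, 0.0, 0.0, 1.0], k)   # no rotation
s = np.sqrt(0.5)
x_rot90z = project_chain(p, c, [0.0, 0.0, s, s], k)         # 90 degrees about z
```

Splitting the projection this way means each step has a simple derivative of its own, so the full Jacobian can be assembled with the chain rule instead of differentiating one large expression.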