  The element of the Jacobian matrix linked to the horizontal (y) feature position of point x_Fk in the real world and to the unknown state variable x_Sρ now becomes

    J_{k\rho y} = \partial y_k/\partial x_{S\rho} = \partial y_p/\partial x_{S\rho} = y_{pN} \cdot (e_{D\rho 2}/e_{N2} - e_{D\rho 4}/e_{N4}).        (2.24)
  The corresponding relation for the vertical feature position in the image is obtained in a similar way as

    J_{k\rho z} = \partial z_k/\partial x_{S\rho} = \partial z_p/\partial x_{S\rho} = z_{pN} \cdot (e_{D\rho 3}/e_{N3} - e_{D\rho 4}/e_{N4}).        (2.25)
  This approach is a very flexible scheme for obtaining the entries of the Jacobian matrix efficiently. Adaptations to changing scene trees, caused by new objects appearing with new unknown states to be determined visually, can thus be made easily.
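  For implementation, both entries can be evaluated directly from the nominally transformed homogeneous feature vector e_N and its partial derivative e_Dρ with respect to the state variable in question. The following Python sketch is only an illustration of Eqs. (2.24) and (2.25); the function name and the 0-based component indexing are not from the text:

    def jacobian_entries(e_N, e_D):
        # e_N: nominal homogeneous feature vector after all transformations
        #      and perspective projection (components e_N1 ... e_N4).
        # e_D: its partial derivative with respect to one state variable x_Srho
        #      (components e_Drho1 ... e_Drho4), taken from the derivative chain.
        y_pN = e_N[1] / e_N[3]      # nominal horizontal image coordinate e_N2/e_N4
        z_pN = e_N[2] / e_N[3]      # nominal vertical image coordinate   e_N3/e_N4
        # Quotient rule, written as in Eqs. (2.24)/(2.25); the algebraically
        # equivalent form (e_D[1]*e_N[3] - e_N[1]*e_D[3]) / e_N[3]**2 avoids
        # dividing by a vanishing nominal image coordinate.
        J_y = y_pN * (e_D[1] / e_N[1] - e_D[3] / e_N[3])
        J_z = z_pN * (e_D[2] / e_N[2] - e_D[3] / e_N[3])
        return J_y, J_z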
              The general approach discussed leaves two variants open to be selected for the
            actual case at hand:
1. Very few feature points for an object: In this case, it may be more economical with respect to computational load to multiply the sequence of transformations in Figure 2.11 from the left by the homogeneous 3-D feature point x_Fk (four components). Each such matrix-vector product requires only four inner vector products (= 25% of a matrix-matrix product). So, in total, 24 inner products are needed for the 6 matrix-vector products of one expression; for the 7 expressions in Figure 2.11, a total of 168 such products results.
2. Many feature points on an object: Multiplying (concatenating) the elemental transformation matrices for the seven expressions in Figure 2.11 from right to left requires, in a naive approach, at most 16 · 5 · 7 = 560 inner vector products. For each feature point in the real world on a single object, 7 · 4 = 28 additional inner vector products are needed to obtain the e-vector and its six partial derivatives. Asking for the number of features m on an object for which this approach is more economical than the one above, the relation m · 168 = 560 + m · 28 has to be solved for m as the break-even point, yielding m = 560/140 = 4.
  So, for more than four features on a single object (in our case with six unknowns in five transformation matrices plus perspective projection), concatenating the transformation matrices first and multiplying with the coordinates of the feature points x_Fk afterward is more computationally efficient.
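  This break-even consideration is easy to verify with a few lines of code. The following Python sketch (function names chosen here only for illustration) counts the inner vector products of both methods and searches for the smallest number of feature points at which concatenation becomes the cheaper option:

    def products_per_point_chain(n_expr=7, n_matvec_per_expr=6):
        # Method 1: each expression is evaluated as a chain of 4x4 matrix-vector
        # products starting from the homogeneous point x_Fk; each such product
        # costs 4 inner vector products.
        return n_expr * n_matvec_per_expr * 4            # = 168 per feature point

    def products_concatenation(m, n_expr=7, n_matmul_per_expr=5):
        # Method 2: concatenate the matrices of each expression once (16 inner
        # products per 4x4 matrix product), then apply the seven resulting
        # matrices to each of the m feature points (7 * 4 = 28 per point).
        return n_expr * n_matmul_per_expr * 16 + m * n_expr * 4   # = 560 + 28*m

    m = 1
    while products_concatenation(m) >= m * products_per_point_chain():
        m += 1
    print(m)   # 5, i.e., concatenation is strictly cheaper for more than four features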
  Considering that the derivative matrices are sparsely filled, as discussed above, and that many matrix products can be reused, frequently more than once, concatenation, the standard method in computer graphics, becomes of interest in computer vision as well. However, as Figure 2.11 shows, considerably more memory has to be allotted for the intermediate transformation variables (the partial derivative matrices and their products). Note that to the left of the derivative matrices of translations, all further products again reduce to a vector, as in method 1 above. Taking advantage of all these points, method 2 is usually more efficient for more than two to three feature points on an object.
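  The reuse of matrix products mentioned above can be organized, for example, by caching the partial products to the left and to the right of every position in a chain, so that each derivative expression of Figure 2.11 is assembled from factors that have already been computed. A minimal numpy sketch, with names and data layout chosen here for illustration only:

    import numpy as np

    def nominal_and_derivative_chains(T, D):
        # T: list of the elemental 4x4 transformation matrices of one chain
        #    (left to right, perspective projection included).
        # D: dict mapping the position i of a matrix that depends on an unknown
        #    state variable to its derivative matrix dT_i/dx_Srho.
        n = len(T)
        prefix = [np.eye(4)]                 # prefix[i] = T[0] @ ... @ T[i-1]
        for M in T:
            prefix.append(prefix[-1] @ M)
        suffix = [np.eye(4)]                 # built right to left
        for M in reversed(T):
            suffix.append(M @ suffix[-1])
        suffix.reverse()                     # suffix[i] = T[i] @ ... @ T[n-1]
        nominal = prefix[n]                  # full nominal concatenation
        # Each derivative chain reuses the cached prefix and suffix products.
        derivatives = {i: prefix[i] @ D[i] @ suffix[i + 1] for i in D}
        return nominal, derivatives

  The resulting 4 × 4 matrices are then applied to every homogeneous feature point x_Fk with four inner products each, as in method 2. In particular, the derivative matrix of a translation has only a single nonzero entry (in its last column), so any product involving it carries only the information of one column vector; exploiting this is the saving referred to above.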


            2.1.3 Time Representation

Time is considered an independent variable, monotonically increasing at a constant rate (as a good approximation to experience in the spatiotemporal domain of interest here). The temporal resolution required of measurement and control processes