Page 103 - Biomimetics : Biologically Inspired Technologies

Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c003 Final Proof page 89 21.9.2005 11:40pm




                    Mechanization of Cognition                                                   89

                    Figure 3.10  Gabor logon local image features. Logons are defined as images with real-valued pixel brightnesses
                    (i.e., both positive and negative values are allowed) defined by geometrical plane rotations and translations (in the
                    image plane) of the canonical two-dimensional functions $\sin(Gx)/\exp(Ex^2 + Fy^2)$ (called a sine logon) and
                    $\cos(Gx)/\exp(Ex^2 + Fy^2)$ (termed a cosine logon); where E, F, and G are positive constants and x and y are image plane
                    coordinates in the translated and rotated coordinates of the logon frame. Note that E and F define the principal axis
                    lengths of a two-dimensional Gaussian-type ellipsoid and G defines the spatial frequency of a plane grating (with
                    oscillations along the x-axis). The ratio E/F is fixed for all logons used. Each individual logon is considered as a (real
                    valued) digital image, that is as image vectors of the same dimension as the wide-angle video camera frames.
                    Malsburg, 1990). In the vision work done in my lab we have typically set the ratio E/F to 5/8 and the
                    G/E ratio to $3\pi/2$ for every logon in every jet (these are the values that seem to be used by domestic
                    cats; Hecht-Nielsen, 1990).
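
As a rough illustration of the logon definition in Figure 3.10, the sketch below evaluates a sine/cosine logon pair at a single pixel, taking the caption's formula literally and applying the E/F = 5/8 and G/E = 3π/2 ratios quoted above. The function name `logon_pair`, its argument layout, the rotation sign convention, and the numerical value chosen for E are all assumptions made for the example; the text does not specify how the scale values map onto the coefficient E.

```python
import math

def logon_pair(px, py, cx, cy, theta, E, F, G):
    """Return (sine_logon, cosine_logon) values at pixel (px, py).

    (cx, cy) is the logon centre and theta its orientation in radians;
    both are hypothetical parameter names used only for this sketch.
    """
    # Translate to the logon centre, then rotate into the logon frame.
    dx, dy = px - cx, py - cy
    x = dx * math.cos(theta) + dy * math.sin(theta)
    y = -dx * math.sin(theta) + dy * math.cos(theta)
    # Gaussian-type envelope: 1 / exp(E x^2 + F y^2).
    envelope = math.exp(-(E * x * x + F * y * y))
    return math.sin(G * x) * envelope, math.cos(G * x) * envelope

# Illustrative constants only: E is an arbitrary choice, F and G follow
# from the ratios E/F = 5/8 and G/E = 3*pi/2 quoted in the text.
E = 0.02
F = E * 8 / 5
G = E * 3 * math.pi / 2

# Values at the logon centre and at a pixel 300 units away along x.
s_centre, c_centre = logon_pair(100, 100, 100, 100, 0.0, E, F, G)
s_far, c_far = logon_pair(400, 100, 100, 100, 0.0, E, F, G)
```

At the centre the sine logon vanishes and the cosine logon takes its peak value, while far from the centre both values are effectively zero, which is the locality property the text relies on.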
                      Each jet consists, for example, of pairs of a sine logon and a cosine logon at each of seven scales
                    (E = 2, 3, 5, 9, 15, 20, and 35 pixel units) and 16 regularly spaced angular orientations, including
                    having the major ellipse axis of one logon pair vertical. Thus, each jet at each gridpoint has 224
                    logons. Again, each individual logon is viewed as a 67,108,864-dimensional floating point real
                    vector, with each component value given by the evaluation of its formula (Figure 3.10, properly
                    translated and rotated) evaluated at the pixel location corresponding to that component (obviously,
                    with most of its values at pixels distant from the gridpoint very close to zero). Thus, there are a total
                    of 224 × 263,169 = 58,949,856 logons in all of the gridpoint jets; almost as many as there are
                    pixels in the high-resolution camera image.
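
The counts in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are hypothetical. The observation that 263,169 = 513² and 67,108,864 = 8192² is plain arithmetic, and reading these as square grid layouts would be an assumption rather than something stated here.

```python
import math

SCALES = [2, 3, 5, 9, 15, 20, 35]   # the seven scale values E listed in the text
N_ORIENTATIONS = 16                 # regularly spaced angular orientations
LOGONS_PER_PAIR = 2                 # one sine logon plus one cosine logon
N_GRIDPOINTS = 263_169              # number of gridpoint jets implied by the text
N_PIXELS = 67_108_864               # dimension of each image (and logon) vector

# Logons in one jet: scales x orientations x (sine, cosine).
logons_per_jet = len(SCALES) * N_ORIENTATIONS * LOGONS_PER_PAIR

# Logons over all gridpoint jets.
total_logons = logons_per_jet * N_GRIDPOINTS

# Fraction of the pixel count, backing "almost as many as there are pixels".
logons_per_pixel = total_logons / N_PIXELS

# Both counts are perfect squares (513^2 and 8192^2).
grid_side = math.isqrt(N_GRIDPOINTS)
image_side = math.isqrt(N_PIXELS)

print(logons_per_jet, total_logons, round(logons_per_pixel, 3))
```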
                      The image feature vector V of a single camera (assumed to employ a progressive scan) frame
                    is defined to be the 29,474,928-dimensional vector obtained by first calculating the inner product
                    of each logon of each jet with the image vector, and then, to get each component of V, adding