Page 102 - Biomimetics : Biologically Inspired Technologies

Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c003 Final Proof page 88 21.9.2005 11:40pm




                    Figure 3.9  Vision cognition architecture. The raw input to the visual system is a wide-angle high-resolution video
                    camera (large frame shown in the lower right of the figure). A subimage, of a permanently fixed size (say
                    1024 × 1024 pixels) of a single video frame (shown as a square within the large frame), termed the eyeball
                    image, is determined by the location of its center (depicted by the intersection of crosshairs), known as the fixation
                    point. The gaze controller uses the entire large frame to select a single fixation point, if it deems that such a
                    selection is warranted for this large frame (it only attempts to select a fixation point when processing of the last
                    eyeball image has been completed). For simplicity, it is assumed that the video camera is fixed and is able to see
                    the entire visual scene of interest (e.g., a camera viewing a busy downtown intersection). The confabulation
                    architecture used for visual processing is described in the text.
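
The relationship between the large frame, the fixation point, and the fixed-size eyeball image can be sketched as a simple crop. This is only an illustration, not the system's actual code: the function name `eyeball_image` and the list-of-rows frame representation are hypothetical, and the border handling (shifting the window inward so it keeps its full size) is an assumption, since the caption does not say what happens when the fixation point lies near the edge of the frame.

```python
def eyeball_image(frame, fixation, size):
    """Return the size x size subimage of `frame` centered on `fixation`.

    frame:    list of rows, each a list of pixel values (hypothetical format).
    fixation: (row, col) of the fixation point.
    size:     the permanently fixed side length, e.g. 1024.

    Assumption (not from the text): near the frame border the window is
    shifted inward so that it always keeps the full fixed size.
    """
    height, width = len(frame), len(frame[0])
    if size > height or size > width:
        raise ValueError("eyeball image is larger than the frame")
    row, col = fixation
    # Clamp the top-left corner so the window stays inside the frame.
    top = min(max(row - size // 2, 0), height - size)
    left = min(max(col - size // 2, 0), width - size)
    return [r[left:left + size] for r in frame[top:top + size]]


# Small stand-in for a large frame: an 8 x 8 grid of distinct values.
frame = [[r * 8 + c for c in range(8)] for r in range(8)]
window = eyeball_image(frame, fixation=(4, 4), size=4)
```

Other border policies (padding, or rejecting fixation points too close to the edge) would fit the caption equally well.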



                    observer is viewing the video, it is important that they be carrying out whatever specific task or
                    tasks that the automated vision system will be asked to carry out (e.g., spotting people, pets,
                    bicycles, and cars).
                       After the human observer has viewed many tens of hours of video while carrying out the
                    function that the machine visual cognition system will later perform, and their eye movements
                    have been recorded, the result is a record of their fixation point choices, each paired with the
                    still frame (the specific scene content) in view when that choice was made. This record is then used to train a multi-layer
                    perceptron (Hecht-Nielsen, 2004) to carry out the gaze control function. The basic idea is simple.
                    Each frame of high-resolution video is described by an image feature vector V. This feature vector
                    is produced by first taking the inner product of each of a collection of Gabor logons with the image
                    frame (both considered as vectors of the same dimension). The specific Gabor logons used in
                    forming V (each logon is defined by the constants E, F, and G, and by its position and angle of plane
                    rotation in the image — see Figure 3.10) are now described.
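
The first step described above, taking the inner product of each logon with the frame, can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's definition: the logon here is a generic real-valued Gabor function (isotropic Gaussian envelope times a cosine carrier), and the parameters `sigma`, `wavelength`, and `theta` are hypothetical stand-ins. No attempt is made to map them onto the constants E, F, and G of Figure 3.10, and any later processing of these inner products is not shown.

```python
import math


def gabor_logon(height, width, center, sigma, wavelength, theta):
    """Sample a generic real Gabor function on a height x width pixel grid.

    center is (row, col); sigma, wavelength, and theta are hypothetical
    parameters, not the E, F, G constants used in the text.
    """
    cy, cx = center
    kernel = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Coordinate along the carrier direction after rotation by theta.
            u = dx * math.cos(theta) + dy * math.sin(theta)
            envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * u / wavelength))
        kernel.append(row)
    return kernel


def inner_product(a, b):
    """Inner product of two equal-size images treated as flat vectors."""
    return sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def feature_vector(frame, logons):
    """One component per logon: its inner product with the whole frame."""
    return [inner_product(frame, g) for g in logons]


# Two logons at the same position with different rotation angles.
logons = [gabor_logon(9, 9, (4, 4), 2.0, 4.0, angle)
          for angle in (0.0, math.pi / 2)]
frame = [[float((r + c) % 3) for c in range(9)] for r in range(9)]
V = feature_vector(frame, logons)
```

Each logon is sampled at the full frame size here so that logon and frame really are vectors of the same dimension, as the text requires. A practical implementation would likely exploit the fact that the envelope is negligible far from the logon's center.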
                       First, we create a fixed rectangular set of gridpoints located at equal pixel spacings across the
                    entire high-resolution video camera frame (Caid and Hecht-Nielsen, 2001, 2004; Daugman, 1985,
                    1987, 1988a,b; Daugman and Kammen, 1987; Hecht-Nielsen, 1990; Hecht-Nielsen and Zhou,
                    1995). For example, if each video camera image frame were an 8,192 × 8,192 pixel digital image,
                    with a 16-bit panchromatic grayscale, or equivalently, a 67,108,864-dimensional floating point real
                    vector with integer components between 0 and 65,535, then we might have gridpoints spaced every
                    16 pixels vertically and horizontally, with gridpoints on the image edges, for a total of 513 × 513 =
                    263,169 gridpoints.
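
The numbers in this example can be checked with a few lines of arithmetic. The helper name `gridpoint_count` is hypothetical. The sketch assumes, as the example's total implies, that gridpoints sit at offsets 0, 16, ..., 8,192 along each side, so that both edges carry a gridpoint.

```python
def gridpoint_count(side_pixels, spacing):
    """Gridpoints per side and in total for a square frame.

    Assumes gridpoints at offsets 0, spacing, ..., side_pixels, which
    places one on each edge and requires spacing to divide side_pixels.
    """
    if side_pixels % spacing != 0:
        raise ValueError("spacing must divide the frame side evenly")
    per_side = side_pixels // spacing + 1
    return per_side, per_side * per_side


frame_side = 8192
dimension = frame_side * frame_side                 # components in the image vector
per_side, total = gridpoint_count(frame_side, 16)   # gridpoints per side, in total
max_gray = 2 ** 16 - 1                              # largest 16-bit grayscale value
```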
                       At each gridpoint we create a set of Gabor logons centered at that position, each having a
                    specified rotation angle and E, F, and G values. The set of logons at each gridpoint is exactly the
                    same, save for their translated position. This set, which is now described, is termed a jet (von der