Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c003 Final Proof page 95 21.9.2005 11:40pm
                    Mechanization of Cognition                                                   95

                      During each secondary layer training episode, sequences of, say, four to six consecutive eyeball
                    images of the same fixation point on a dynamically pose-changing object of interest (extracted at
                    random from one of the training set sequences) are used in order. As described above, as each
                    eyeball image in the sequence is entered, the above segmenting process is applied to it — yielding
                    expectations on a subset of the primary layer lexicons. After the first eyeball image of the first
                    training episode sequence has been so represented, a symbol is formed in each secondary lexicon
                    and that symbol is bi-directionally connected from and to each of the active expectation symbols of
                    the primary layer (by setting the relevant knowledge link strengths to 1). The second eyeball
                    image of the sequence is then entered and segmented. The same secondary lexicon symbols created
                    using the first eyeball image of the sequence are then connected from and to all of the primary
                    symbols of this processed second eyeball image for which connections exist. And so forth for the
                    remaining eyeball images of the sequence used on this first training episode. On subsequent training
                    episodes we proceed in exactly the same manner.
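As a concrete, purely illustrative sketch of one such training episode: the Python fragment below creates a new symbol in each secondary lexicon for the first eyeball image and then bidirectionally links it, with strength 1, to the primary expectation symbols of every image in the sequence. The `Lexicon` class, the `segment` callback, and the flat link table are assumptions of this sketch, not the chapter's implementation.

```python
# Illustrative sketch only: the data structures are assumptions, not the
# chapter's implementation.

class Lexicon:
    """A lexicon that hands out integer symbols in creation order."""
    def __init__(self):
        self.symbols = []

    def new_symbol(self):
        sym = len(self.symbols)
        self.symbols.append(sym)
        return sym

def train_episode(eyeball_images, segment, secondary_lexicons, links):
    """Run one secondary-layer training episode.

    segment(image) -> {primary_lexicon_name: [expectation symbols]}
    links: dict keyed by (src_lexicon, src_symbol, dst_lexicon, dst_symbol).
    """
    # One new symbol per secondary lexicon, formed for the first image of
    # the episode and reused for all later images in the sequence.
    new_syms = {name: lex.new_symbol()
                for name, lex in secondary_lexicons.items()}
    for image in eyeball_images:
        expectations = segment(image)
        for sec_name, sec_sym in new_syms.items():
            for plex, psyms in expectations.items():
                for psym in psyms:
                    # Bi-directional knowledge links, strengths set to 1.
                    links[(plex, psym, sec_name, sec_sym)] = 1
                    links[(sec_name, sec_sym, plex, psym)] = 1
    return new_syms
```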
                      Clearly, one symbol is typically going to be added to each secondary lexicon on each training
                    trial. We stop training when the vast majority of new secondary symbols turn out to be equivalent to
                    existing symbols — as measured by noting that, of those secondary lexicons which are receiving
                    knowledge link inputs from primary symbols, each such secondary lexicon already has a symbol
                    that simultaneously receives links from at least one expectation symbol of each non-null primary
                    lexicon from which that secondary lexicon receives a knowledge base. In other words, training is
                    stopped when the vast majority of segmented eyeball images can be well represented by secondary
                    symbols which have already been created.
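The stopping test can likewise be sketched as a predicate over one segmented image. The `feeds` topology map (which primary lexicons are wired to which secondary lexicon) and the set-of-tuples link representation are assumptions made for this sketch.

```python
def well_represented(expectations, secondary_symbols, feeds, links):
    """Return True if every secondary lexicon receiving input already has a
    symbol linked to at least one expectation symbol of each non-null
    primary lexicon that feeds it (the stopping criterion in the text).

    expectations:      {primary_lexicon: set of expectation symbols}
    secondary_symbols: {secondary_lexicon: existing symbols}
    feeds:             {secondary_lexicon: primary lexicons wired to it}
    links:             set of (plex, psym, slex, ssym) tuples of strength 1
    """
    for slex, ssyms in secondary_symbols.items():
        # Only non-null primary lexicons feeding this secondary lexicon count.
        relevant = [p for p in feeds[slex] if expectations.get(p)]
        if not relevant:
            continue  # this secondary lexicon receives no input for this image
        covered = any(
            all(any((plex, psym, slex, s) in links
                    for psym in expectations[plex])
                for plex in relevant)
            for s in ssyms)
        if not covered:
            return False
    return True
```

Training would stop once this predicate holds for the vast majority of segmented eyeball images.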
                      Once training has been stopped, we then use the same training set again for consolidating the
                    symbols. This involves using the primary symbols representing each eyeball image as inputs to
                    the secondary lexicons to which they connect by strengthened connections. Whenever multiple
                    symbols of a secondary lexicon receive links from one primary symbol of each primary lexicon
                    from which that secondary lexicon receives links, then those secondary lexicon symbols are
                    merged. Merging simply means that all of the primary to secondary links that went to each of the
                    symbols being merged now go to the merged symbol (and vice versa for the secondary to primary
                    links). What merging does is combine symbols which represent intersecting pose–space trajectories
                    for the same object; thus, increasing the pose-insensitivity of the merged symbol.
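A minimal sketch of the merge itself, under the assumption (not stated in the text) that the lowest-numbered symbol survives and absorbs the primary-to-secondary and secondary-to-primary links of the others:

```python
def merge_symbols(slex, candidates, links):
    """Merge several symbols of secondary lexicon `slex` into one.

    Assumption for this sketch: the lowest-numbered candidate survives.
    links: dict keyed by (src_lexicon, src_symbol, dst_lexicon, dst_symbol).
    """
    survivor, *others = sorted(candidates)
    absorbed = set(others)
    merged = {}
    for key, strength in links.items():
        a_lex, a_sym, b_lex, b_sym = key
        # Redirect secondary-to-primary links of absorbed symbols.
        if a_lex == slex and a_sym in absorbed:
            key = (a_lex, survivor, b_lex, b_sym)
        # Redirect primary-to-secondary links of absorbed symbols.
        if b_lex == slex and b_sym in absorbed:
            key = (key[0], key[1], b_lex, survivor)
        merged[key] = strength
    links.clear()
    links.update(merged)
    return survivor
```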
                      Once the secondary layer lexicons are built and merged (and the knowledge bases between
                    the primary and secondary layers frozen), the last step is to train the knowledge bases between the
                    secondary layer lexicons. This is done by entering single eyeball images from the training set,
                    segmenting and representing each image using the primary layer (as during training), carrying out a
                    W operation on each secondary lexicon, and recording the symbol co-occurrence counts for each secondary
                    layer knowledge base.
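This last step amounts to co-occurrence counting over the W winners. A hedged sketch, assuming one winning symbol per secondary lexicon per image and leaving the count-to-strength normalization unspecified:

```python
from collections import Counter
from itertools import permutations

def count_cooccurrences(winners_per_image):
    """Accumulate ordered-pair co-occurrence counts across training images.

    winners_per_image: one dict per image, {secondary_lexicon: winning symbol}
    (the W operation's output; a sketch assumption, not the chapter's API).
    The raw counts would later be turned into knowledge-link strengths by a
    normalization scheme not specified here.
    """
    counts = Counter()
    for winners in winners_per_image:
        # Count every ordered pair of co-occurring (lexicon, symbol) winners.
        for (la, sa), (lb, sb) in permutations(winners.items(), 2):
            counts[(la, sa, lb, sb)] += 1
    return counts
```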
                      When all of this is done, the secondary to tertiary knowledge bases (and their inverses) are built
                    using the same method as described above for the primary to secondary knowledge bases, except that
                    this time each training episode uses the entire set of eyeball images of each training set sequence.
                    The resulting tertiary lexicon symbols are then merged and the tertiary layer inter-lexicon
                    knowledge bases are built. This completes development of the vision module.

                    3.5.4 How Is the Visual Module Used?

                    After all of the lexicons and knowledge bases of the visual module of Figure 3.9 are built, the
                    module is ready for use. This subsection briefly sketches how it can be used.
                      Given a new frame of imagery in which the gaze controller has found a fixation point,
                    the primary layer of the visual module segments and represents the attended object with expect-
                    ations, just as during the later phases of training and education. The symbols of the non-null
                    expectations of primary lexicons then transmit to other primary lexicons and to secondary layer
                    lexicons via the established knowledge bases. The other primary layer and the secondary layer