Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c003 Final Proof page 95 21.9.2005 11:40pm
                    Mechanization of Cognition                                                   95

                      During each secondary layer training episode, sequences of, say, four to six consecutive eyeball
                    images of the same fixation point on a dynamically pose-changing object of interest (extracted at
                    random from one of the training set sequences) are used in order. As described above, as each
                    eyeball image in the sequence is entered, the above segmenting process is applied to it — yielding
                    expectations on a subset of the primary layer lexicons. After the first eyeball image of the first
                    training episode sequence has been so represented, a symbol is formed in each secondary lexicon
                    and that symbol is bi-directionally connected from and to each of the active expectation symbols of
                    the primary layer (by setting the relevant knowledge link strengths to 1). The second eyeball
                    image of the sequence is then entered and segmented. The same secondary lexicon symbols created
                    using the first eyeball image of the sequence are then connected from and to all of the primary
                    symbols of this processed second eyeball image for which connections exist. And so forth for the
                    remaining eyeball images of the sequence used on this first training episode. On subsequent training
                    episodes we proceed in exactly the same manner.
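As a concrete, purely illustrative sketch of one such training episode: the Python fragment below creates a new symbol in each secondary lexicon for the first eyeball image and then bidirectionally links it, with strength 1, to the primary expectation symbols of every image in the sequence. The `Lexicon` class, the `segment` callback, and the flat link table are assumptions of this sketch, not the chapter's implementation.

```python
# Illustrative sketch only: the data structures are assumptions, not the
# chapter's implementation.

class Lexicon:
    """A lexicon that hands out integer symbols in creation order."""
    def __init__(self):
        self.symbols = []

    def new_symbol(self):
        sym = len(self.symbols)
        self.symbols.append(sym)
        return sym

def train_episode(eyeball_images, segment, secondary_lexicons, links):
    """Run one secondary-layer training episode.

    segment(image) -> {primary_lexicon_name: [expectation symbols]}
    links: dict keyed by (src_lexicon, src_symbol, dst_lexicon, dst_symbol).
    """
    # One new symbol per secondary lexicon, formed for the first image of
    # the episode and reused for all later images in the sequence.
    new_syms = {name: lex.new_symbol()
                for name, lex in secondary_lexicons.items()}
    for image in eyeball_images:
        expectations = segment(image)
        for sec_name, sec_sym in new_syms.items():
            for plex, psyms in expectations.items():
                for psym in psyms:
                    # Bi-directional knowledge links, strengths set to 1.
                    links[(plex, psym, sec_name, sec_sym)] = 1
                    links[(sec_name, sec_sym, plex, psym)] = 1
    return new_syms
```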
                      Clearly, one symbol is typically going to be added to each secondary lexicon on each training
                    trial. We stop training when the vast majority of new secondary symbols turn out to be equivalent to
                    existing symbols — as measured by noting that, of those secondary lexicons which are receiving
                    knowledge link inputs from primary symbols, each such secondary lexicon already has a symbol
                    that simultaneously receives links from at least one expectation symbol of each non-null primary
                    lexicon from which that secondary lexicon receives a knowledge base. In other words, training is
                    stopped when the vast majority of segmented eyeball images can be well represented by secondary
                    symbols which have already been created.
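The stopping test can likewise be sketched as a predicate over one segmented image. The `feeds` topology map (which primary lexicons are wired to which secondary lexicon) and the set-of-tuples link representation are assumptions made for this sketch.

```python
def well_represented(expectations, secondary_symbols, feeds, links):
    """Return True if every secondary lexicon receiving input already has a
    symbol linked to at least one expectation symbol of each non-null
    primary lexicon that feeds it (the stopping criterion in the text).

    expectations:      {primary_lexicon: set of expectation symbols}
    secondary_symbols: {secondary_lexicon: existing symbols}
    feeds:             {secondary_lexicon: primary lexicons wired to it}
    links:             set of (plex, psym, slex, ssym) tuples of strength 1
    """
    for slex, ssyms in secondary_symbols.items():
        # Only non-null primary lexicons feeding this secondary lexicon count.
        relevant = [p for p in feeds[slex] if expectations.get(p)]
        if not relevant:
            continue  # this secondary lexicon receives no input for this image
        covered = any(
            all(any((plex, psym, slex, s) in links
                    for psym in expectations[plex])
                for plex in relevant)
            for s in ssyms)
        if not covered:
            return False
    return True
```

Training would stop once this predicate holds for the vast majority of segmented eyeball images.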
                      Once training has been stopped, we then use the same training set again for consolidating the
                    symbols. This involves using the primary symbols representing each eyeball image as inputs to
                    the secondary lexicons to which they connect by strengthened connections. Whenever multiple
                    symbols of a secondary lexicon receive links from one primary symbol of each primary lexicon
                    from which that secondary lexicon receives links, then those secondary lexicon symbols are
                    merged. Merging simply means that all of the primary to secondary links that went to each of the
                    symbols being merged now go to the merged symbol (and vice versa for the secondary to primary
                    links). What merging does is combine symbols which represent intersecting pose–space trajectories
                    for the same object; thus, increasing the pose-insensitivity of the merged symbol.
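A minimal sketch of the merge itself, under the assumption (not stated in the text) that the lowest-numbered symbol survives and absorbs the primary-to-secondary and secondary-to-primary links of the others:

```python
def merge_symbols(slex, candidates, links):
    """Merge several symbols of secondary lexicon `slex` into one.

    Assumption for this sketch: the lowest-numbered candidate survives.
    links: dict keyed by (src_lexicon, src_symbol, dst_lexicon, dst_symbol).
    """
    survivor, *others = sorted(candidates)
    absorbed = set(others)
    merged = {}
    for key, strength in links.items():
        a_lex, a_sym, b_lex, b_sym = key
        # Redirect secondary-to-primary links of absorbed symbols.
        if a_lex == slex and a_sym in absorbed:
            key = (a_lex, survivor, b_lex, b_sym)
        # Redirect primary-to-secondary links of absorbed symbols.
        if b_lex == slex and b_sym in absorbed:
            key = (key[0], key[1], b_lex, survivor)
        merged[key] = strength
    links.clear()
    links.update(merged)
    return survivor
```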
                      Once the secondary layer lexicons are built and merged (and the knowledge bases between
                    the primary and secondary layers frozen), the last step is to train the knowledge bases between the
                    secondary layer lexicons. This is done by entering single eyeball images from the training set,
                    segmenting and representing each image using the primary layer (as during training), carrying out a
                    W operation on each secondary lexicon, and recording the symbol co-occurrence counts for each secondary
                    layer knowledge base.
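This last step amounts to co-occurrence counting over the W winners. A hedged sketch, assuming one winning symbol per secondary lexicon per image and leaving the count-to-strength normalization unspecified:

```python
from collections import Counter
from itertools import permutations

def count_cooccurrences(winners_per_image):
    """Accumulate ordered-pair co-occurrence counts across training images.

    winners_per_image: one dict per image, {secondary_lexicon: winning symbol}
    (the W operation's output; a sketch assumption, not the chapter's API).
    The raw counts would later be turned into knowledge-link strengths by a
    normalization scheme not specified here.
    """
    counts = Counter()
    for winners in winners_per_image:
        # Count every ordered pair of co-occurring (lexicon, symbol) winners.
        for (la, sa), (lb, sb) in permutations(winners.items(), 2):
            counts[(la, sa, lb, sb)] += 1
    return counts
```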
                      When all of this is done, the secondary to tertiary knowledge bases (and their inverses) are built
                    using the same method as described above for the primary to secondary knowledge bases, except that
                    this time each training episode uses the entire set of eyeball images of each training set sequence.
                    The resulting tertiary lexicon symbols are then merged and the tertiary layer inter-lexicon
                    knowledge bases are built. This completes development of the vision module.

                    3.5.4 How Is the Visual Module Used?

                    After all of the lexicons and knowledge bases of the visual module of Figure 3.9 are built, the
                    module is ready for use. This subsection briefly sketches how it can be used.
                      Given a new frame of imagery in which the gaze controller has found a fixation point,
                    the primary layer of the visual module segments and represents the attended object with expect-
                    ations, just as during the later phases of training and education. The symbols of the non-null
                    expectations of primary lexicons then transmit to other primary lexicons and to secondary layer
                    lexicons via the established knowledge bases. The other primary layer and the secondary layer