Page 106 - Biomimetics : Biologically Inspired Technologies
Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c003 Final Proof page 92 21.9.2005 11:40pm
Once the primary lexicon symbol sets are developed, the next step is to develop the knowledge
bases between these lexicons. For simplicity, we can assume that every primary visual lexicon is
connected to every other by a knowledge base.
The knowledge bases of the primary visual layer (i.e., the primary visual lexicons and the
knowledge bases linking them) are trained using large quantities of new video gathered from the
operational source, with the gaze controller selecting fixation points. Again, it is somehow arranged that
each eyeball image contains only an object of operational interest at the fixation point and no visual
elements of other objects (i.e., the rest of the eyeball image is blank).
As each eyeball image vector U is created and its selected subsidiary components (making up
the 81 primary visual lexicon input vectors) are sent to the primary visual lexicons, each lexicon
expresses an expectation with the, say, 10 symbols whose associated codebook vectors lie closest to
its input vector. Count accumulation then takes place for all (unidirectional) links between pairs of
these expectation symbols lying on different lexicons.
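The expectation and count-accumulation step just described can be sketched as follows. This is only an illustrative sketch in standard-library Python, not the chapter's implementation: the function names (expectation, accumulate_counts, new_counts), the nested-dictionary layout of the count matrices, and the toy two-lexicon data are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import permutations
import math


def expectation(codebook, x, k=10):
    """Return the indices of the k codebook vectors closest (Euclidean) to x."""
    ranked = sorted((math.dist(v, x), i) for i, v in enumerate(codebook))
    return [i for _, i in ranked[:k]]


def new_counts():
    """One count table per ordered lexicon pair (hypothetical layout)."""
    return defaultdict(lambda: defaultdict(int))


def accumulate_counts(counts, expectations):
    """Add one count for every ordered pair of expectation symbols
    lying on different lexicons (links are unidirectional, so both
    (A, B) and (B, A) lexicon orderings are visited)."""
    for src_lex, dst_lex in permutations(expectations, 2):
        kb = counts[(src_lex, dst_lex)]
        for src_sym in expectations[src_lex]:
            for dst_sym in expectations[dst_lex]:
                kb[(src_sym, dst_sym)] += 1


# Toy data: two lexicons with three 2-D codebook vectors each (made up).
codebooks = {
    0: [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)],
    1: [(0.0, 1.0), (4.0, 4.0), (9.0, 9.0)],
}
inputs = {0: (0.2, 0.1), 1: (3.5, 4.2)}

counts = new_counts()
demo_expectations = {
    lex: expectation(codebooks[lex], inputs[lex], k=2) for lex in codebooks
}
accumulate_counts(counts, demo_expectations)
```

In a real system k would be about 10, there would be 81 lexicons, and accumulate_counts would be called once per eyeball image.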
The idea of using the ten closest symbols is based upon the discovery (Caid and Hecht-Nielsen,
2001, 2004) that jet correlation vectors which are near to one another in the Euclidean
metric (i.e., in the VQ space of a lexicon) represent local visual appearances that are (to a human
observer) visually similar to each other; AND VICE VERSA. This valuable fact was pointed out
in the 1980s by John Daugman (Daugman, 1985, 1987, 1988a,b; Daugman and Kammen, 1987)
(Daugman also invented the iris scan biometric signature). This way, symbols which could
reasonably occur together meaningfully within the same object become linked. This is much
more efficient and effective than if each lexicon simply expressed the one closest symbol;
and yet, because of Daugman’s important principle, no harm can come of this expansion to
multiple symbols. The key point is that counts are kept between each of the combinatorially
many ordered excited symbol pairs (of symbols on different lexicons) involved. The process of
deriving the p(ψ|λ) knowledge link strengths ensures that only the meaningful links are retained
in the end.
As training progresses, the p(ψ|λ) knowledge link strengths are periodically calculated from the
symbol co-occurrence count matrices (of which there is one for each knowledge base). When the
meaningful p(ψ|λ) values stop changing much, training is ended. The primary visual layer is now
complete.
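As a rough illustration of this periodic calculation and stopping test, consider the sketch below. Several things in it are assumptions rather than statements from the text: the link strength is read as the conditional probability of the source symbol given the target symbol and is estimated as the pair count divided by the total count for that target symbol; the min_count pruning rule and the tol stopping tolerance are hypothetical stand-ins for "meaningful" and "stop changing much"; and the function names and toy counts are invented for the example.

```python
def link_strengths(kb_counts, min_count=1):
    """Estimate link strengths for one knowledge base.

    kb_counts maps (source_symbol, target_symbol) -> co-occurrence count.
    Each strength is count(source, target) / total count for that target
    (assumed estimator). Pairs seen fewer than min_count times are
    dropped (hypothetical pruning rule).
    """
    totals = {}
    for (_, target), c in kb_counts.items():
        totals[target] = totals.get(target, 0) + c
    return {
        (source, target): c / totals[target]
        for (source, target), c in kb_counts.items()
        if c >= min_count
    }


def has_converged(prev, curr, tol=0.01):
    """True when no link strength moved by more than tol between two
    successive evaluations (links missing from one side count as 0)."""
    keys = set(prev) | set(curr)
    return all(abs(prev.get(k, 0.0) - curr.get(k, 0.0)) <= tol for k in keys)


# Toy count matrices from an earlier and a later point in training.
counts_early = {("a", "x"): 3, ("b", "x"): 1, ("a", "y"): 2}
counts_late = {("a", "x"): 30, ("b", "x"): 10, ("a", "y"): 20}

p_early = link_strengths(counts_early)
p_late = link_strengths(counts_late)
training_done = has_converged(p_early, p_late)
```

Here the relative frequencies are identical at both checkpoints, so the stopping test is satisfied even though the raw counts have grown tenfold.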
3.5.3 Building the Secondary and Tertiary Visual Layers
After completion of the primary visual layer, it is time to build the secondary and tertiary visual
layers. However, this process again requires that the primary visual layer representation of each
eyeball image pertain to only one object — which can now be accomplished using the primary
layer’s knowledge bases, as described next.
Figure 3.11 shows a portion of a frame from the wide-angle high-resolution panchromatic video
camera containing an eyeball image that has been selected by the gaze controller. Each of the 81
primary visual lexicons shown is receiving its input vector from this eyeball image. The first thing
that happens is that each lexicon expresses an expectation consisting of those (again, say, 10)
symbols which were closest to that lexicon’s input vector. (Note: This is similar to a C1F effect,
except that the inputs are not coming from knowledge links, but from “extra-cortical sensory
afferents.” This illustrates, as does the handling of the S vector by primary sound lexicons discussed
in Section 3.4, how the handling of these special external sensory inputs is very similar to the
handling of knowledge link inputs.)
Once the primary visual lexicon expectations are established, knowledge links proceeding from
the central lexicon of the primary layer, and its immediate neighboring lexicons, outward are
enabled (allowing all symbols of all expectations of those lexicons to transmit) and the distal
lexicons that these links target receive C1F commands. Those distal lexicons that do not receive
links to symbols of their (previously established and frozen) expectations describing their portion of