
                    object. These sentences are designed to convey to the blind person useful information about the
                    nature of the object and its visual attributes (information that can be extracted by the human
                    educator just by looking at the visual representation of the object).
                      To train the links from the vision module to the language module (every visual lexicon is
                    afforded a knowledge base to every phrase lexicon), the educator’s sentences are entered, in order,
                    into the word lexicons of the sentence modules (each of which represents one sentence — see
                    Figure 3.12); each sentence is parsed into phrases (see Section 3.4); and these phrases are
                    represented on the sentence summary lexicon of each sentence. Counts are accumulated between
                    the symbols active on the visual module’s tertiary lexicons and those active on the summary
                    lexicons. If the educator wishes to describe specific visual subcomponents of the object, they
                    may designate a local window in the eyeball image for each subcomponent and supply the
                    sentence(s) describing each such subcomponent. The secondary and tertiary lexicon symbols
                    representing the subcomponents within each image are then linked to the summary lexicons of
                    the associated sentences. Before being used in this application, all of the internal knowledge bases
                    of the language module have already been trained using a huge text training corpus.
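The count-accumulation step described above lends itself to a compact illustration. The sketch below assumes that the symbols active on a visual tertiary lexicon and on a sentence summary lexicon are available as plain string identifiers, and that each knowledge base is simply a table of pairwise use counts; the function name, the dictionary layout, and the example symbol names are illustrative, not taken from the chapter.

```python
from collections import defaultdict

def accumulate_counts(knowledge_base, visual_symbols, summary_symbols):
    """Increment the use count for every (visual, summary) symbol pair
    that is simultaneously active during one education example."""
    for v in visual_symbols:
        for s in summary_symbols:
            knowledge_base[v][s] += 1

# One such count table per (visual lexicon, sentence summary lexicon) pair.
kb = defaultdict(lambda: defaultdict(int))

# One education step: symbols active on a visual tertiary lexicon and on the
# summary lexicon of one educator sentence (hypothetical symbol names).
accumulate_counts(kb,
                  visual_symbols=["vis_mailbox_17", "vis_blue_03"],
                  summary_symbols=["phr_a_blue_mailbox", "phr_stands_on_post"])
```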
                      After a sufficient number of education examples have been accumulated (as determined by final
                    performance — described below), the link use counts are converted into p(c|λ) probabilities and
                    frozen. The knowledge bases from the visual module’s lexicons to all of the sentence summary
                    lexicons are then combined (so that the available long-range context can be exploited by a sentence
                    in any position in the sequence of sentences to be generated). The annotation system is now ready
                    for testing.
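As a rough illustration of the freezing and combining steps, the sketch below converts accumulated use counts into conditional link strengths and then pools the per-sentence knowledge bases into a single set of links. The chapter does not spell out the normalization, smoothing, or pooling rule here, so the choices below (normalizing by the summary symbol's total count, keeping the strongest link when bases overlap) are assumptions made only for illustration.

```python
from collections import defaultdict

def freeze_to_probabilities(kb, min_count=1):
    """Turn accumulated use counts into p(c|λ)-style link strengths.

    Assumption: each count n(v, s) is normalized by the total count of the
    summary symbol s; pairs seen fewer than min_count times are discarded.
    """
    totals = defaultdict(int)
    for v, targets in kb.items():
        for s, n in targets.items():
            totals[s] += n
    return {(v, s): n / totals[s]
            for v, targets in kb.items()
            for s, n in targets.items()
            if n >= min_count}

def combine_knowledge_bases(frozen_kbs):
    """Pool the frozen links from every per-sentence knowledge base so that a
    sentence in any position can exploit the full long-range visual context.
    When the same (visual, summary) pair appears in several bases, the
    strongest link is kept (an illustrative choice, not the chapter's)."""
    combined = {}
    for frozen in frozen_kbs:
        for pair, p in frozen.items():
            combined[pair] = max(p, combined.get(pair, 0.0))
    return combined
```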
                      The testing phase is carried out by having a sighted evaluator walk down the street wearing the
                    system (yes, the idea is that the entire system is in the form of a pair of glasses!). As the visual
                    module selects and describes each object, knowledge link inputs are sent to the language module.
                    These inputs are used, much as in the example of Section 3.3: as context that drives formation of
                    a sentence (only now there is no starter). Using consensus building (and separate sentence starter
                    generator and sentence terminator subsystems — not shown in Figure 3.12 and not discussed here
Figure 3.12  Image text annotation. A simple example of linking a visual module with a (text) language module. See text for description.