Page 134 - Biomimetics : Biologically Inspired Technologies

Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c003 Final Proof page 120 21.9.2005 11:41pm




                    Figure 3.A.8  Learning and using the precise associations from symbols to action commands within a single
                    cortical module. Keep in mind that the neuron populations involved in these associations, illustrated here as small
                    sets, are, in the brain, extremely large sets (tens of thousands of neurons in every case). See text for explanation of
                    the figure.

                    actions and then constructing an action hierarchy). At the beginning of each module’s development,
                    the first item on the agenda is the development of the module’s symbols (discussed in
                    Section 3.A.3). As this lexicon development process begins to produce stable symbols, the problem
                    of associating these with actions is addressed.
                       At first, action command neurons are randomly triggered while a single symbol is
                    being expressed by the lexicon (i.e., that symbol was the lone outcome of a confabulation operation
                    by the module). As this occurs, the basal ganglia (BG) monitor the activity of this lexicon (via efferents from
                    Layers III and V; see Figure 3.A.8). When a randomly activated action command happens to
                    cause an action that the BG judge to be particularly ‘‘good’’ (meaning that a reduction in a
                    drive or goal level was observed, which the BG know about because of their massive
                    input from the limbic system), that action is then associated with the currently expressed symbol via
                    the mechanism of Figure 3.A.8.
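
The trial-and-error loop described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the cortical mechanism itself: the function `learn_associations`, the `is_good` callback (a stand-in for the basal ganglia's judgment), and the example symbols and actions are all hypothetical names introduced here.

```python
import random

def learn_associations(symbols, actions, is_good, trials=2000, seed=0):
    """Pair expressed symbols with randomly triggered actions; keep the good pairs.

    `is_good(symbol, action)` is a hypothetical stand-in for the basal
    ganglia's judgment that the action reduced a drive or goal level.
    """
    rng = random.Random(seed)
    assoc = {}  # symbol -> set of action commands associated with it
    for _ in range(trials):
        symbol = rng.choice(symbols)  # the lone symbol currently expressed
        action = rng.choice(actions)  # a randomly triggered action command
        if is_good(symbol, action):   # only judged-good actions are associated
            assoc.setdefault(symbol, set()).add(action)
    return assoc

# Hypothetical toy world: exactly one useful action per symbol.
good_pairs = {("hungry", "eat"), ("thirsty", "drink")}
assoc = learn_associations(
    ["hungry", "thirsty"],
    ["eat", "drink", "wave"],
    lambda s, a: (s, a) in good_pairs,
)
```

Actions that are never judged good (here, `"wave"`) are simply never associated, which mirrors the gating role the text assigns to the basal ganglia.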
                       (Note: Reductions in drive and goal states almost never follow an action immediately. They
                    are usually delayed by seconds or minutes; sometimes by hours. One of the hypothesized functions
                    of the BG [Miyamoto et al., 2004] is to develop a large number of predictive models, called
                    critics [Barto et al., 1983], that learn [via delayed reinforcement learning methods; Sutton and
                    Barto, 1998] to accurately predict the eventual goal-or-drive-state-reduction ‘‘value’’ or ‘‘worth’’ of
                    an action at the time the action is suggested or executed. It is by using such critic models that the BG
                    are hypothesized by the theory to immediately assess the worth of action commands produced by
                    Layer V outputs.)
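
To make the idea of a critic concrete, the following minimal sketch uses tabular TD(0), one of the delayed reinforcement learning methods covered by Sutton and Barto. The text does not say which method the theory assumes, so the choice of TD(0), the function `train_critic`, the state names, and the reward values are all assumptions made for illustration.

```python
def train_critic(episodes, alpha=0.1, gamma=1.0, sweeps=500):
    """Tabular TD(0): learn value[state] from episodes of (state, reward) steps."""
    value = {}
    for _ in range(sweeps):
        for episode in episodes:
            for i, (state, reward) in enumerate(episode):
                # Value of the successor state; 0.0 once the episode has ended.
                if i + 1 < len(episode):
                    next_value = value.get(episode[i + 1][0], 0.0)
                else:
                    next_value = 0.0
                v = value.get(state, 0.0)
                # Move the estimate toward reward plus discounted successor value.
                value[state] = v + alpha * (reward + gamma * next_value - v)
    return value

# Hypothetical episode: the drive reduction (reward 1.0) arrives two steps
# after the action command is issued.
episode = [("action_issued", 0.0), ("waiting", 0.0), ("drive_reduced", 1.0)]
critic = train_critic([episode])
```

After training, `critic["action_issued"]` is close to 1.0 even though no reward arrives at that step, so the eventual worth of the action is available at the moment it is issued.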
                       When a randomly launched action command is indeed judged worthy of association
                    with the currently expressed symbol of a module, a special signal (the green arrow in Figure 3.A.8)
                    is sent (via the thalamus) from the striatum of the BG to cortical Layer I of the module. This green
                    signal causes the synapses (blue circles) connecting axon collaterals of the neurons representing the