Page 202 - Mechatronics for Safety, Security and Dependability in a New Era
Ch39-I044963.fm Page 186 Tuesday, August 1, 2006 3:15 PM
RELATED WORKS
While simple mobile robot behaviors can be learned with feed-forward neural networks, combinations
of behaviors, where sometimes identical sensory inputs should trigger different actions, require
additional coordinating mechanisms. For example, in Calabretta, Nolfi, Parisi, & Wagner (1998) a
Khepera robot is trained to perform a garbage collecting task and the authors find a correspondence
between specific behaviors and the evolved neural network modules. The interaction among these
modules is controlled by selector neurons that give one module precedence over the others.
In contrast to the above work, where the modules are physically separate entities, Ziemke (2000)
interprets the trained Recurrent Neural Network (RNN) as a diachronically structured controller. In
this case, instead of modules existing separately at the same time, a monolithic neural network
instantiates different input-output mappings at various time points. An important aspect of the
mechanism by which RNNs achieve modularity is discussed in Cohen, Dunbar, & McClelland (1990),
where the switching between two input-output mappings is achieved by attentional control (attention is
viewed as "an additional source of input that provides contextual support for the processing of signals
within a selected pathway" (p. 335)). In an RNN, the source that provides contextual support favoring
one of the competing input-output mappings is the context layer. The state maintained in the context
layer disambiguates the inputs and thus different outputs can be obtained for similar inputs.
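The disambiguating role of the context layer can be illustrated with a minimal sketch. It is not the network used in any of the cited works: the threshold units, the hand-set weights, and the names `step`, `update_context`, and `controller` are assumptions made purely for illustration. A single context unit latches on after a grip event, and the same sensor reading then produces a different action.

```python
def step(x):
    """Threshold unit: fires (1) when its net input is positive."""
    return 1 if x > 0 else 0


def update_context(context, grip_event):
    """Hypothetical context unit: latches on once a grip event is seen."""
    return step(context + grip_event - 0.5)


def controller(sensor, context):
    """Map one sensor reading to (approach, avoid), biased by the context.

    The weights are hand-set for illustration, not learned.
    """
    approach = step(1.0 * sensor - 1.0 * context - 0.5)
    avoid = step(1.0 * sensor + 1.0 * context - 1.5)
    return approach, avoid


context = 0
before = controller(1, context)       # object sensed, nothing held yet
context = update_context(context, 1)  # the robot grips an object
after = controller(1, context)        # identical sensor reading as before
print(before, after)                  # prints: (1, 0) (0, 1)
```

The input to `controller` is identical in both calls; only the maintained context differs, which is enough to select the other input-output mapping.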
Since, in an RNN, the internal state plays a central role in switching between the alternative input-output
mappings, the flexibility of updating and maintaining this internal state directly affects the flexibility
of the resulting robot behaviors implemented by the network. The potential of the computational
model of working memory based on the prefrontal cortex (PFC) and basal ganglia (the PBWM model), proposed in O'Reilly
& Frank (2004), to provide such flexibility motivated us to investigate its application to learning
combinations of robot behaviors.
APPROACH
In the presented approach, the PBWM model is used to implement several possible input-output
mappings and then to learn specific combinations. Also, a model of the environment is added to
provide model-generated experience. We are interested in two consequences of using an
environment model: lowering the costs associated with actually performing the actions and extending
the neural network model to a planning system supporting grounded representations.
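The idea of model-generated experience can be sketched as follows. In the presented approach the environment model is part of the neural network itself; the lookup table below is only a stand-in assumed for illustration, and the class `TableEnvironmentModel`, its methods, and the state and action labels are hypothetical. Transitions observed during real interaction are stored, and later replayed to simulate the outcome of an action sequence without executing it on the robot.

```python
class TableEnvironmentModel:
    """Hypothetical forward model: (state, action) -> predicted next state."""

    def __init__(self):
        self.transitions = {}

    def record(self, state, action, next_state):
        """Store one transition observed during real interaction."""
        self.transitions[(state, action)] = next_state

    def predict(self, state, action):
        """Return the predicted next state, or None if never observed."""
        return self.transitions.get((state, action))

    def rollout(self, state, actions):
        """Simulate an action sequence; stop at the first unknown transition."""
        trajectory = [state]
        for action in actions:
            next_state = self.predict(trajectory[-1], action)
            if next_state is None:
                break
            trajectory.append(next_state)
        return trajectory


model = TableEnvironmentModel()
model.record("start", "forward", "corridor")
model.record("corridor", "left", "goal")
plan = model.rollout("start", ["forward", "left"])
print(plan)  # prints: ['start', 'corridor', 'goal']
```

Simulated rollouts of this kind cost nothing in robot time, and evaluating candidate action sequences against the model is one simple way to turn a reactive controller into a planner.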
Working Memory Model
Here we present an outline of the PBWM model (refer to O'Reilly & Frank (2004) for details). The
model implementation is based on the Leabra framework (O'Reilly & Munakata, 2000). It uses a point-neuron
activation function to model the neurons, k-Winners-Take-All inhibition to model
competition among the neurons in a layer, and a combination of Hebbian and error-driven learning.
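The effect of k-Winners-Take-All inhibition can be sketched in a few lines. This is a simplified illustration and not the Leabra conductance computation: the function `kwta` is hypothetical, and it merely places a shared inhibition threshold midway between the k-th and (k+1)-th strongest net inputs, so that roughly k units remain active.

```python
def kwta(net_inputs, k):
    """Simplified k-Winners-Take-All over one layer (illustrative only).

    A single inhibition threshold is set halfway between the k-th and the
    (k+1)-th largest net inputs; each unit's activation is its net input
    minus that threshold, clipped at zero. With ties at the k-th rank,
    fewer than k units stay active.
    """
    if not 0 < k < len(net_inputs):
        raise ValueError("k must satisfy 0 < k < number of units")
    ranked = sorted(net_inputs, reverse=True)
    threshold = (ranked[k - 1] + ranked[k]) / 2.0
    return [max(0.0, x - threshold) for x in net_inputs]


activations = kwta([0.2, 0.9, 0.5, 0.7], k=2)
active_units = [i for i, a in enumerate(activations) if a > 0]
print(active_units)  # prints: [1, 3]
```

Because the threshold is shared by all units in the layer, the units compete: raising one unit's net input can push the threshold up and silence a weaker neighbor.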
The neural network structure (Figure 1c) consists of two groups of layers. The first group includes
the Input, Hidden, Output, NextInput, and PFC layers. The NextInput layer is used for the
environment model and will be explained later. The Input, Hidden, and Output layers form a
standard three-layer neural network structure. The PFC layer is an improved context layer, which is
bi-directionally connected with the Hidden layer, and influences the input-output pathways. The
PFC layer is divided into stripes to allow independent control over the updating and maintenance of
parts of the activation state. The rest of the layers form the second group, which implements a gating
mechanism for control over the updating and maintenance of the PFC activation state. Generally, a
positive reward stabilizes the current PFC activation state, while a negative reward causes the
state (or a part of it) to be updated and another state to be established.
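The stripe-wise update and maintenance described above can be sketched as follows. In the PBWM model the gating signals are learned by the second group of layers; here they are hard-coded from the sign of the reward, which is a deliberate simplification. The class `StripedPFC`, the function `gates_from_reward`, and the stripe contents are hypothetical names introduced only for illustration.

```python
class StripedPFC:
    """Hypothetical PFC layer whose stripes are gated independently."""

    def __init__(self, n_stripes):
        self.stripes = [None] * n_stripes

    def step(self, candidate, gates):
        """Update gated stripes from `candidate`; maintain all the others."""
        for i, gate_open in enumerate(gates):
            if gate_open:
                self.stripes[i] = candidate[i]
        return list(self.stripes)


def gates_from_reward(reward, n_stripes, stripe_to_update):
    """Hard-coded stand-in for the learned gating mechanism.

    Positive reward: keep every stripe (stabilize the current state).
    Negative reward: open the gate of one chosen stripe (update a part).
    """
    if reward > 0:
        return [False] * n_stripes
    return [i == stripe_to_update for i in range(n_stripes)]


pfc = StripedPFC(2)
pfc.step(["A", "B"], [True, True])                      # load an initial state
kept = pfc.step(["C", "D"], gates_from_reward(1.0, 2, 0))
changed = pfc.step(["C", "D"], gates_from_reward(-1.0, 2, 1))
print(kept, changed)  # prints: ['A', 'B'] ['A', 'D']
```

Dividing the state into stripes is what makes the partial update possible: after the negative reward only the second stripe changes, while the first keeps the content that was maintained earlier.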