Page 159 - Biomimetics : Biologically Inspired Technologies
Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c004 Final Proof page 145 21.9.2005 9:37am
Evolutionary Robotics and Open-Ended Design Automation 145
simulator has not yet been constructed. It is unlikely that one could be constructed, given the
chaotic nature of machine dynamics and their sensitivity to initial conditions and many small
parameter variations. Even if such simulators existed, creating accurate models would be painstakingly
difficult, and might be impossible if the target environment is not perfectly known.
An alternative approach to ‘‘crossing the reality gap’’ is to use a crude simulator that captures the
salient features of the search space. Techniques have been developed for creating such simulators
and for using noise to mask the uncertain aspects of the model, so that the evolved controllers cannot
exploit them (Jakobi, 1997). Yet another approach is to use plasticity in the controller: allow the robot to
learn and adapt in reality. In nature, animals are born with mostly predetermined bodies and brains,
but these have some ability to learn and make final adaptations to whatever actual conditions may
arise.
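The noise-based idea can be sketched in a few lines. The example below is only a toy under stated assumptions, not Jakobi's method: the point-mass simulator, the `friction_range` bounds, and the names `simulate` and `noisy_fitness` are all hypothetical. It shows one way a crude simulator might resample an uncertain parameter on every trial, so that a controller scores well only if it works across the whole range of plausible values.

```python
import random

def simulate(gain, friction, steps=50, dt=0.1):
    """Crude 1-D simulator (hypothetical): drive a unit point mass toward x = 1.0.
    Returns the final absolute position error (lower is better)."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        force = gain * (1.0 - pos) - friction * vel
        vel += force * dt
        pos += vel * dt
    return abs(1.0 - pos)

def noisy_fitness(gain, rng, trials=10, friction_range=(0.5, 2.0)):
    """Average error over several trials, each with a freshly sampled friction.
    The friction range is an assumed stand-in for an uncertain parameter."""
    errors = []
    for _ in range(trials):
        friction = rng.uniform(*friction_range)
        errors.append(simulate(gain, friction))
    return sum(errors) / len(errors)

# Usage: rank a few candidate controller gains under the noisy evaluation.
rng = random.Random(0)
candidates = [0.5, 1.0, 2.0, 4.0]
best_gain = min(candidates, key=lambda g: noisy_fitness(g, rng))
```

Averaging over resampled parameters is one design choice; scoring each controller by its worst trial would be a stricter alternative.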
A third approach is to coevolve simulators so that they are increasingly predictive. Just as we use
evolution to design a controller, we can use evolution to design the simulator so that it captures the
important properties of the target environment. Assume we have a rough simulator of the target
morphology, and we use it to evolve controllers in simulation. We then take the best controller and
try it once on the target system. If successful, we are done; but if the controller does not
produce the anticipated result (as is likely, since the initial simulator was crude), then we
have observed some unexpected sensory data. We then evolve a new set of simulators, whose fitness is
their ability to reproduce the actual observed behavior when the original controller is tested on
them. Simulators that correctly reproduce the observed data are more likely to be predictive in the
future. We then take the best simulator, and use it to evolve a new controller, and the cycle repeats.
If the controller works in reality, we are done. If it does not work as expected, we now have more
data to evolve better simulators, and so forth. The coevolution of controllers and simulators is not
necessarily computationally efficient, but it dramatically reduces the number of trials necessary on
the target system.
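The cycle described above can be sketched as a short loop. This is a minimal illustration under assumptions, not the authors' implementation: the "target" is a toy system with two hidden parameters (`HIDDEN`), the controller is a single command value, and a simple mutate-and-keep-if-better search (`hill_climb`) stands in for a full evolutionary algorithm. All names and constants are hypothetical.

```python
import random

def hill_climb(start, loss, rng, iters=3000):
    """Minimal evolutionary search: mutate, keep the child only if it is better."""
    best, best_loss = list(start), loss(start)
    for _ in range(iters):
        sigma = 10 ** rng.uniform(-3, 0)  # random mutation scale
        child = [x + rng.gauss(0.0, sigma) for x in best]
        child_loss = loss(child)
        if child_loss < best_loss:
            best, best_loss = child, child_loss
    return best

def respond(params, u):
    """Model form shared by simulator and target: response to command u."""
    a, b = params
    return a * u + b * u * u

HIDDEN = (2.0, 0.5)  # hypothetical "real" parameters, unknown to the algorithm

def target_trial(u):
    """Stands in for one expensive trial on the physical system."""
    return respond(HIDDEN, u)

def coevolve(goal=3.0, tol=0.05, max_trials=10, seed=0):
    rng = random.Random(seed)
    sim = [1.0, 0.0]   # crude initial simulator
    command = [0.0]    # controller genome (a single command value)
    data = []          # (command, observed response) pairs from the target
    for _ in range(max_trials):
        # Evolve a controller against the current best simulator.
        command = hill_climb(
            command, lambda c: abs(respond(sim, c[0]) - goal), rng)
        # Try it once on the target system and record what happened.
        observed = target_trial(command[0])
        data.append((command[0], observed))
        if abs(observed - goal) <= tol:
            break  # the controller works in reality
        # Evolve a simulator that reproduces every observation so far.
        sim = hill_climb(
            sim, lambda p: sum((respond(p, u) - y) ** 2 for u, y in data), rng)
    return command[0], sim, data

command, sim, data = coevolve()
print(f"target trials used: {len(data)}")
```

Each pass spends thousands of cheap simulated evaluations but only one trial on the target, which is the trade-off the text describes.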
The coevolutionary process consists of two phases. The first evolves the controller (or whatever we are
trying to modify on the target system); we call this the exploration phase. The second tries
to create a simulator, or model, of the system; we call this the estimation phase. To illustrate the
estimation–exploration process, consider a target robot with some unknown, but critical, morphological
parameters, such as mass distribution and sensory lag times. Fifty independent runs of the
algorithm were conducted against the target robot. Figure 4.8a shows the 50 series of 20 best
simulator modifications output after each pass through the estimation phase. Figure 4.8a makes
clear that for all 50 runs, the algorithm was better able to infer the time lags of the eight sensors than
the mass increases of the nine body parts. This is not surprising in that the sensors themselves
provide feedback about the robot. In other words, the algorithm automatically, and after only a few
target trials, deduces the correct time lags of the target robot’s sensors, but is less successful at
indirectly inferring the masses of the body parts using the sensor data. Convergence towards the
correct mass distribution can also be observed, but even with an approximate description of
the robot’s mass distribution, the simulator is improved enough to allow smooth transfer of
controllers from simulation to the target robot. By contrast, when the default, approximate simulator
is used, transfer fails completely: the target robot simply moves randomly and achieves no
appreciable forward locomotion. It is interesting to note that the evolved simulators are not perfect;
they capture well only those aspects of the world that are important for accomplishing the task.
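To see why sensor time lags are comparatively easy to infer, consider a toy version of the simulator fitness. The sketch below is an assumption-laden illustration, not the experiment reported here: a single sensor reports a known sinusoidal signal a fixed number of steps late, each candidate simulator is just a proposed lag, and an exhaustive search over ten candidates stands in for evolution. The names `sensed` and `simulator_error` are hypothetical.

```python
import math

def true_signal(t):
    """Underlying quantity the sensor measures (assumed known for this toy)."""
    return math.sin(0.3 * t)

def sensed(lag, steps=60):
    """Sensor trace that reports the signal `lag` time steps late."""
    return [true_signal(t - lag) for t in range(steps)]

def simulator_error(candidate_lag, observed):
    """Mean squared mismatch between a candidate simulator's predicted
    sensor trace and the trace observed on the target (lower is better)."""
    predicted = sensed(candidate_lag, len(observed))
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

# One trial on the "target", whose real lag (4 steps) is hidden from the search.
observed = sensed(lag=4)

# Pick the candidate simulator that best reproduces the observed data.
best_lag = min(range(10), key=lambda lag: simulator_error(lag, observed))
print(best_lag)  # 4
```

Because the lag appears directly in the recorded trace, a single trial pins it down here. A parameter such as body-part mass affects the sensors only through the robot's dynamics, so it would be constrained far less directly.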
The estimation–exploration approach can be used for much more than transferring controllers to
robots: it could be used by the robot itself to estimate its own structure. This would be particularly
useful if the robot suffers damage that changes its morphology in unexpected
ways, or if some aspect of its environment changes. As each controller action is taken, the actual
sensory data is compared to that predicted by the simulator, and new internal simulators are evolved
to be more predictive. These new simulators are then used to try out new, adapted controllers for the
new and unexpected circumstances. Figure 4.8b shows some results of applying this process to design
controllers for a robot that undergoes various types of drastic morphological damage, like losing