Page 159 - Biomimetics : Biologically Inspired Technologies

Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c004 Final Proof page 145 21.9.2005 9:37am




                    Evolutionary Robotics and Open-Ended Design Automation                      145

                    simulator has not yet been constructed. It is unlikely that one could be constructed, given the
                    chaotic nature of machine dynamics and their sensitivity to initial conditions and many small
                    parameter variations. Even if such simulators existed, creating accurate models would be pains-
                    takingly difficult, or may be impossible if the target environment is not perfectly known.
                      An alternative approach to “crossing the reality gap” is to use a crude simulator that captures the
                    salient features of the search space. Techniques have been developed for creating such simulators
                    and using noise to cover uncertainties so that the evolved controllers do not exploit these uncer-
                    tainties (Jakobi, 1997). Yet another approach is to use plasticity in the controller: allow the robot to
                    learn and adapt in reality. In nature, animals are born with mostly predetermined bodies and brains,
                    but these have some ability to learn and make final adaptations to whatever actual conditions may
                    arise.
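The noise idea can be illustrated with a minimal sketch. It is not Jakobi's actual procedure: the one-dimensional "robot", the friction parameter, and the names `simulate_distance` and `noisy_fitness` are all assumptions made for illustration. The uncertain parameter is resampled on every evaluation and the fitness is averaged, so a controller that only works for one particular friction value scores poorly.

```python
import random

def simulate_distance(gain, friction, steps=50):
    """Toy crude simulator (hypothetical): a 1-D body driven toward x = 1."""
    position, velocity = 0.0, 0.0
    for _ in range(steps):
        force = gain * (1.0 - position)            # proportional controller
        velocity = (1.0 - friction) * velocity + 0.1 * force
        position += 0.1 * velocity
    return position

def noisy_fitness(gain, rng, trials=20, nominal_friction=0.3, spread=0.2):
    """Average fitness over trials, resampling the uncertain friction each time."""
    total = 0.0
    for _ in range(trials):
        # Friction is poorly known, so draw it afresh and clamp it to a sane range.
        friction = min(0.95, max(0.05, rng.gauss(nominal_friction, spread)))
        final_position = simulate_distance(gain, friction)
        total += -abs(1.0 - final_position)        # closer to the goal is better
    return total / trials

if __name__ == "__main__":
    rng = random.Random(0)
    for gain in (0.5, 1.0, 2.0):
        print(gain, noisy_fitness(gain, rng))
```

An evolutionary search would call `noisy_fitness` in place of a single deterministic simulation run, which is what discourages controllers from exploiting any one guess about the unknown quantity.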
                      A third approach is to coevolve simulators so that they are increasingly predictive. Just as we use
                    evolution to design a controller, we can use evolution to design the simulator so that it captures the
                    important properties of the target environment. Assume we have a rough simulator of the target
                    morphology, and we use it to evolve controllers in simulation. We then take the best controller and
                    try it, just once, on the target system. If it succeeds, we are done; but if the controller does not
                    produce the anticipated result (as is likely, since the initial simulator was crude), then we have
                    observed some unexpected sensory data. We then evolve a new set of simulators, whose fitness is
                    their ability to reproduce the actual observed behavior when the original controller is tested on
                    them. Simulators that correctly reproduce the observed data are more likely to be predictive in the
                    future. We then take the best simulator, and use it to evolve a new controller, and the cycle repeats.
                    If the controller works in reality, we are done. If it does not work as expected, we now have more
                    data to evolve better simulators, and so forth. The coevolution of controllers and simulators is not
                    necessarily computationally efficient, but it dramatically reduces the number of trials necessary on
                    the target system.
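The cycle described above can be sketched on a toy problem. Everything here is an assumption made for illustration: the "target system" is a hidden linear map from one control value to one sensor reading, the simulator is a candidate pair of coefficients, and evolution is replaced by a simple (1+1) hill-climber with Gaussian mutation. The point is only the loop structure, in which each expensive target trial yields data that is then reused to refit the simulator.

```python
import random

TRUE_PARAMS = (2.0, 0.5)   # hidden from the algorithm; used only by target_trial

def target_trial(u):
    """One costly trial on the (toy) target system."""
    return TRUE_PARAMS[0] * u + TRUE_PARAMS[1]

def simulate(params, u):
    """Candidate simulator: predicts the sensor reading for control value u."""
    return params[0] * u + params[1]

def evolve(fitness, start, rng, generations=400):
    """(1+1) hill-climber standing in for a full evolutionary algorithm."""
    best, best_fit = list(start), fitness(start)
    for _ in range(generations):
        sigma = 10 ** rng.uniform(-3, 0)           # mutate at several scales
        child = [g + rng.gauss(0.0, sigma) for g in best]
        child_fit = fitness(child)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best

def coevolve(goal=3.0, tolerance=0.05, max_trials=20, seed=0):
    rng = random.Random(seed)
    sim = [1.0, 0.0]            # rough initial simulator
    controller = [0.0]
    observations = []           # (control value, observed reading) pairs
    for trial in range(1, max_trials + 1):
        # Evolve a controller against the current best simulator.
        controller = evolve(
            lambda c: -abs(simulate(sim, c[0]) - goal), controller, rng)
        # Try it once on the target system.
        reading = target_trial(controller[0])
        if abs(reading - goal) <= tolerance:
            return controller[0], sim, trial
        # Otherwise keep the surprising data and refit the simulator to it.
        observations.append((controller[0], reading))
        sim = evolve(
            lambda s: -sum((simulate(s, u) - y) ** 2 for u, y in observations),
            sim, rng)
    return controller[0], sim, max_trials

if __name__ == "__main__":
    u, sim, trials = coevolve()
    print("control value:", u, "simulator:", sim, "target trials:", trials)
```

Each call to `evolve` costs hundreds of simulated evaluations, whereas `target_trial` is called at most once per cycle, which mirrors the trade-off noted in the text: heavy computation in exchange for few trials on the real system.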
                      The coevolutionary process consists of two phases. The first, which we call the exploration phase,
                    evolves the controller (or whatever we are trying to modify on the target system). The second, which
                    we call the estimation phase, evolves a simulator, or model, of the system. To illustrate the
                    estimation–exploration process, consider a target robot with some unknown, but critical, morpho-
                    logical parameters, such as mass distribution and sensory lag times. Fifty independent runs of the
                    algorithm were conducted against the target robot. Figure 4.8a shows the 50 series of 20 best
                    simulator modifications output after each pass through the estimation phase. Figure 4.8a makes
                    clear that for all 50 runs, the algorithm was better able to infer the time lags of the eight sensors than
                    the mass increases of the nine body parts. This is not surprising in that the sensors themselves
                    provide feedback about the robot. In other words, the algorithm automatically, and after only a few
                    target trials, deduces the correct time lags of the target robot’s sensors, but is less successful at
                    indirectly inferring the masses of the body parts using the sensor data. Convergence towards the
                    correct mass distribution can also be observed, but even with an approximate description of
                    the robot’s mass distribution, the simulator is improved enough to allow smooth transfer of
                    controllers from simulation to the target robot. With the default, unrefined simulator, by contrast,
                    transfer fails completely: the target robot simply moves randomly and achieves no
                    appreciable forward locomotion. It is interesting to note that the evolved simulators are not perfect;
                    they capture well only those aspects of the world that are important for accomplishing the task.
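One way to see why sensor lags are easier to recover than masses is that a lag shows up directly in the sensor trace. The sketch below is a hypothetical illustration, not the experiment reported in Figure 4.8a: it scores candidate lags by how well a delayed copy of a known signal reproduces the observed trace. An exhaustive scan over lags stands in for the evolutionary search, and all function names are assumptions.

```python
import math

def delayed(signal, lag):
    """Shift the signal right by `lag` steps, holding the first value as padding."""
    return [signal[max(0, t - lag)] for t in range(len(signal))]

def simulator_error(candidate_lag, signal, observed):
    """Squared error between the candidate simulator's trace and the observed one."""
    predicted = delayed(signal, candidate_lag)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def infer_lag(signal, observed, max_lag=10):
    """Pick the lag whose simulated trace best reproduces the observation."""
    return min(range(max_lag + 1),
               key=lambda lag: simulator_error(lag, signal, observed))

# Toy data: the "target robot" reports the signal four steps late.
signal = [math.sin(0.3 * t) for t in range(60)]
observed = delayed(signal, 4)

if __name__ == "__main__":
    print("inferred lag:", infer_lag(signal, observed))
```

A mass change, by comparison, would alter the trace only through the robot's dynamics, so many different mass distributions can produce similar sensor data and the error surface gives a much weaker signal.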
                      The estimation–exploration approach can be used for much more than transferring controllers to
                    robots: the robot itself could use it to estimate its own structure. This would be particularly
                    useful if the robot suffers damage that changes its morphology in unexpected ways, or if some
                    aspect of its environment changes. As each controller action is taken, the actual
                    sensory data is compared to that predicted by the simulator, and new internal simulators are evolved
                    to be more predictive. These new simulators are then used to try out new, adapted controllers for the
                    new and unexpected circumstances. Figure 4.8b shows some results of applying this process to design
                    controllers for a robot which undergoes various types of drastic morphological damage, like losing