Page 56 - Rapid Learning in Robotics

42                                                           Artificial Neural Networks


The next step is to smooth the LLM outputs of several neurons instead of considering one single neuron. This can be achieved by replacing the "winner-takes-all" rule (Eq. 3.9) with a "winner-takes-most" or "soft-max" mechanism, for example by employing Eq. 3.6 in the index space of lattice coordinates A. Here the distance to the best-match a in the neuron index space determines the contribution of each neuron. A relative width parameter controls how strongly the distribution is smeared out, similarly to the neighborhood function h, but with a separate bell size.
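The blending scheme above can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the array names (`w_in`, `w_out`, `A`, `lattice`) and the Gaussian bell used for the soft-max weights are assumptions standing in for the reference vectors, local linear expansions, and neighborhood bell of the text.

```python
import numpy as np

# Hypothetical sketch of a "winner-takes-most" blend of local linear maps.
# Each neuron r carries a reference vector w_in[r] in input space, an output
# offset w_out[r], and a local Jacobian A[r]; lattice[r] is its coordinate
# in the neuron index space.

def llm_softmax_output(x, w_in, w_out, A, lattice, sigma):
    """Blend the linear expansions of all neurons, weighting each by a
    bell function of its lattice distance to the best-match neuron."""
    # best-match neuron: closest reference vector in input space
    d_in = np.linalg.norm(w_in - x, axis=1)
    best = np.argmin(d_in)
    # distances to the best match, measured in lattice coordinates
    d_lat = np.linalg.norm(lattice - lattice[best], axis=1)
    g = np.exp(-(d_lat / sigma) ** 2)   # bell-shaped contributions
    g /= g.sum()                        # normalized soft-max weights
    # each neuron contributes its local linear expansion around w_in[r]
    return sum(g[r] * (w_out[r] + A[r] @ (x - w_in[r]))
               for r in range(len(g)))
```

As the width `sigma` shrinks, the weights concentrate on the best match and the scheme degenerates back to the winner-takes-all rule; a larger `sigma` smears the output over neighboring experts.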
This form of local linear map has proved very successful in many applications, e.g. the kinematic mapping for an industrial robot (Ritter, Martinetz, and Schulten 1989; Walter and Schulten 1993). In time-series prediction it was introduced in conjunction with the SOM (Walter, Ritter, and Schulten 1990) and later with the Neural-Gas network (Walter 1991; Martinetz et al. 1993). Wan (1993) won the Santa Fe time-series contest (series X part) with a network built of finite impulse response ("FIR") elements, which have strong similarities to LLMs.
Considering each local mapping as an "expert" for a particular task sub-domain, the LLM-extended SOM can be regarded as a precursor to the architectural idea of the "mixture-of-experts" networks (Jordan and Jacobs 1994). In this architecture, the competitive SOM network performs the gating of the local experts operating in parallel. We will return to the mixture-of-experts architecture in Chap. 9.