The next step is to smooth the LLM-outputs of several neurons, instead of considering one single neuron. This can be achieved by replacing the “winner-takes-all” rule (Eq. 3.9) with a “winner-takes-most” or “soft-max” mechanism, for example by employing Eq. 3.6 in the index space of lattice coordinates A. Here the distance to the best-match a in the neuron index space determines the contribution of each neuron. The relative width controls how strongly the distribution is smeared out, similar to the neighborhood function h, but using a separate bell size.
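The following sketch illustrates this “winner-takes-most” blending of local linear maps. The lattice size, the dimensions, and all names (coords, W, Y, A, sigma_out, llm_soft_output) are illustrative assumptions, not taken from the text; only the weighting scheme — a normalized Gaussian over lattice distances to the best-match neuron, with its own bell size — follows the mechanism described above.

```python
import numpy as np

# Hypothetical setup: a 2-D SOM lattice of local linear maps (LLMs)
# mapping inputs x in R^d_in to outputs y in R^d_out.
grid_h, grid_w = 5, 5            # lattice size (assumed)
d_in, d_out = 3, 2               # input / output dimensions (assumed)

rng = np.random.default_rng(0)
coords = np.array([(i, j) for i in range(grid_h) for j in range(grid_w)])  # lattice coordinates a
W = rng.normal(size=(grid_h * grid_w, d_in))           # reference vectors w_a in input space
Y = rng.normal(size=(grid_h * grid_w, d_out))          # output vectors y_a
A = rng.normal(size=(grid_h * grid_w, d_out, d_in))    # local Jacobians A_a

def llm_soft_output(x, sigma_out=1.0):
    """Winner-takes-most blending of LLM outputs.

    Each neuron a contributes its local linear prediction
    y_a + A_a (x - w_a), weighted by a normalized Gaussian of its
    lattice distance to the best-match neuron a*.
    """
    # best-match neuron a*: closest reference vector in input space
    a_star = np.argmin(np.linalg.norm(W - x, axis=1))

    # soft-max weights over lattice distances ||a - a*||, with separate bell size sigma_out
    lat_d2 = np.sum((coords - coords[a_star]) ** 2, axis=1)
    g = np.exp(-lat_d2 / (2.0 * sigma_out ** 2))
    g /= g.sum()

    # blend the local linear predictions of all neurons
    local = Y + np.einsum('aij,aj->ai', A, x - W)       # y_a + A_a (x - w_a)
    return g @ local

y = llm_soft_output(rng.normal(size=d_in), sigma_out=1.5)
print(y.shape)   # (2,)
```

With sigma_out shrunk toward zero the weights concentrate on a*, recovering the original “winner-takes-all” output of a single neuron.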
This form of local linear map proved to be very successful in many applications, e.g. the kinematic mapping for an industrial robot (Ritter, Martinetz, and Schulten 1989; Walter and Schulten 1993). In time-series prediction it was introduced in conjunction with the SOM (Walter, Ritter, and Schulten 1990) and later with the Neural-Gas network (Walter 1991; Martinetz et al. 1993). Wan (1993) won the Santa Fe time-series contest (series X part) with a network built of finite impulse response (“FIR”) elements, which have strong similarities to LLMs.
Considering each local mapping as an “expert” for a particular task sub-domain, the LLM-extended SOM can be regarded as a precursor of the architectural idea of “mixture-of-experts” networks (Jordan and Jacobs 1994). In this view, the competitive SOM network performs the gating of the local experts operating in parallel. We will return to the mixture-of-experts architecture in Chap. 9.