Page 142 - Rapid Learning in Robotics

128 “Mixture-of-Expertise” or “Investment Learning”

weights / parameters ω_j determined (see Fig. 9.2, arrows (1)). It serves together
with the context information c as a high-dimensional training data
vector for the META-BOX (2). During the investment learning phase the
META-BOX mapping is constructed, which can be viewed as the stage for
the collection of expertise in the suitably chosen prototypical contexts.
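
The investment learning phase described above can be sketched in a few lines. This is only an illustration under strong assumptions: the function `true_system`, the linear T-BOX, and the affine least-squares META-BOX are hypothetical stand-ins chosen for brevity and are not the learning modules used in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_system(x, c):
    # Hypothetical context-dependent system: slope and offset vary with c.
    return (1.0 + 2.0 * c) * x + (0.5 - c)

def train_t_box(c, n=50):
    # (1) Fit the T-BOX weights omega = (slope, offset) for one context
    # by ordinary least squares on samples taken in that context.
    x = rng.uniform(-1.0, 1.0, n)
    y = true_system(x, c)
    A = np.column_stack([x, np.ones(n)])
    omega, *_ = np.linalg.lstsq(A, y, rcond=None)
    return omega

# Train the T-BOX in a few prototypical contexts c_j.
contexts = np.array([0.0, 0.5, 1.0])
omegas = np.array([train_t_box(c) for c in contexts])  # shape (3, 2)

# (2) Construct the META-BOX mapping c -> omega from the pairs (c_j, omega_j).
# Assumption: an affine map fitted by least squares is sufficient here.
C = np.column_stack([contexts, np.ones_like(contexts)])
meta_weights, *_ = np.linalg.lstsq(C, omegas, rcond=None)  # shape (2, 2)

def meta_box(c):
    # Predict the T-BOX weights for an arbitrary context c.
    return np.array([c, 1.0]) @ meta_weights
```

The costly part is the repeated T-BOX training in each prototypical context; the META-BOX itself is fitted once from the collected (c_j, ω_j) pairs.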

9.2.2 One-shot Adaptation Phase

Figure 9.3: The One-shot Adaptation Phase. [Diagram: the new context c enters the META-BOX (3), which supplies the parameters or weights ω to the T-BOX (3); the T-BOX then maps between X1 and X2 (4).]

After the META-BOX has been trained, the task of adapting the “skill” to
a new system context is tremendously accelerated. Instead of any
time-consuming re-learning of the mapping T, this adjustment now takes the
form of an immediate META-BOX → T-BOX mapping or “one-shot adaptation”.
As illustrated in Fig. 9.3, the META-BOX maps a new (unknown)
context observation c_new (3) into the parameter / weight set ω_new for the
T-BOX. Equipped with ω_new, the T-BOX provides the desired mapping
T_new (4).
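
A minimal sketch of the one-shot adaptation step, assuming the META-BOX has already been trained. The matrix `M`, the affine form of `meta_box`, and the linear `t_box` are hypothetical choices made for illustration; the text does not prescribe them.

```python
import numpy as np

# Assumed outcome of the investment learning phase: an affine META-BOX
# omega = c * M[0] + M[1]. The numbers are illustrative only.
M = np.array([[2.0, -1.0],
              [1.0,  0.5]])

def meta_box(c_new):
    # (3) Map the new context observation c_new to the weight set omega_new.
    return np.array([c_new, 1.0]) @ M

def t_box(x, omega):
    # (4) Parameterized mapping T; here simply a line with slope and offset.
    slope, offset = omega
    return slope * x + offset

c_new = 0.25
omega_new = meta_box(c_new)                      # no re-learning of T needed
y = t_box(np.array([0.0, 1.0]), omega_new)       # T_new applied to two inputs
```

Adaptation here is a single forward evaluation of `meta_box`; no training data for the T-BOX has to be gathered in the new context.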

9.2.3 “Mixture-of-Expertise” Architecture

It is interesting to compare this approach with a feed-forward architecture
which Jordan and Jacobs (1994) coined “mixture-of-experts”. As
illustrated in Fig. 9.4, a number of “experts” receive the same input task
variables together with the context information c. In parallel, each
expert produces an output and contributes – with an individual weight – to
the overall system result. All these weights are determined by the “gating
network”, based on the context information c (see also LLM discussion in
Sec. 3.8).
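
For comparison, the forward pass of such a mixture-of-experts can be sketched as follows. The linear experts, the softmax gating, and all names (`expert_W`, `gate_V`) are assumptions made for illustration, not details taken from the text or from Fig. 9.4.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax; the outputs are positive and sum to one.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def mixture_of_experts(x, c, expert_W, gate_V):
    # Every expert receives the same input: task variables x plus context c.
    inp = np.concatenate([x, c])
    expert_outputs = expert_W @ inp          # shape (n_experts, n_out)
    # The gating network sees only the context c and weights the experts.
    gate = softmax(gate_V @ c)               # shape (n_experts,)
    return gate @ expert_outputs, gate       # weighted sum over the experts

# Two hypothetical linear experts with one output each.
expert_W = np.array([[[1.0, 0.0]],
                     [[3.0, 0.0]]])          # shape (2, 1, 2)
gate_V = np.zeros((2, 1))                    # untrained gate: equal weights
y, gate = mixture_of_experts(np.array([2.0]), np.array([0.0]), expert_W, gate_V)
```

In contrast to the META-BOX scheme, all experts are evaluated for every input, and the context only decides how their outputs are blended.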
