Page 144 - Rapid Learning in Robotics
“Mixture-of-Expertise” or “Investment Learning”
The lower part of Fig. 9.4 redraws the proposed hierarchical network
scheme and suggests naming it “mixture-of-expertise”. In contrast to the
specialized “experts” in Jordan's picture, here a single “expert” gathers
specialized “expertise” in a number of prototypical context situations (see
the investment learning phase, Sec. 9.2.1). The META-BOX is responsible for
the non-linear “mixture” of this “expertise”.
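The division of labor can be sketched in a few lines. This is a minimal illustration, not the source's implementation: the names `meta_box` and `t_box` are hypothetical, the T-BOX is reduced to a linear map, and the META-BOX blend is a simple Gaussian weighting of two invested parameter sets.

```python
import numpy as np

def t_box(x, params):
    """The single 'expert' network (T-BOX): here just a linear map,
    parameterized externally by the META-BOX output."""
    W, b = params
    return W @ x + b

def meta_box(c, prototypes):
    """Hypothetical META-BOX: blends prototypical parameter sets as a
    smooth (here Gaussian-weighted) function of the context c."""
    acts = np.array([np.exp(-np.sum((c - cj) ** 2)) for cj, _ in prototypes])
    acts /= acts.sum()
    W = sum(a * Wj for a, (cj, (Wj, bj)) in zip(acts, prototypes))
    b = sum(a * bj for a, (cj, (Wj, bj)) in zip(acts, prototypes))
    return W, b

# Two prototypical contexts with their "invested" parameter sets
prototypes = [
    (np.array([0.0]), (np.eye(2), np.zeros(2))),       # context 0: identity map
    (np.array([1.0]), (2 * np.eye(2), np.ones(2))),    # context 1: scaled + shifted
]

c = np.array([0.5])                  # a new, intermediate context
params = meta_box(c, prototypes)     # META-BOX supplies the parameter set
y = t_box(np.array([1.0, 1.0]), params)
```

For the intermediate context `c = 0.5` the two prototypes contribute equally, so the T-BOX realizes the mapping halfway between them; only the small META-BOX runs again when the sensor observation signals a context change.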
With respect to the networks' requirements for memory and computation,
the “mixture-of-expertise” architecture compares favorably: the “exper-
tise” is gained and implemented in a single “expert” network (T-BOX).
Furthermore, the META-BOX needs to be re-engaged only when the con-
text changes, which is indicated by a deviating sensor observation c.
However, this scheme requires that the learning implementation of
the T-BOX represent the parameter (or weight) set as a continuous func-
tion of the context variables c. Furthermore, “degenerate” solutions must
be avoided: a regular multilayer perceptron (MLP), for example, admits
many weight permutations that achieve the same mapping. Employing
an MLP in the T-BOX would therefore result in grossly inadequate inter-
polation between prototypical “expertises” that are represented by differ-
ently permuted weight sets. Here, a suitable stabilizer would additionally
be required.
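The permutation problem can be made concrete with a toy example (a sketch, not taken from the source): two tiny MLPs whose hidden units are merely swapped compute the identical function, yet naively averaging their weight sets, as a context interpolation would do, yields a different mapping.

```python
import numpy as np

def mlp(x, W1, b1, W2):
    """Tiny 1-2-1 multilayer perceptron with tanh hidden units."""
    return W2 @ np.tanh(W1 * x + b1)

# Two weight sets that differ only by a permutation of the hidden units:
W1a, b1a, W2a = np.array([1.0, -2.0]), np.array([0.5, -0.5]), np.array([3.0, 1.0])
W1b, b1b, W2b = W1a[::-1], b1a[::-1], W2a[::-1]

x = 0.7
same = np.isclose(mlp(x, W1a, b1a, W2a), mlp(x, W1b, b1b, W2b))   # identical mapping

# Averaging ("interpolating") the two permuted weight sets destroys
# the mapping instead of reproducing it:
W1m, b1m, W2m = (W1a + W1b) / 2, (b1a + b1b) / 2, (W2a + W2b) / 2
broken = not np.isclose(mlp(x, W1a, b1a, W2a), mlp(x, W1m, b1m, W2m))
```

Both `same` and `broken` come out `True`: the two permuted networks agree everywhere, but their weight-space average does not, which is exactly why a stabilizer (or a permutation-free representation) is needed before parameter sets may be interpolated.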
Note that the new “mixture-of-expertise” scheme does not merely
identify the context and retrieve a suitable parameter set (association).
Rather, it achieves a high-dimensional generalization of the learned (in-
vested) situations to new, previously unknown contexts.
A “mixture-of-expertise” aggregate can serve as an expert module in
a hierarchical structure with more than two levels. Moreover, the two ar-
chitectures can certainly be combined. This is particularly advantageous
when very complex mappings are smooth in certain domains but discon-
tinuous in others. Then, different types of learning experts, such as PSOMs,
Meta-PSOMs, LLMs, RBF networks, and others, can be chosen. The domain
weighting can be controlled by a competitive scheme, e.g. RBF, LVQ, SOM,
or a “Neural-Gas” network (see Chap. 3).
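Such a competitive domain weighting can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the gating uses a softmax over distances to per-expert reference centers (in the spirit of an RBF gating layer), and the two “experts” are stand-in functions rather than trained PSOM or LLM modules.

```python
import numpy as np

def gate(x, centers, beta=4.0):
    """Competitive domain weighting: soft responsibilities computed
    from squared distances to per-expert reference centers."""
    d2 = np.array([np.sum((x - c) ** 2) for c in centers])
    a = np.exp(-beta * d2)
    return a / a.sum()

# Two hypothetical experts specialized on different input domains
experts = [np.sin, np.abs]                       # stand-ins for PSOM, LLM, ... modules
centers = [np.array([-1.0]), np.array([1.0])]    # their domains of competence

x = np.array([0.9])
w = gate(x, centers)                              # responsibilities per expert
y = sum(wi * f(x) for wi, f in zip(w, experts))   # blended output
```

Near `x = 0.9` the second expert wins almost all the responsibility, so the blend is dominated by its output; between the centers the weighting crossfades smoothly, which is the behavior one wants at domain borders of an otherwise discontinuous mapping.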
9.3 Examples
The concept imposes a strong need for efficient learning algorithms: to
keep the number of required training examples manageable, those should