Page 140 - Rapid Learning in Robotics

126                                  “Mixture-of-Expertise” or “Investment Learning”


                          [Diagram: the T-Box connects $X_1$ and $X_2$; the context $c$ and the
                          parameters or weights $\omega$ are drawn above the box.]

                          Figure 9.1: The T-BOX maps between different task variable sets within a certain
                          context ($c$), describable by a set of parameters $\omega$.




                             • that the “skill” can be acquired by a “transformation box” (“T-BOX”),
                                which is a suitable building block with learning capabilities; the T-BOX
                                is responsible for the multivariate, continuous-valued mapping
                                $T: x_1 \to x_2$, transforming between the two task-variable sets
                                $x_1$ and $x_2$;

                             • the mapping “skill” T-BOX is internally modeled and determined by
                                a set of parameters $\omega$ (these can be accessed from outside the
                                “black box”, which makes the T-BOX more of an open “white box”);

                             • the correct parameterization $\omega$ changes smoothly with the context of
                                the system;

                             • the situational context can be observed and is associated with a set
                                of suitable sensor values $c$ (some of them possibly expensive and
                                temporarily unavailable);

                             • the context changes only from time to time, or on a much larger time
                                scale than the one on which the task mapping T-BOX is employed.
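
The assumptions above can be illustrated with a minimal sketch. It is not the implementation used in this work: the class `TBox`, the helper `omega_for_context`, the linear form of the mapping, and the choice of a rotation angle as context are all hypothetical, picked only so that the parameters vary smoothly with the context and remain accessible from outside.

```python
import math


class TBox:
    """Hypothetical T-BOX: a linear map x1 -> x2 with openly accessible weights."""

    def __init__(self, omega):
        # omega holds a 2x2 matrix in row-major order; it is a public attribute,
        # so the box is "white" rather than "black".
        self.omega = list(omega)

    def __call__(self, x1):
        a, b, c, d = self.omega
        return (a * x1[0] + b * x1[1], c * x1[0] + d * x1[1])


def omega_for_context(c):
    """Assumed smooth context-to-parameter map: the context c is a rotation angle."""
    return [math.cos(c), -math.sin(c), math.sin(c), math.cos(c)]


points = [(1.0, 0.0), (0.0, 1.0)]

# Parameterize once for the current context, then apply the mapping many times.
tbox = TBox(omega_for_context(0.0))
before = [tbox(p) for p in points]

# The context changes only occasionally: swap the parameters, keep the box.
tbox.omega = omega_for_context(math.pi / 2)
after = [tbox(p) for p in points]
```

In this sketch the expensive step (finding the parameters for a context) is separated from the cheap, frequently used step (applying the mapping), which mirrors the two time scales named in the last assumption.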



                             The conventional approach is to consider the joint problem of learning
                          the mapping from all relevant input values, $x_1$ and $c$, to the desired
                          output $x_2$. This leads to large, specialized networks. Their disadvantages
                          are, first, possible catastrophic interference (after-learning in a situated
                          context may affect other contexts in an uncontrolled way, see Sec. 3.2) and,
                          second, their low modularity and reusability.
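
The interference effect can be reproduced in a toy setting. The sketch below is only an illustration under stated assumptions: a small model with four weights over the joint input $(x_1, c)$ stands in for a large network, and the two target mappings, the learning rate, and all function names are hypothetical. The model could represent both contexts at once, yet after-learning on one context alone degrades the other.

```python
def predict(w, x1, c):
    # Joint model over the inputs (x1, c): features x1, x1*c, c and a bias.
    return w[0] * x1 + w[1] * x1 * c + w[2] * c + w[3]


def train(w, samples, lr=0.1, epochs=200):
    # Plain stochastic gradient descent on the squared error.
    for _ in range(epochs):
        for x1, c, x2 in samples:
            err = predict(w, x1, c) - x2
            w[0] -= lr * err * x1
            w[1] -= lr * err * x1 * c
            w[2] -= lr * err * c
            w[3] -= lr * err
    return w


def mse(w, samples):
    return sum((predict(w, x1, c) - x2) ** 2 for x1, c, x2 in samples) / len(samples)


xs = [i / 10 for i in range(-10, 11)]
ctx_a = [(x, 0.0, 2.0 * x) for x in xs]   # assumed context c = 0: x2 = 2 * x1
ctx_b = [(x, 1.0, -1.0 * x) for x in xs]  # assumed context c = 1: x2 = -x1

w = train([0.0, 0.0, 0.0, 0.0], ctx_a)
err_a_before = mse(w, ctx_a)   # close to zero after training on context A

w = train(w, ctx_b)            # after-learning with data from context B only
err_a_after = mse(w, ctx_a)    # context A is now mapped noticeably worse
err_b_after = mse(w, ctx_b)
```

The weight shared by both contexts (`w[0]`) is pulled toward context B during after-learning, which is the uncontrolled side effect described above.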