

displayed on a monitor. With a mouse-click, a user can select on the monitor some target point of the displayed table area. The goal is to move the robot end effector to the indicated position on the table. This requires computing a transformation $T: \vec x \leftrightarrow \vec u$ between coordinates $\vec u$ on the monitor (or "camera retina" coordinates) and the corresponding world coordinates $\vec x$ in the frame of reference of the robot. This transformation depends on several factors, among them the relative position between the robot and the camera. The learning task (for the later stage) is to rapidly re-learn this transformation whenever the camera has been repositioned.
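For the planar table, one concrete way to picture $T$ is a projective homography between the image plane and the table plane. The sketch below fits such a homography from four hypothetical pixel/world correspondences; it is only an illustration of the transformation to be re-estimated after each camera move, not the PSOM architecture developed here, and the names (fit_homography, apply_T) and all coordinates are made up for the example.

    import numpy as np

    def fit_homography(u_pts, x_pts):
        """Estimate H with x ~ H u (homogeneous) by the standard DLT;
        needs at least four point pairs on the table plane."""
        rows = []
        for (u, v), (x, y) in zip(u_pts, x_pts):
            rows.append([u, v, 1, 0, 0, 0, -x*u, -x*v, -x])
            rows.append([0, 0, 0, u, v, 1, -y*u, -y*v, -y])
        _, _, Vt = np.linalg.svd(np.asarray(rows, float))
        return Vt[-1].reshape(3, 3)        # null vector = flattened H

    def apply_T(H, u):
        """Map a clicked monitor pixel u = (u, v) to table coordinates."""
        p = H @ np.array([u[0], u[1], 1.0])
        return p[:2] / p[2]

    # four hypothetical calibration clicks with known table positions
    u_pts = [(102, 75), (530, 80), (515, 400), (95, 390)]      # pixels
    x_pts = [(0.0, 0.0), (0.6, 0.0), (0.6, 0.4), (0.0, 0.4)]   # metres
    H = fit_homography(u_pts, x_pts)
    target = apply_T(H, (300, 240))    # world goal for one mouse click

Refitting from fresh correspondences after every repositioning is precisely the effort that the scheme described below reduces to a single context observation.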








[Figure 9.6 diagram: the reference observation $\vec u_{\mathrm{ref}}$ (with $\vec \xi_{\mathrm{ref}}$) feeds the Meta-PSOM, whose output is the weight set $\omega$ of the T-PSOM, which maps between $\vec u$ and $\vec x$.]
Figure 9.6: Rapid learning of the 2D visuo-motor coordination for a camera in changing locations. The basis T-PSOM is capable of mapping to (and from) the Cartesian robot world coordinates $\vec x$ and the location of the end effector (here the wooden hand replica) in camera coordinates $\vec u$ (see cross mark). In the pre-training phase, nine basis mappings are learned in prototypical camera locations (chosen to lie on the depicted grid). Each mapping gets encoded in the weight parameters $\omega$ of the T-PSOM and then serves, together with the system context observation $\vec u_{\mathrm{ref}}$ (here, e.g., the cone tip), as a training vector for the Meta-PSOM.



In other words, here the T-PSOM has to represent the transformation $T: \vec x \leftrightarrow \vec u$ with the camera position as the additional context. To apply the previous scheme, we must first learn ("investment stage") the mapping $T$ for a set of prototypical contexts, i.e., camera positions.
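As a compact sketch of this two-stage scheme, the code below loops over a grid of prototype camera poses in the investment stage, stores one weight set per pose, and later adapts in one shot from a single context observation. Everything here is an assumption for illustration: camera_poses, fit_base_mapping, and observe_context are hypothetical stand-ins for the real training and observation steps, and a normalized-RBF interpolator replaces the actual Meta-PSOM; the synthetic stubs exist only so the sketch runs end to end.

    import numpy as np

    class MetaMapper:
        """Context -> weights: a normalized-RBF stand-in for the Meta-PSOM."""
        def __init__(self, contexts, weight_sets, sigma=None):
            self.C = np.asarray(contexts, float)     # (n, d) u_ref per pose
            self.W = np.asarray(weight_sets, float)  # (n, w) omega per pose
            if sigma is None:  # bandwidth heuristic; assumes >= 2 prototypes
                d = np.linalg.norm(self.C[:, None] - self.C[None], axis=-1)
                sigma = np.median(d[d > 0])
            self.sigma = sigma

        def weights_for(self, u_ref):
            """One-shot adaptation: blend the stored weight sets by how
            close the new context observation lies to each prototype."""
            d2 = np.sum((self.C - np.asarray(u_ref, float))**2, axis=1)
            a = np.exp(-d2 / (2.0 * self.sigma**2))
            return a @ self.W / a.sum()

    # hypothetical stand-ins so the sketch runs end to end
    rng = np.random.default_rng(0)
    camera_poses = [(i, j) for i in range(3) for j in range(3)]  # 3x3 grid
    def observe_context(pose):     # "cone tip" pixel position at this pose
        return np.asarray(pose, float) * 50.0 + rng.normal(0.0, 1.0, 2)
    def fit_base_mapping(pose):    # pretend result of training one T-PSOM
        return rng.normal(0.0, 1.0, 6)

    # investment stage: one learned weight set per prototype camera pose
    contexts = [observe_context(p) for p in camera_poses]
    weight_sets = [fit_base_mapping(p) for p in camera_poses]
    meta = MetaMapper(contexts, weight_sets)

    # after a later camera move: a single observation yields new weights
    omega_new = meta.weights_for(observe_context((1.4, 0.7)))

In the architecture of Figure 9.6, the blended weight set would be loaded into the T-PSOM, which then maps between $\vec u$ and $\vec x$ under the new camera pose without any retraining.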
To keep the number of prototype contexts manageable, we reduce some DOFs of the camera by requiring a fixed focal length, camera tripod height, and roll angle. To constrain the elevation and azimuth viewing angles, we choose one fixed landmark, or "fixation point" $\vec x_{\mathrm{fix}}$, somewhere centered in the region of interest. After repositioning the camera, its viewing angle