displayed on a monitor. With a mouse-click, a user can select on the monitor some target point of the displayed table area. The goal is to move the robot end effector to the indicated position on the table. This requires computing a transformation $T: \vec{u} \mapsto \vec{x}$ between coordinates $\vec{u}$ on the monitor (or "camera retina" coordinates) and the corresponding world coordinates $\vec{x}$ in the frame of reference of the robot. This transformation depends on several factors, among them the relative position between the robot and the camera. The learning task (for the later stage) is to rapidly re-learn this transformation whenever the camera has been repositioned.
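To make the role of $T$ concrete, here is a minimal Python sketch that fits a stand-in mapping from pixel to table coordinates by linear least squares. The affine-map form and all function names are illustrative assumptions; the actual system encodes $T$ in a T-PSOM.

    import numpy as np

    # Stand-in for T: fit an affine map x = [u, 1] @ A from a few paired
    # observations of pixel coordinates u and world coordinates x.
    def fit_T(u_samples, x_samples):
        u_aug = np.hstack([u_samples, np.ones((len(u_samples), 1))])
        A, *_ = np.linalg.lstsq(u_aug, x_samples, rcond=None)
        return A                      # shape (3, 2)

    def apply_T(A, u):
        return np.append(u, 1.0) @ A  # world target x for a clicked pixel u

    # Synthetic calibration pairs for one fixed camera placement (made up):
    u_obs = np.array([[10., 10.], [400., 12.], [395., 300.], [14., 305.]])
    x_obs = np.array([[0.0, 0.0], [0.8, 0.0], [0.8, 0.6], [0.0, 0.6]])
    A = fit_T(u_obs, x_obs)
    print(apply_T(A, np.array([200.0, 150.0])))

Whenever the camera moves, all pairs $(\vec{u}, \vec{x})$ change and such a map would have to be re-estimated from a full new calibration set; the scheme described below replaces that by a single observation.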
[Figure 9.6 schematic: the Meta-PSOM receives the context observation $\vec{u}_{ref}$ and supplies the weight set $\omega$ of the T-PSOM, which maps between camera coordinates $\vec{u}$ and world coordinates $\vec{x}$.]
Figure 9.6: Rapid learning of the 2D visuo-motor coordination for a camera in changing locations. The basis T-PSOM is capable of mapping to (and from) the Cartesian robot world coordinates $\vec{x}$ and the location of the end effector (here the wooden hand replica) in camera coordinates $\vec{u}$ (see cross mark). In the pre-training phase, nine basis mappings are learned in prototypical camera locations (chosen to lie on the depicted grid). Each mapping gets encoded in the weight parameters of the T-PSOM and then serves, together with the system context observation $\vec{u}_{ref}$ (here, e.g., the cone tip), as a training vector for the Meta-PSOM.
In other words, here the T-PSOM has to represent the transformation $T: \vec{u} \mapsto \vec{x}$ with the camera position as the additional context. To apply the previous scheme, we must first learn ("investment stage") the mapping $T$ for a set of prototypical contexts, i.e., camera positions.
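The following Python sketch caricatures this two-stage scheme: in the investment stage, each prototypical camera position contributes the weight set of its fitted basis mapping together with the observed context $\vec{u}_{ref}$; after a repositioning, a single observation of $\vec{u}_{ref}$ is blended into a new weight set. Inverse-distance weighting stands in for the Meta-PSOM's smooth interpolation, and all names are hypothetical.

    import numpy as np

    # Investment stage: record (context, weight set) pairs gathered at the
    # nine prototypical camera positions.
    def invest(contexts, weight_sets):
        return np.asarray(contexts, float), np.asarray(weight_sets, float)

    # Rapid re-learning: one context observation u_ref_new selects a blend
    # of the stored weight sets (crude stand-in for Meta-PSOM interpolation).
    def rapid_adapt(u_ref_new, contexts, weight_sets, eps=1e-9):
        d = np.linalg.norm(contexts - u_ref_new, axis=1)
        w = 1.0 / (d + eps)
        return (w / w.sum()) @ weight_sets   # interpolated weight vector

    # Toy data: contexts as 2D pixel positions of the reference landmark,
    # weight sets as flattened mapping parameters (both made up).
    contexts, weights = invest(np.random.rand(9, 2), np.random.rand(9, 6))
    new_weights = rapid_adapt(np.array([0.5, 0.5]), contexts, weights)

The blended weight vector directly parameterizes the basis mapping for the new camera pose, so no iterative retraining is required after the camera is moved.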
To keep the number of prototype contexts manageable, we reduce some DOFs of the camera by requiring fixed focal length, camera tripod height, and roll angle. To constrain the elevation and azimuth viewing angles, we choose one fixed landmark, or "fixation point" $\vec{x}_{fix}$, somewhere centered in the region of interest. After repositioning the camera, its viewing angle