Page 150 - Rapid Learning in Robotics


136 “Mixture-of-Expertise” or “Investment Learning”

is visible in Tab. 9.2.

9.3.3 Factorize Learning: The 3D Stereo Case

The next step is the generalization of the monocular visuo-motor map to
the stereo case of two independently movable cameras. Again, the Puma
robot is positioned behind the table and the entire scene is displayed in
two windows on a computer monitor. By mouse-pointing, the user can,
for example, select one point on the monitor and a position on a line
appearing in the other window to indicate a goal position for the robot end
effector, see Fig. 9.7. This requires computing the transformation $T$
between the combined pair of pixel coordinates $\vec{u} = (\vec{u}^L, \vec{u}^R)$
on the monitor images and the corresponding 3D world coordinates $\vec{x}$
in the robot reference frame or, alternatively, the corresponding six robot
joint angles (6 DOF). Here we demonstrate an integrated approach that offers
both solutions with the same network (see also Walter and Ritter 1996b).

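[Figure 9.7 diagram: two Meta-PSOMs (“L” and “R”), with inputs $U^L_{ref}$ and $U^R_{ref}$, provide the weights $\omega_L$ and $\omega_R$ to the T-PSOM, which connects $X$, $U$, and $\theta$.]
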
Figure 9.7: Rapid learning of the 3D visuo-motor coordination for two cameras.
The basis T-PSOM ($m = 3$) is capable of mapping to and from three coordinate
systems: Cartesian robot world coordinates, the robot joint angles (6-DOF), and
the location of the end-effector in coordinates of the two camera retinas. Since the
left and right cameras can be relocated independently, the weight set of the T-PSOM
is split, and the parts $\omega_L$, $\omega_R$ are learned in two separate Meta-PSOMs (“L” and “R”).
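
The split of the weight set described in the caption can be illustrated with a small sketch. It is not the original implementation. It assumes a 3 x 3 x 3 node grid and node vectors ordered as (x, joint angles, left pixel pair, right pixel pair); the function names `split_weights` and `merge_weights` are hypothetical. Treating the (x, joint angle) columns as the camera-independent part is likewise an assumption made for illustration.

```python
import numpy as np

N_NODES = 3 * 3 * 3                  # assumed 3 x 3 x 3 node grid
DIM_X, DIM_THETA, DIM_U = 3, 6, 2    # world position, joint angles, one camera's pixel pair
DIM_FIXED = DIM_X + DIM_THETA        # columns assumed independent of camera placement


def split_weights(w):
    """Split node vectors (x, theta, u_L, u_R) into a fixed part and two camera parts."""
    w = np.asarray(w, dtype=float)
    assert w.shape == (N_NODES, DIM_FIXED + 2 * DIM_U)
    fixed = w[:, :DIM_FIXED]
    omega_L = w[:, DIM_FIXED:DIM_FIXED + DIM_U].ravel()   # 27 * 2 = 54 numbers
    omega_R = w[:, DIM_FIXED + DIM_U:].ravel()            # 27 * 2 = 54 numbers
    return fixed, omega_L, omega_R


def merge_weights(fixed, omega_L, omega_R):
    """Reassemble the full node vectors from the three parts."""
    return np.hstack([
        fixed,
        np.asarray(omega_L).reshape(N_NODES, DIM_U),
        np.asarray(omega_R).reshape(N_NODES, DIM_U),
    ])


# Round trip on placeholder numbers (not measured data).
w_demo = np.arange(N_NODES * 13, dtype=float).reshape(N_NODES, 13)
fixed_demo, omega_L_demo, omega_R_demo = split_weights(w_demo)
w_back = merge_weights(fixed_demo, omega_L_demo, omega_R_demo)
```

In this layout, relocating one camera would only require replacing the corresponding 54-component part, while the other camera's part and the remaining columns stay untouched.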

The T-PSOM learns each individual basis mapping $T_j$ by visiting a
rectangular grid set of end-effector positions, indexed by $i$ (here a
$3 \times 3 \times 3$ grid in $\vec{x}$ of size […] cm), jointly with the
joint angle tuple $\vec{\theta}_j$ and the location in camera retina
coordinates (2D in each camera) $\vec{u}^L_j$, $\vec{u}^R_j$. Thus the
training vectors $\vec{w}_{a_i}$ for the construction of the T-PSOM are the
tuples $(\vec{x}_i, \vec{\theta}_i, \vec{u}^L_i, \vec{u}^R_i)$.
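
The assembly of the training tuples can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used on the real robot: the pinhole camera model, the camera positions, the grid center and extent, and the stand-in `fake_joint_angles` are all hypothetical. In the actual setup the joint angles and pixel coordinates would be recorded from the robot and the camera images.

```python
import itertools
import numpy as np


def grid_positions(center, extent, n=3):
    """Return an n x n x n rectangular grid of 3D positions around `center`."""
    axes = [np.linspace(c - e / 2.0, c + e / 2.0, n) for c, e in zip(center, extent)]
    return np.array(list(itertools.product(*axes)))


def pinhole(cam_pos, focal=1.0):
    """Toy camera at `cam_pos` looking along +z (hypothetical stand-in for a real camera)."""
    cam_pos = np.asarray(cam_pos, dtype=float)

    def project(x):
        p = np.asarray(x, dtype=float) - cam_pos
        return focal * p[:2] / p[2]          # 2D retina coordinates

    return project


def training_tuples(positions, joint_angles, cam_L, cam_R):
    """Stack one tuple (x, theta, u_L, u_R) per visited grid position."""
    rows = [np.concatenate([x, joint_angles(x), cam_L(x), cam_R(x)]) for x in positions]
    return np.array(rows)


def fake_joint_angles(x):
    """Placeholder for the six joint angles read from the robot at position x."""
    return np.zeros(6)


# Placeholder geometry: unit cube around the origin, cameras 3 units in front of it.
positions = grid_positions(center=(0.0, 0.0, 0.0), extent=(1.0, 1.0, 1.0))
W = training_tuples(positions, fake_joint_angles,
                    pinhole((-0.2, 0.0, -3.0)), pinhole((0.2, 0.0, -3.0)))
print(W.shape)   # (27, 13): 27 grid nodes, 3 + 6 + 2 + 2 components each
```

Passing the kinematics and the two cameras as functions keeps the tuple assembly independent of how the individual measurements are obtained.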
