Page 150 - Rapid Learning in Robotics
136 “Mixture-of-Expertise” or “Investment Learning”
is visible in Tab. 9.2.
9.3.3 Factorize Learning: The 3D Stereo Case
The next step is the generalization of the monocular visuo-motor map to
the stereo case of two independently movable cameras. Again, the Puma
robot is positioned behind the table and the entire scene is displayed in
two windows on a computer monitor. By mouse-pointing, the user can,
for example, select one point on the monitor and the position on a line
appearing in the other window, to indicate a goal position for the robot
end effector, see Fig. 9.7. This requires computing the transformation $T$
between the combined pair of pixel coordinates
$\vec{u} = (\vec{u}^L, \vec{u}^R)$ on the monitor images and the
corresponding 3D world coordinates $\vec{x}$ in the robot reference
frame or, alternatively, the corresponding six robot joint angles
$\vec{\theta}$ (6 DOF). Here we demonstrate an integrated solution,
offering both mappings with the same network (see also Walter and
Ritter 1996b).
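The idea of answering both queries with one stored mapping can be sketched as associative completion: fix the known components of a combined vector and read off the rest. The sketch below is a deliberately simplified stand-in, not the PSOM itself: it picks the nearest stored tuple instead of interpolating on a continuous manifold. The function name `complete`, the index layout, and all numbers are made up for illustration.

```python
def complete(stored, query, known):
    """Return the stored tuple that best matches `query` on the `known` indices.

    Simplification: a discrete nearest-neighbour lookup stands in for the
    continuous best-match search that a PSOM performs.
    """
    def dist(w):
        return sum((w[k] - query[k]) ** 2 for k in known)
    return min(stored, key=dist)


# Assumed layout of one 13-component tuple:
# indices 0-2 hold x, 3-8 hold theta, 9-12 hold (u^L, u^R).
X, THETA, U = range(0, 3), range(3, 9), range(9, 13)

# Toy data, invented for this example.
stored = [
    (0.0, 0.0, 0.0,  0.0, 0.1, 0.2, 0.3, 0.4, 0.5,  10.0, 20.0, 14.0, 20.0),
    (0.4, 0.0, 0.0,  0.6, 0.1, 0.2, 0.3, 0.4, 0.5,  30.0, 20.0, 34.0, 20.0),
    (0.0, 0.4, 0.3,  0.0, 0.9, 0.7, 0.3, 0.4, 0.5,  10.0, 45.0, 13.0, 45.0),
]

# Query 1: pixel coordinates are known; read off world position and joint angles.
query = [None] * 13
query[9:13] = [29.0, 21.0, 33.0, 19.0]
match = complete(stored, query, U)
x_goal, theta_goal = match[0:3], match[3:9]

# Query 2: world position is known; read off the expected pixel coordinates.
query2 = [None] * 13
query2[0:3] = [0.0, 0.4, 0.3]
u_expected = complete(stored, query2, X)[9:13]
```

Only the set of indices passed as `known` changes between the two queries; the stored data and the lookup routine stay the same, which mirrors the "one network, several mappings" point in a very reduced form.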
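[Figure 9.7 (diagram): Meta-PSOM “L” (input $\vec{u}^L_{ref}$, output
weights $\omega_L$) and Meta-PSOM “R” (input $\vec{u}^R_{ref}$, output
weights $\omega_R$) feed the T-PSOM, which connects $\vec{u}$,
$\vec{x}$, and $\vec{\theta}$.]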
Figure 9.7: Rapid learning of the 3D visuo-motor coordination for two cameras.
The basis T-PSOM ($m = 3$) is capable of mapping to and from three coordinate
systems: Cartesian robot world coordinates, the robot joint angles (6 DOF), and
the location of the end effector in coordinates of the two camera retinas. Since
the left and right cameras can be relocated independently, the weight set of the
T-PSOM is split, and the parts $\omega_L$, $\omega_R$ are learned in two separate
Meta-PSOMs (“L” and “R”).
The T-PSOM learns each individual basis mapping $T_j$ by visiting a
rectangular grid of end effector positions $\vec{x}_i$ (here a
$3 \times 3 \times 3$ grid in $\vec{x}$ of size [...] cm$^3$) jointly
with the joint angle tuple $\vec{\theta}_j$ and the location in camera
retina coordinates (2D in each camera) $\vec{u}^L_j$, $\vec{u}^R_j$.
Thus the training vectors $\vec{w}_{a_i}$ for the construction of the
T-PSOM are the tuples
$(\vec{x}_i, \vec{\theta}_i, \vec{u}^L_i, \vec{u}^R_i)$.
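As a rough illustration of how such training tuples could be assembled, the following sketch visits a 3 x 3 x 3 grid and concatenates position, joint angles, and the two 2D image locations into one 13-component vector (3 + 6 + 2 + 2). Everything beyond that layout is an assumption made for the example: the grid extents, the camera positions, the idealized `pinhole` projection, and the `read_joint_angles` stub, which in a real setup would be replaced by values measured on the robot.

```python
from itertools import product


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]


def pinhole(x, cam_pos, focal=1.0):
    """Hypothetical camera model: unrotated pinhole looking along +y."""
    dx = x[0] - cam_pos[0]
    depth = x[1] - cam_pos[1]
    dz = x[2] - cam_pos[2]
    return (focal * dx / depth, focal * dz / depth)


def read_joint_angles(x):
    """Stub for the six joint angles at position x (placeholder zeros)."""
    return (0.0,) * 6


def build_training_tuples(lo, hi, cam_left, cam_right, n=3):
    """Visit an n x n x n grid and return one tuple (x, theta, u^L, u^R) per node."""
    axes = [linspace(lo[d], hi[d], n) for d in range(3)]
    tuples = []
    for x in product(*axes):
        theta = read_joint_angles(x)
        u_left = pinhole(x, cam_left)
        u_right = pinhole(x, cam_right)
        tuples.append(x + theta + u_left + u_right)
    return tuples


# Placeholder extents and camera positions, not taken from the text.
training = build_training_tuples(
    lo=(0.0, 0.0, 0.0), hi=(1.0, 1.0, 1.0),
    cam_left=(-0.5, -2.0, 0.5), cam_right=(0.5, -2.0, 0.5),
)
```

The grid yields 27 tuples, one per node. Relocating a camera changes only the image components of each tuple, which is consistent with the weight set being split into a left and a right part.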