freedom (3+3 DOF, translation plus rotation). We split the problem into two sub-tasks: (i) locating and tracking the object center (a 2 DOF problem)1, and (ii) finding the orientation together with the depth of the seen object. This factorization is advantageous because it reduces the number of training examples needed.
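
The gain can be made concrete with a back-of-the-envelope count (a hypothetical illustration assuming a grid of n training nodes per free variable; the node counts actually used in Tab. 7.1 differ): a single map over all six DOFs would need n^6 training vectors, while the factorized scheme needs only n^2 + n^4. For n = 3,

\[
3^6 = 729 \qquad \text{versus} \qquad 3^2 + 3^4 = 90,
\]

a reduction by roughly a factor of eight.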


Figure 7.2: The (φ, θ, ψ, z) system. (a) The cubical test object seen by the camera when rotated and shifted at several depths z (φ=10°, θ=20°, ψ=30°, z=2…6L; cube size L). (b–d) 0°, 20°, and 30° rotations in the roll φ, pitch θ, and yaw ψ system, each panel plotted in the image coordinates u_x versus u_y. (The transformations are applied from right to left.)
Here we demonstrate a solution of the second sub-task, the part involving four independent variables (DOFs). Eight virtual sensors detect the corners of a test cube, seen in a perspective camera view of this object. Fig. 7.2 illustrates the parameterization of the object pose in the depth z and the three unit rotations in the roll φ, pitch θ, and yaw ψ angle system (see e.g. Fu et al. 1987). The embedding space X is spanned by the variables x = (φ, θ, ψ, z, u_{P_1}, u_{P_2}, …, u_{P_8}), where u_{P_i} is the signal of sensor i, here the image coordinate pair of point P_i.
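
For concreteness, the following sketch generates such embedding vectors from a pinhole camera model. It is a minimal illustration, not the experimental code: the focal length f, the cube size L, the camera placement on the optical axis, and the convention R = R_z(φ) R_y(θ) R_x(ψ) (factors applied from right to left, cf. Fig. 7.2) are all assumptions.

```python
import numpy as np

def rpy(phi, theta, psi):
    """Roll-pitch-yaw rotation R = Rz(phi) @ Ry(theta) @ Rx(psi);
    the factors act from right to left (cf. Fu et al. 1987)."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    Rz = np.array([[cf, -sf, 0.0], [sf, cf, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[ct, 0.0, st], [0.0, 1.0, 0.0], [-st, 0.0, ct]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return Rz @ Ry @ Rx

def embedding_vector(phi, theta, psi, z, L=1.0, f=1.0):
    """Build x = (phi, theta, psi, z, u_P1, ..., u_P8): the four pose
    variables followed by the perspective image coordinates of the
    eight cube corners (illustrative L and f, not from the text)."""
    corners = 0.5 * L * np.array([(sx, sy, sz)
                                  for sx in (-1, 1)
                                  for sy in (-1, 1)
                                  for sz in (-1, 1)])
    # Rotate the corners, then shift the cube to depth z on the optical axis.
    cam = corners @ rpy(phi, theta, psi).T + np.array([0.0, 0.0, z])
    u = f * cam[:, :2] / cam[:, 2:3]        # pinhole projection (u_x, u_y)
    return np.concatenate(([phi, theta, psi, z], u.ravel()))
```

Sampling embedding_vector on a small grid of pose values (e.g. three nodes per axis) yields a training set of the kind described above.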
Tab. 7.1 presents the accuracy achieved in recovering the object pose for a variety of experimental set-up parameters. Varying the ranges and the number of training vectors demonstrates how the identification precision depends on the range of the involved free variables and on the number of data vectors invested in training.
With the same trained PSOM, the system can predict the locations of occluded object parts once a sufficient number of points has been found. These hypotheses could be fed back to the perceptual processing stages, e.g. to “look closer” at the predicted sub-regions of the image.
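
The PSOM performs this completion associatively on its learned manifold. As a rough functional analogue only (not the method of the text), the sketch below recovers the pose from the visible corners by a nonlinear least-squares fit through the forward model embedding_vector of the previous listing and then reads off the predicted image coordinates of the occluded corners; the SciPy-based routine and all names are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def complete_occluded(u_visible, visible_idx, x0=(0.0, 0.0, 0.0, 4.0)):
    """Estimate (phi, theta, psi, z) from k >= 2 visible corner
    projections, then predict the image coordinates of the occluded
    corners. Reuses embedding_vector() from the sketch above."""
    u_visible = np.asarray(u_visible, dtype=float)   # shape (k, 2)
    idx = list(visible_idx)

    def residual(pose):
        # Reprojection error on the visible corners only.
        u_all = embedding_vector(*pose)[4:].reshape(8, 2)
        return (u_all[idx] - u_visible).ravel()

    fit = least_squares(residual, x0)        # pose explaining the found points
    u_all = embedding_vector(*fit.x)[4:].reshape(8, 2)
    occluded = [i for i in range(8) if i not in idx]
    return fit.x, u_all[occluded]            # pose estimate, predicted points
```

Two visible corners already supply four constraints for the four pose unknowns; with more points the fit is overdetermined, and the predicted positions of the missing corners could direct a renewed search in the image.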

1 Walter and Ritter (1996e) describe the experiment of Cartesian tracking of a freely movable sphere in 3D. The ball is presented by a human and tracked in 3D in real time by the Puma robot, using the end-effector-based vision system (camera-in-hand configuration).