freedom (3+3 DOF, translation plus rotation). We split the problem into two sub-tasks: (i) locating and tracking the object center (a 2 DOF problem)1, and (ii) finding the orientation together with the depth of the seen object. This factorization is advantageous because it reduces the number of training examples needed.
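
The gain can be made concrete with a back-of-the-envelope count (a hypothetical illustration assuming a grid of n training nodes per free variable; the node counts actually used in Tab. 7.1 differ): a single map over all six DOFs would need n^6 training vectors, while the factorized scheme needs only n^2 + n^4. For n = 3,

\[
3^6 = 729 \qquad \text{versus} \qquad 3^2 + 3^4 = 90,
\]

a reduction by roughly a factor of eight.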


Figure 7.2: The (φ, θ, ψ, z) system. (a) The cubical test object seen by the camera when rotated and shifted at several depths z (φ=10°, θ=20°, ψ=30°, z=2…6L; cube size L). (b–d) 0°, 20°, and 30° rotations in the roll φ, pitch θ, and yaw ψ system, each panel plotted in the image coordinates u_x versus u_y. (The transformations are applied from right to left.)
Here we demonstrate a solution of the second sub-task, the part involving four independent variables (DOFs). Eight virtual sensors detect the corners of a test cube, seen in a perspective camera view of this object. Fig. 7.2 illustrates the parameterization of the object pose in the depth z and the three unit rotations in the roll φ, pitch θ, and yaw ψ angle system (see e.g. Fu et al. 1987). The embedding space X is spanned by the variables x = (φ, θ, ψ, z, u_{P_1}, u_{P_2}, …, u_{P_8}), where u_{P_i} is the signal of sensor i, here the image coordinate pair of point P_i.
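
For concreteness, the following sketch generates such embedding vectors from a pinhole camera model. It is a minimal illustration, not the experimental code: the focal length f, the cube size L, the camera placement on the optical axis, and the convention R = R_z(φ) R_y(θ) R_x(ψ) (factors applied from right to left, cf. Fig. 7.2) are all assumptions.

```python
import numpy as np

def rpy(phi, theta, psi):
    """Roll-pitch-yaw rotation R = Rz(phi) @ Ry(theta) @ Rx(psi);
    the factors act from right to left (cf. Fu et al. 1987)."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    Rz = np.array([[cf, -sf, 0.0], [sf, cf, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[ct, 0.0, st], [0.0, 1.0, 0.0], [-st, 0.0, ct]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return Rz @ Ry @ Rx

def embedding_vector(phi, theta, psi, z, L=1.0, f=1.0):
    """Build x = (phi, theta, psi, z, u_P1, ..., u_P8): the four pose
    variables followed by the perspective image coordinates of the
    eight cube corners (illustrative L and f, not from the text)."""
    corners = 0.5 * L * np.array([(sx, sy, sz)
                                  for sx in (-1, 1)
                                  for sy in (-1, 1)
                                  for sz in (-1, 1)])
    # Rotate the corners, then shift the cube to depth z on the optical axis.
    cam = corners @ rpy(phi, theta, psi).T + np.array([0.0, 0.0, z])
    u = f * cam[:, :2] / cam[:, 2:3]        # pinhole projection (u_x, u_y)
    return np.concatenate(([phi, theta, psi, z], u.ravel()))
```

Sampling embedding_vector on a small grid of pose values (e.g. three nodes per axis) yields a training set of the kind described above.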
Tab. 7.1 presents the accuracy achieved in recovering the object pose for a variety of experimental set-up parameters. Varying the ranges and the number of training vectors demonstrates how the identification precision depends on the range of the involved free variables and on the number of data vectors invested in training.
With the same trained PSOM, the system can predict the locations of occluded object parts once a sufficient number of points has been found. These hypotheses could be fed back to the perceptual processing stages, e.g. to “look closer” at the predicted sub-regions of the image.
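
The PSOM performs this completion associatively on its learned manifold. As a rough functional analogue only (not the method of the text), the sketch below recovers the pose from the visible corners by a nonlinear least-squares fit through the forward model embedding_vector of the previous listing and then reads off the predicted image coordinates of the occluded corners; the SciPy-based routine and all names are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def complete_occluded(u_visible, visible_idx, x0=(0.0, 0.0, 0.0, 4.0)):
    """Estimate (phi, theta, psi, z) from k >= 2 visible corner
    projections, then predict the image coordinates of the occluded
    corners. Reuses embedding_vector() from the sketch above."""
    u_visible = np.asarray(u_visible, dtype=float)   # shape (k, 2)
    idx = list(visible_idx)

    def residual(pose):
        # Reprojection error on the visible corners only.
        u_all = embedding_vector(*pose)[4:].reshape(8, 2)
        return (u_all[idx] - u_visible).ravel()

    fit = least_squares(residual, x0)        # pose explaining the found points
    u_all = embedding_vector(*fit.x)[4:].reshape(8, 2)
    occluded = [i for i in range(8) if i not in idx]
    return fit.x, u_all[occluded]            # pose estimate, predicted points
```

Two visible corners already supply four constraints for the four pose unknowns; with more points the fit is overdetermined, and the predicted positions of the missing corners could direct a renewed search in the image.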

1 Walter and Ritter (1996e) describe the experiment of Cartesian tracking of a freely movable sphere in 3D. The ball is presented by a human and tracked in 3D in real time by the Puma robot, using the end-effector-based vision system (camera-in-hand configuration).