

7.3 Low Level Vision Domain: a Finger Tip Location Finder


So far, we have been investigating PSOMs for learning tasks in the context
of well pre-processed data representing clearly defined values and quantities.
In the vision domain, such values are the results of low-level processing
stages, where one deals with extremely high-dimensional data. In many
cases, it is doubtful to what extent smoothness assumptions are valid at
all.
   Still, there are many situations in which one would like to compute
from an image some low-dimensional parameter vector, such as a set of
parameters describing the location, orientation or shape of an object, or
properties of the ambient illumination etc. If the imaging conditions are
suitably restricted, the input images may be samples that are represented
as vectors in a very high-dimensional vector space, but that are concentrated
on a much lower-dimensional sub-manifold, the dimensionality of which is
given by the number of independently varying parameters of the image ensemble.
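
To make this manifold picture concrete, the following sketch (not from the book; it uses
synthetic blob images as a stand-in for the hand images, and a simple linear PCA check
rather than a PSOM) renders a small image ensemble governed by two parameters and inspects
its principal components. Although each image is a 1024-dimensional pixel vector, far fewer
than 1024 directions carry appreciable variance, reflecting the small number of independently
varying parameters of the ensemble.

    # Minimal sketch, assuming a synthetic ensemble: images vary with only
    # two parameters (blob position), so the 1024-dim pixel vectors
    # concentrate near a low-dimensional sub-manifold.
    import numpy as np

    H = W = 32                      # each image is a H*W = 1024-dim vector

    def render(cx, cy):
        """Render a soft blob centered at (cx, cy); cx and cy play the role
        of the independently varying parameters of the image ensemble."""
        ys, xs = np.mgrid[0:H, 0:W]
        return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 20.0).ravel()

    # Sample the ensemble on a grid of the two underlying parameters.
    params = [(cx, cy)
              for cx in np.linspace(8, 24, 10)
              for cy in np.linspace(8, 24, 10)]
    X = np.stack([render(cx, cy) for cx, cy in params])   # shape (100, 1024)

    # Singular values of the centered data show how many directions carry
    # most of the variance -- far fewer than the 1024 pixel dimensions.
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(explained, 0.95)) + 1
    print(f"pixel dimension: {H * W}, components for 95% variance: {k}")

A linear PCA of course only gives a rough upper bound; the underlying manifold is nonlinear,
which is exactly the kind of smooth, parametrized structure the PSOM is intended to exploit.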
   A frequently occurring task of this kind is to identify and mark a
particular part of an object in an image, as we already encountered in the
previous example with the determination of the cube corners. As a further
example, in face recognition it is important to identify the locations of
salient facial features, such as the eyes or the tip of the nose. Another
interesting task is to identify the locations of the limb joints of humans
for the analysis of body gestures. In the following, we report on a third
application domain, the identification of finger tip locations in images of
human hands (Walter and Ritter 1996d). This would constitute a useful
preprocessing step for inferring 3D hand postures from images, and could
help to enhance the accuracy and robustness of other, more direct approaches
to this task that are based on LLM-networks (Meyering and Ritter 1992).

                             For the results reported here, we used a restricted ensemble of hand
                          postures. The main degree of freedom of a hand is its degree of “closure”.
                          Therefore, for the initial experiments we worked with an image set com-
                          prising grips in which all fingers are flexed by about the same amount,
                          varying from fully flexed to fully extended. In addition, we consider ro-
                          tation of the hand about its arm axis. These two basic degrees of freedom
yield a two-dimensional image ensemble (i.e., for the dimension m of the
map manifold we have m = 2). The objective is to construct a PSOM that