Page 116 - Rapid Learning in Robotics
102 Application Examples in the Vision Domain
7.3 Low Level Vision Domain: a Finger Tip Location Finder
So far, we have been investigating PSOMs for learning tasks in the context
of well pre-processed data representing clearly defined values and quanti-
ties. In the vision domain, those values are results of low level processing
stages where one deals with extremely high-dimensional data. In many
cases, it is doubtful to what extent smoothness assumptions are valid at
all.
Still, there are many situations in which one would like to compute
from an image some low-dimensional parameter vector, such as a set of
parameters describing location, orientation or shape of an object, or prop-
erties of the ambient illumination etc. If the image conditions are suitably
restricted, the input images may be samples that are represented as vec-
tors in a very high dimensional vector space, but that are concentrated on
a much lower dimensional sub-manifold, the dimensionality of which is
given by the independently varying parameters of the image ensemble.
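The following minimal sketch illustrates this idea on synthetic data. It is not taken from the experiments reported here: the blob renderer `render_blob`, the estimator `centroid`, the 16 x 16 image size and the blob width are all hypothetical choices made for illustration. Each image is a 256-dimensional vector, yet the whole ensemble is generated by only two independently varying parameters (the blob position), which a simple intensity-weighted centroid can recover from the pixels.

```python
import numpy as np

SIZE = 16  # hypothetical image side length, giving 256-dimensional image vectors


def render_blob(cx, cy, size=SIZE, sigma=1.5):
    """Render a size x size image of a Gaussian blob centred at (cx, cy)."""
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def centroid(image):
    """Estimate the blob location as the intensity-weighted mean pixel position."""
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    total = image.sum()
    return (xs * image).sum() / total, (ys * image).sum() / total


# Two independently varying parameters (cx, cy) generate the whole ensemble.
positions = np.linspace(5.0, 10.0, 6)
params = [(cx, cy) for cx in positions for cy in positions]

# Each row is one image flattened into a high-dimensional vector.
ensemble = np.stack([render_blob(cx, cy).ravel() for cx, cy in params])

# Recover the low-dimensional parameter vector from each image vector.
recovered = [centroid(vec.reshape(SIZE, SIZE)) for vec in ensemble]
max_error = max(
    abs(rx - cx) + abs(ry - cy)
    for (rx, ry), (cx, cy) in zip(recovered, params)
)

print("ensemble shape:", ensemble.shape)
print("largest location error in pixels:", max_error)
```

In this toy setting the mapping from image to parameters is known in closed form. For real images such as the hand postures considered below, no such formula is available, which is why the mapping is learned from examples instead.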
A frequently occurring task of this kind is to identify and mark a par-
ticular part of an object in an image, as we already met in the previous
example for determination of the cube corners. As a further example, in
face recognition it is important to identify the locations of salient facial
features, such as eyes or the tip of the nose. Another interesting task is to
identify the location of the limb joints of humans for analysis of body gestures.
In the following, we report on a third application domain,
the identification of finger tip locations in images of human hands (Walter
and Ritter 1996d). This would constitute a useful preprocessing step for
inferring 3D hand postures from images, and could help to enhance the
accuracy and robustness of other, more direct approaches to this task that
are based on LLM-networks (Meyering and Ritter 1992).
For the results reported here, we used a restricted ensemble of hand
postures. The main degree of freedom of a hand is its degree of “closure”.
Therefore, for the initial experiments we worked with an image set com-
prising grips in which all fingers are flexed by about the same amount,
varying from fully flexed to fully extended. In addition, we consider ro-
tation of the hand about its arm axis. These two basic degrees of freedom
yield a two-dimensional image ensemble (i.e., for the dimension m of the
map manifold we have m = 2). The objective is to construct a PSOM that