Page 143 - Dynamic Vision for Perception and Control of Motion
ing on the aspect conditions. The shape invariance in this case can be captured only by using 3-D shape models and models for mapping by central projection. This approach is much better suited for visually recognizing the environment during egomotion and for tracking other (massive) objects over time than for single-snapshot interpretation. This is true since massive bodies move smoothly over time, and invariance properties with respect to time, such as eigenfrequencies, damping, and stereotypical motion characteristics (like style of walking), may be exploited as knowledge about specific objects/subjects in the real world. Therefore, embedding the image analysis task in a temporal continuum and exploiting known motion characteristics in an object-oriented way alleviates the image sequence interpretation task (an extended idea of gestalt). It requires, however, that the internal representation be in four dimensions right from the beginning: in 3-D space and time for single objects. This is the essence of the 4-D approach to dynamic machine vision developed in the early 1980s [Meissner, Dickmanns 1983; Dickmanns 1987; Wünsche 1987].
By embedding (a) simple feature extraction with linear edge elements, (b) regions with linear shading models, and (c) horizontal and vertical image stripes into this framework of spatio-temporal object orientation, these methods gain considerably in power and useful range of application. A knowledge base on dynamic motion, derived from previous experience in observing the motion of specific 3-D objects carrying highly visible features on their surfaces, considerably alleviates scene understanding. Specific groups of linearly extended edge feature sets and adjacent homogeneous areas of gray, color, or (in the future) texture values are interpreted as originating from these spatial objects under specific aspect conditions. This background, besides their robustness to changing lighting conditions, may also be one of the reasons that edge element operators abound in highly developed biological vision systems (like the mammalian ones) [Hubel, Wiesel 1962; Koenderink, van Doorn 1990]. Without 3-D invariance and without knowledge about motion processes and about perspective projection (implicit or explicit), the situation would be quite different with respect to the usefulness of these operators.
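The flavor of such a linearly extended edge-element operator can be sketched in a few lines. The mask, stripe contents, and sizes below are illustrative assumptions for this sketch, not the operators actually used in the systems described here:

```python
import numpy as np

# Illustrative sketch only: an antisymmetric edge-element mask correlated
# along a horizontal image stripe (all values are made-up examples).
mask = np.array([-1.0, -1.0, 0.0, 1.0, 1.0])

# A stripe of gray values with a step (an "edge") between indices 19 and 20.
stripe = np.concatenate([np.full(20, 80.0), np.full(20, 160.0)])

# The correlation response is largest where the mask straddles the step.
response = np.correlate(stripe, mask, mode="valid")
edge_element = int(np.argmax(np.abs(response)))
```

Because the mask sums to zero, a uniform brightness offset over the stripe leaves the response unchanged, which is one reason such operators tolerate changing illumination.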
The edge-based approach has an advantage over region-based approaches when invariance under varying lighting conditions is considered. Even though the intensities and color values in adjacent image regions may change differently over time, the position of the boundary between them does not, and the edge remains visible as the locus of the highest intensity or color gradient. In natural environments, changing lighting conditions are the rule rather than the exception. Therefore, Section 5.2 will be devoted to edge-based methods.
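This boundary invariance is easy to demonstrate numerically. The following sketch uses a made-up 1-D intensity profile and made-up illumination changes; it is an illustration of the principle, not an operator from this book:

```python
import numpy as np

def edge_position(profile):
    """Locate the edge as the locus of the highest intensity gradient."""
    return int(np.argmax(np.abs(np.diff(profile))))

# Made-up profile: two adjacent gray-value regions meeting at a boundary.
profile = np.concatenate([np.full(50, 60.0), np.full(50, 140.0)])

# Lighting change affecting the two regions differently.
relit = profile.copy()
relit[:50] *= 1.5   # left region brightens multiplicatively
relit[50:] += 10.0  # right region shifts additively

# The intensities changed, but the locus of the maximum gradient did not.
assert edge_position(profile) == edge_position(relit)
```

The two regions here change by different amounts and in different ways, yet the gradient maximum stays at the same pixel as long as some contrast across the boundary survives.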
However, for robust interpretation of complex images, region-based image evaluation is advantageous. Since today's processors do not allow full-scale area-based processing of images in real time, a compromise has to be sought. Some aspects of region-based image evaluation may be exploited by confining the regional operations to the vicinity of edges. This is done in conjunction with the edge-based approach, and it can help in establishing feature correspondence for object recognition using a knowledge base and in detecting occlusions by other objects.
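A minimal sketch of this compromise, evaluating region statistics only in a small stripe around an already-located edge; the profile values, stripe width, and function name are illustrative assumptions:

```python
import numpy as np

def region_stats_near_edge(profile, edge_idx, half_width=5):
    """Mean gray value on either side of an edge, evaluated only in a
    small stripe around it instead of over the full image region."""
    left = profile[max(0, edge_idx - half_width + 1):edge_idx + 1]
    right = profile[edge_idx + 1:edge_idx + 1 + half_width]
    return float(left.mean()), float(right.mean())

# Made-up profile with a boundary between gray levels 60 and 140.
profile = np.concatenate([np.full(50, 60.0), np.full(50, 140.0)])
left_mean, right_mean = region_stats_near_edge(profile, 49)
```

The two side means characterize the adjacent regions at a fraction of the cost of full-area processing; if an expected contrast between the sides is missing, this can hint at occlusion by another object.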
A second step toward including area-based information in the 4-D scheme under
the constraint of limited computing power is to confine the evaluation areas to

