Dynamic Vision for Perception and Control of Motion, p. 143

5.1 Visual Features


ing on the aspect conditions. The shape invariance in this case can be captured only by using 3-D shape models and models for mapping by central projection. This approach is much better suited for visually recognizing the environment during egomotion and for tracking other (massive) objects over time than for single-snapshot interpretation. This is so because massive bodies move smoothly over time, and invariance properties with respect to time, such as eigenfrequencies, damping, and stereotypical motion characteristics (like style of walking), may be exploited as knowledge about specific objects/subjects in the real world. Therefore, embedding the image analysis task in a temporal continuum and exploiting known motion characteristics in an object-oriented way eases the image sequence interpretation task (an extended idea of gestalt). It requires, however, that the internal representation be four-dimensional right from the beginning: in 3-D space and time for single objects. This is the essence of the 4-D approach to dynamic machine vision developed in the early 1980s [Meissner, Dickmanns 1983; Dickmanns 1987; Wünsche 1987].
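The core of the 4-D idea can be illustrated with a minimal sketch (a hypothetical toy example, not the original implementation): an object state is maintained in 3-D space and time, predicted forward with a known motion model, and only the projection of that prediction determines where image features are searched. The frame rate, camera model, and numbers below are all assumptions for illustration.

```python
import numpy as np

dt = 0.04  # assumed frame interval (25 Hz video)

# State in the scene (not the image): position (x, y, z) and velocity (vx, vy, vz).
x = np.array([10.0, 0.0, 1.0, -2.0, 0.0, 0.0])

# Constant-velocity transition matrix: position += velocity * dt.
F = np.eye(6)
F[0:3, 3:6] = dt * np.eye(3)

def predict(state):
    """Temporal prediction step: exploit known motion continuity over time."""
    return F @ state

def project(state, f=1.0):
    """Central projection of the predicted 3-D position into the image plane
    (camera assumed to look along the +x axis, focal length f)."""
    X, Y, Z = state[0], state[1], state[2]
    return np.array([f * Y / X, f * Z / X])

x_pred = predict(x)
u, v = project(x_pred)
# Features are then searched only near (u, v) instead of in the whole image.
```

The measurement residual at (u, v) would then update the spatial state recursively, frame after frame, rather than each image being interpreted from scratch.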
  By embedding (a) simple feature extraction with linear edge elements, (b) regions with linear shading models, and (c) horizontal and vertical image stripes into this framework of spatiotemporal object orientation, these methods gain considerably in power and in their useful range of application. Scene understanding is eased considerably by exploiting a knowledge base on dynamic motion, derived from previous experience of observing the motion of specific 3-D objects that carry highly visible features on their surface. Specific groups of linearly extended edge feature sets and adjacent homogeneous areas of gray or color values (or, in the future, texture) are interpreted as originating from these spatial objects under specific aspect conditions. This background may also be one of the reasons, besides their robustness to changing lighting conditions, that edge element operators abound in highly developed biological vision systems such as the mammalian one [Hubel, Wiesel 1962; Koenderink, van Doorn 1990]. Without 3-D invariance, and without knowledge about motion processes and about perspective projection (implicit or explicit), the situation would be quite different with respect to the usefulness of these operators.
  The edge-based approach has an advantage over region-based approaches when invariance under varying lighting conditions is considered. Even though the intensities and color values of adjacent image regions may change differently over time, the position of the boundary between them does not, and the edge remains visible as the locus of the highest intensity or color gradient. In natural environments, changing lighting conditions are the rule rather than the exception. Therefore, Section 5.2 will be devoted to edge-based methods.
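The invariance argument can be checked on synthetic data (a toy sketch with assumed intensity values): two adjacent regions change brightness differently between frames, yet the position of the maximum intensity gradient, the edge locus, stays put.

```python
import numpy as np

def edge_locus(profile):
    """Index of the largest absolute intensity gradient in a 1-D profile."""
    grad = np.diff(profile.astype(float))
    return int(np.argmax(np.abs(grad)))

edge_at = 50
cols = np.arange(100)

# First frame: dark region (80) left of the edge, bright region (160) right of it.
profile_t0 = np.where(cols < edge_at, 80.0, 160.0)

# Later frame: the left region brightens while the right one darkens,
# i.e., the two regions change differently with time.
profile_t1 = np.where(cols < edge_at, 120.0, 130.0)

print(edge_locus(profile_t0), edge_locus(profile_t1))  # same locus in both frames
```

A region-based tracker comparing the absolute gray values between the frames would see large changes; the gradient maximum, by contrast, moves only if the boundary itself moves.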
  However, for robust interpretation of complex images, region-based image evaluation is advantageous. Since today's processors do not allow full-scale area-based processing of images in real time, a compromise has to be sought. Some aspects of region-based image evaluation may be exploited by confining the regional operations to the vicinity of edges. This is done in conjunction with the edge-based approach, and it can help in establishing feature correspondence for object recognition using a knowledge base and in detecting occlusions by other objects.
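Confining regional operations to the vicinity of edges might look as follows (a hypothetical sketch; the function name, band width, and image values are assumptions): instead of processing the full image, only a narrow band of pixels on each side of a detected edge column is averaged, yielding two cheap region descriptors for the adjacent areas.

```python
import numpy as np

def region_means_at_edge(image, edge_col, half_width=3):
    """Mean gray values in narrow bands just left and right of an edge column.

    Only 2 * half_width columns per edge are touched, so the cost stays far
    below full-frame region processing.
    """
    left = image[:, edge_col - half_width:edge_col]
    right = image[:, edge_col:edge_col + half_width]
    return float(left.mean()), float(right.mean())

# Synthetic 20 x 40 image: dark region left of column 25, bright region right.
img = np.full((20, 40), 60.0)
img[:, 25:] = 180.0

print(region_means_at_edge(img, 25))  # → (60.0, 180.0)
```

Comparing such side-of-edge means between frames, or between an object model and the image, is one way the feature correspondence and occlusion checks mentioned above could be supported at low computational cost.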
  A second step toward including area-based information in the 4-D scheme under the constraint of limited computing power is to confine the evaluation areas to