  The results of all of these single-object recognition processes have to be presented to the situation assessment level in unified form so that relative motion between objects and movements of subjects can be appreciated on a larger spatial and temporal scale. The dynamic object database (DOB) solves this task. On the situation level, working on huge volumes of image data is no longer possible; therefore, the DOB also serves to represent the recognized scene in an object-oriented, symbolic way. Figure 5.1 shows the three levels of image sequence processing and understanding. The results of the right-hand branch of level 1 are fed into this scheme to provide background information on lighting and other environmental conditions.
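  To make the object-oriented, symbolic representation concrete, the following minimal sketch shows what a single DOB entry might look like; the field names and layout are illustrative assumptions, not the actual DOB implementation. The essential point is that only compact state estimates and short state histories, rather than raw image data, are handed up to the situation assessment level.

    from dataclasses import dataclass, field

    @dataclass
    class DOBEntry:
        """One symbolic entry of a dynamic object database (illustrative sketch)."""
        object_id: int        # unique handle for tracking over time
        object_class: str     # e.g., "car", "lane marking", "pedestrian"
        timestamp: float      # time of the last state update [s]
        state: list           # estimated state, e.g., [x, y, heading, speed]
        covariance: list      # uncertainty of the state estimate
        # recent state history, so relative motion can be judged
        # on a larger temporal scale
        trajectory: list = field(default_factory=list)

        def update(self, t: float, new_state: list, new_cov: list) -> None:
            """Archive the previous estimate and store the new one."""
            self.trajectory.append((self.timestamp, self.state))
            self.timestamp, self.state, self.covariance = t, new_state, new_cov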
  The situation to be assessed on the decision level has to include all of this as well as the trajectory planned for the subject body in the near future. Both safety aspects and mission goals have to be taken into account here; a selection has to be made between more and less relevant objects/subjects by judging hazard potentials from their trajectories and behaviors (a simple sketch of such a ranking follows below). This challenge will be discussed in Chapter 13. Figure 5.1, visualizing the stages mentioned for visual dynamic scene interpretation, will be discussed in more detail after the foundations for feature extraction and object/subject recognition have been laid.
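  How hazard potentials can be judged from trajectories and behaviors is the subject of Chapter 13; purely as an illustration of such a ranking, the sketch below orders objects by time to collision, one common proxy for hazard. The function names and the choice of measure are assumptions made for this example, not the method of Chapter 13.

    def time_to_collision(range_m: float, closing_speed_mps: float) -> float:
        """Seconds until contact; infinite if the gap is opening."""
        if closing_speed_mps <= 0.0:
            return float("inf")
        return range_m / closing_speed_mps

    def rank_by_hazard(objects):
        """Sort (label, range_m, closing_speed_mps) tuples, most hazardous first."""
        return sorted(objects, key=lambda o: time_to_collision(o[1], o[2]))

    # The car closing at 5 m/s from 40 m (8 s) outranks the truck
    # closing at 2 m/s from 30 m (15 s).
    print(rank_by_hazard([("truck", 30.0, 2.0), ("car", 40.0, 5.0)]))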


            5.1 Visual Features



  Feature extraction is discussed here in an exemplary fashion for road scenes only. Other domains may require different feature sets; however, edge and corner features are very robust under a wide range of lighting and aspect conditions in many domains. Additional feature types are gray-value or color blobs, certain intensity or color patterns, and textures. The latter cover a wide range and are, in general, very computation-intensive.
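  As an illustration of how edge and corner features can be told apart robustly, the sketch below classifies pixels with a Harris-style structure tensor. This is a generic textbook operator, not the road-scene algorithm of Section 5.2; the constants k and thresh are illustrative assumptions for intensities scaled to [0, 1].

    import numpy as np

    def classify_edges_corners(img: np.ndarray, k: float = 0.05, thresh: float = 1e-4):
        """Label each pixel 'corner', 'edge', or 'flat' via the structure tensor."""
        gy, gx = np.gradient(img.astype(float))

        def box3(a):  # crude 3x3 box filter for local averaging
            p = np.pad(a, 1, mode="edge")
            return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                       for i in range(3) for j in range(3)) / 9.0

        jxx, jyy, jxy = box3(gx * gx), box3(gy * gy), box3(gx * gy)
        det, trace = jxx * jyy - jxy ** 2, jxx + jyy
        r = det - k * trace ** 2           # Harris corner response
        labels = np.full(img.shape, "flat", dtype=object)
        labels[r < -thresh] = "edge"       # one dominant gradient direction
        labels[r > thresh] = "corner"      # two significant directions
        return labels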
  In biological vertebrate vision, edge features of different sizes and orientations are extracted in one of the first stages of visual processing (in V1 [Hubel and Wiesel 1962]). Many algorithms are available for extracting these features (see [Duda, Hart 1973; Ballard, Brown 1982; Canny 1983; http://iris.usc.edu/Vision-Notes/bibliography/contents.html]). A very efficient algorithm especially suited for road-scene analysis was developed by Kuhnert (1988) and Mysliwetz (1990). Search directions or patterns are also important for efficient feature extraction. A version of this well-proven algorithm, the workhorse of the 4-D approach over two decades, will be discussed in detail in Section 5.2. Computing power in the 1980s did not allow more computation-intensive features in real-time applications. Now that four orders of magnitude in computing power per microprocessor have been gained and are readily available, a more general feature extraction method dubbed “UBM”, the basic layout of which was developed by Hofmann (2004) and the author, will be discussed in Section 5.3. It unifies the extraction of the following features in a single pass: nonplanar regions of the image intensity function, linearly shaded blobs, edges of any orientation, and corners.
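  The sketch below illustrates only the underlying idea of such a unified single pass, not UBM itself: a least-squares planar fit to each local image patch yields a gradient and a fit residual, and these two quantities together separate homogeneous blobs, linearly shaded regions and oriented edges, and nonplanar regions (corner or texture candidates). The patch size and both thresholds are illustrative assumptions.

    import numpy as np

    def classify_patch(patch: np.ndarray, g_min: float = 2.0, res_max: float = 4.0) -> str:
        """Classify one n-by-n intensity patch by a planar fit I(x, y) ~ a + b*x + c*y."""
        n = patch.shape[0]
        y, x = np.mgrid[0:n, 0:n]
        A = np.column_stack([np.ones(n * n), x.ravel(), y.ravel()])
        coeffs, *_ = np.linalg.lstsq(A, patch.astype(float).ravel(), rcond=None)
        a, b, c = coeffs                   # a: mean level, (b, c): intensity gradient
        residual = np.sqrt(np.mean((A @ coeffs - patch.ravel()) ** 2))
        if residual > res_max:
            return "nonplanar"             # candidate corner or fine texture
        if np.hypot(b, c) > g_min:
            # orientation of the intensity ramp, e.g., across an edge
            return f"shaded/edge at {np.degrees(np.arctan2(c, b)):.0f} deg"
        return "homogeneous blob"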