  The results of all of these single-object recognition processes have to be presented to the situation assessment level in unified form so that relative motion between objects and movements of subjects can be appreciated on a larger spatial and temporal scale. The dynamic object database (DOB) solves this task. On the situation level, working on huge volumes of image data is no longer possible; therefore, the DOB also serves to represent the recognized scene in an object-oriented, symbolic way. Figure 5.1 shows the three levels of image sequence processing and understanding. The results of the right-hand branch of level 1 are fed into this scheme to provide background information on lighting and other environmental conditions.
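  To make the object-oriented, symbolic representation concrete, the following minimal sketch shows what a single DOB entry might look like; the field names and layout are illustrative assumptions, not the actual DOB implementation. The essential point is that only compact state estimates and short state histories, rather than raw image data, are handed up to the situation assessment level.

    from dataclasses import dataclass, field

    @dataclass
    class DOBEntry:
        """One symbolic entry of a dynamic object database (illustrative sketch)."""
        object_id: int        # unique handle for tracking over time
        object_class: str     # e.g., "car", "lane marking", "pedestrian"
        timestamp: float      # time of the last state update [s]
        state: list           # estimated state, e.g., [x, y, heading, speed]
        covariance: list      # uncertainty of the state estimate
        # recent state history, so relative motion can be judged
        # on a larger temporal scale
        trajectory: list = field(default_factory=list)

        def update(self, t: float, new_state: list, new_cov: list) -> None:
            """Archive the previous estimate and store the new one."""
            self.trajectory.append((self.timestamp, self.state))
            self.timestamp, self.state, self.covariance = t, new_state, new_cov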
  The situation to be assessed on the decision level has to include all of this as well as the trajectory planned for the subject body in the near future. Both safety aspects and mission goals have to be taken into account here; a selection has to be made between more and less relevant objects/subjects by judging hazard potentials from their trajectories and behaviors (a simple sketch of such a ranking follows below). This challenge will be discussed in Chapter 13. Figure 5.1, visualizing the stages mentioned for visual dynamic scene interpretation, will be discussed in more detail after the foundations for feature extraction and object/subject recognition have been laid.
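  How hazard potentials can be judged from trajectories and behaviors is the subject of Chapter 13; purely as an illustration of such a ranking, the sketch below orders objects by time to collision, one common proxy for hazard. The function names and the choice of measure are assumptions made for this example, not the method of Chapter 13.

    def time_to_collision(range_m: float, closing_speed_mps: float) -> float:
        """Seconds until contact; infinite if the gap is opening."""
        if closing_speed_mps <= 0.0:
            return float("inf")
        return range_m / closing_speed_mps

    def rank_by_hazard(objects):
        """Sort (label, range_m, closing_speed_mps) tuples, most hazardous first."""
        return sorted(objects, key=lambda o: time_to_collision(o[1], o[2]))

    # The car closing at 5 m/s from 40 m (8 s) outranks the truck
    # closing at 2 m/s from 30 m (15 s).
    print(rank_by_hazard([("truck", 30.0, 2.0), ("car", 40.0, 5.0)]))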


            5.1 Visual Features



  Feature extraction is discussed here in an exemplary fashion for road scenes only. Other domains may require different feature sets; however, edge and corner features are very robust under a wide range of lighting and aspect conditions in many domains. Additional feature types are gray-value or color blobs, certain intensity or color patterns, and textures. The latter cover a wide range and are, in general, very computation-intensive.
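  As an illustration of how edge and corner features can be told apart robustly, the sketch below classifies pixels with a Harris-style structure tensor. This is a generic textbook operator, not the road-scene algorithm of Section 5.2; the constants k and thresh are illustrative assumptions for intensities scaled to [0, 1].

    import numpy as np

    def classify_edges_corners(img: np.ndarray, k: float = 0.05, thresh: float = 1e-4):
        """Label each pixel 'corner', 'edge', or 'flat' via the structure tensor."""
        gy, gx = np.gradient(img.astype(float))

        def box3(a):  # crude 3x3 box filter for local averaging
            p = np.pad(a, 1, mode="edge")
            return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                       for i in range(3) for j in range(3)) / 9.0

        jxx, jyy, jxy = box3(gx * gx), box3(gy * gy), box3(gx * gy)
        det, trace = jxx * jyy - jxy ** 2, jxx + jyy
        r = det - k * trace ** 2           # Harris corner response
        labels = np.full(img.shape, "flat", dtype=object)
        labels[r < -thresh] = "edge"       # one dominant gradient direction
        labels[r > thresh] = "corner"      # two significant directions
        return labels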
  In biological vertebrate vision, edge features of different sizes and orientations are extracted in one of the first stages of visual processing (in V1 [Hubel and Wiesel 1962]). Many algorithms are available for extracting these features (see [Duda, Hart 1973; Ballard, Brown 1982; Canny 1983; http://iris.usc.edu/Vision-Notes/bibliography/contents.html]). A very efficient algorithm especially suited for road-scene analysis was developed by Kuhnert (1988) and Mysliwetz (1990). Search directions or patterns are also important for efficient feature extraction. A version of this well-proven algorithm, the workhorse of the 4-D approach over two decades, will be discussed in detail in Section 5.2. Computing power in the 1980s did not allow more computation-intensive features in real-time applications. Now that four orders of magnitude in computing power per microprocessor have been gained and are readily available, a more general feature extraction method dubbed “UBM”, the basic layout of which was developed by Hofmann (2004) and the author, will be discussed in Section 5.3. It unifies the extraction of the following features in a single pass: nonplanar regions of the image intensity function, linearly shaded blobs, edges of any orientation, and corners.
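  The sketch below illustrates only the underlying idea of such a unified single pass, not UBM itself: a least-squares planar fit to each local image patch yields a gradient and a fit residual, and these two quantities together separate homogeneous blobs, linearly shaded regions and oriented edges, and nonplanar regions (corner or texture candidates). The patch size and both thresholds are illustrative assumptions.

    import numpy as np

    def classify_patch(patch: np.ndarray, g_min: float = 2.0, res_max: float = 4.0) -> str:
        """Classify one n-by-n intensity patch by a planar fit I(x, y) ~ a + b*x + c*y."""
        n = patch.shape[0]
        y, x = np.mgrid[0:n, 0:n]
        A = np.column_stack([np.ones(n * n), x.ravel(), y.ravel()])
        coeffs, *_ = np.linalg.lstsq(A, patch.astype(float).ravel(), rcond=None)
        a, b, c = coeffs                   # a: mean level, (b, c): intensity gradient
        residual = np.sqrt(np.mean((A @ coeffs - patch.ravel()) ** 2))
        if residual > res_max:
            return "nonplanar"             # candidate corner or fine texture
        if np.hypot(b, c) > g_min:
            # orientation of the intensity ramp, e.g., across an edge
            return f"shaded/edge at {np.degrees(np.arctan2(c, b)):.0f} deg"
        return "homogeneous blob"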