Page 132 - Dynamic Vision for Perception and Control of Motion
4 Application Domains, Missions, and Situations
Photometric appearance (Appendix A.4.4) can help, in connection with the aspect
conditions, to identify the proper hypothesis. Intensity and color shading as
well as high resolution in texture discrimination contribute positively to eliminat-
ing false object hypotheses. Computing power and algorithms are becoming avail-
able now for using these region-based features efficiently. The last four sections
discussed are concerned with single-object (vehicle) recognition based on image
sequence analysis. In our approach, this is done by specialist processes for certain
object classes (roads and lanes, other vehicles, landmarks, etc.).
When it comes to understanding the semantics of processes observed, the func-
tionality aspects (Appendix A.4.5) prevail. For proper recognition, observations
have to be based on spatially and temporally more extended representations. Trying
to do this with data-intensive images is not yet possible and may not even be
desirable in the long run, for reasons of data efficiency and the corresponding
delay times involved. For this reason, the results of perceiving single objects (subjects) “here and
now” directly from image sequence analysis with spatiotemporal models are col-
lected in a “dynamic object database” (DOB) in symbolic form. Objects and sub-
jects are represented as members of special classes with an identification number,
their time of appearance, and their relative state defined by homogeneous coordi-
nates, as discussed in Section 2.1.1. Together with the algorithms for homogeneous
coordinate transformations and shape computation, this represents a very compact
but precise state and shape description. Data volumes required are decreased by
two to three orders of magnitude (KB instead of MB). Time histories of state vari-
ables are thus manageable for several (the most important) objects/subjects ob-
served.
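The symbolic representation described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the names `DobEntry`, `planar_pose`, and `transform_point` are hypothetical, the ring-buffer length is an arbitrary choice, and the relative state is simplified to a planar pose (a 3x3 homogeneous matrix) rather than the full spatial transformation of Section 2.1.1.

```python
import math
from collections import deque
from dataclasses import dataclass, field


def planar_pose(x: float, y: float, yaw: float):
    """3x3 homogeneous matrix for a planar pose (assumed simplification)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, x],
            [s, c, y],
            [0.0, 0.0, 1.0]]


def transform_point(pose, px: float, py: float):
    """Map a point from the object's frame into the observer's frame."""
    hx, hy, hw = (row[0] * px + row[1] * py + row[2] for row in pose)
    return hx / hw, hy / hw


@dataclass
class DobEntry:
    """One object/subject in a hypothetical dynamic object database."""
    object_id: int              # identification number
    object_class: str           # e.g. "vehicle", "landmark"
    time_of_appearance: float   # time of first detection, in seconds
    history_length: int = 100   # ring-buffer capacity (arbitrary assumption)
    history: deque = field(init=False, repr=False)

    def __post_init__(self):
        # A bounded deque acts as a ring buffer: the oldest sample is
        # dropped automatically once the capacity is reached.
        self.history = deque(maxlen=self.history_length)

    def update(self, t: float, pose) -> None:
        """Store the relative state estimated at time t."""
        self.history.append((t, pose))

    def latest(self):
        """Return the most recent (time, pose) sample."""
        return self.history[-1]
```

Each stored sample here is a timestamp plus nine numbers, which hints at why time histories of a few objects stay small compared with the image sequences they were derived from.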
For subjects, this allows recognizing and understanding maneuvers and behaviors
of which members of this subject class are known to be capable (Appendix A.4.6).
Explicit representations of perceptual and behavioral capabilities of
subjects are a precondition for this performance level. Tables 3.1 and 3.3 list the
most essential capabilities and behavioral modes needed for road traffic partici-
pants. Based on data in the ring-buffer of the DOB for each subject observed, this
background knowledge now allows guessing the intentions of the other subject.
This qualitatively new information may additionally be stored in special slots of
the subject’s representation. Extended observations and comparisons to standards
for decision making and behavior realization now allow attributing additional
characteristic properties to the subject observed. Together with the methods
available for predicting movements into the future (fast-in-advance simulation),
this allows predicting the likely movements of the other subject; both results can
be compared and assessed for dangerous situations encountered. Thus, real-time
vision as advocated here is an animation process with several individuals based on
previous (actual) observations and inferences from a knowledge base of their
intentions (expected behavior).
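The idea of predicting movements from stored time histories and assessing the result for danger can be sketched as follows. This is only an illustration under strong assumptions: it replaces the dynamical models used for fast-in-advance simulation with a plain constant-velocity extrapolation, it works on hypothetical `(t, x, y)` samples, and the function names are invented for this example.

```python
import math


def estimate_velocity(history):
    """Finite-difference velocity from the two most recent (t, x, y) samples."""
    (t0, x0, y0), (t1, x1, y1) = history[-2], history[-1]
    dt = t1 - t0
    return (x1 - x0) / dt, (y1 - y0) / dt


def predict_path(history, horizon: float, step: float):
    """Roll the latest state forward, assuming constant velocity."""
    vx, vy = estimate_velocity(history)
    t_last, x, y = history[-1]
    n = int(round(horizon / step))
    return [(t_last + k * step, x + vx * k * step, y + vy * k * step)
            for k in range(1, n + 1)]


def min_separation(path_a, path_b):
    """Smallest distance between two paths sampled at the same instants."""
    return min(math.hypot(xa - xb, ya - yb)
               for (_, xa, ya), (_, xb, yb) in zip(path_a, path_b))


# Own vehicle closing in on a slower subject ahead in the same lane.
own = [(0.0, 0.0, 0.0), (0.1, 2.0, 0.0)]      # about 20 m/s
other = [(0.0, 30.0, 0.0), (0.1, 31.0, 0.0)]  # about 10 m/s
gap = min_separation(predict_path(own, 2.0, 0.1),
                     predict_path(other, 2.0, 0.1))
```

A situation assessment could then compare `gap` against a safety margin chosen for the task at hand; any such threshold is an assumption of this sketch and is not given in the text.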
This demanding process cannot be performed for all subjects in sight but is con-
fined to the most relevant ones nearby. Selecting and perceiving these most rele-
vant subjects correctly and focusing attention on them is one of the decisive tasks
to be performed continuously. The judgment as to which subject is most relevant
also depends on the task to be performed. When just cruising with ample time available,
the situation is different from the same cruising state in the leftmost of three lanes,

