Page 26 - Dynamic Vision for Perception and Control of Motion
of its differential geometry description in curvature terms. Depending on the task at
hand, either the differential or the integral representation, or a combination of both,
may be used for visual recognition. As will be shown for the example of road vehicle
guidance, the parallel use of these models in different parts of the overall recognition
process and control system may be most efficient.
1.4.2.2 To Which Units Do Humans Affix Knowledge?
Objects and object classes play an important role in human language and in learning
to understand “the world”. This is true for their appearance at one time, and
also for their motion behavior over time.
On the temporal axis, the combined use of differential and integral models may
allow us to refrain from computing optical flow or displacement vector fields,
which are very compute-intensive and susceptible to noise. Given the huge amount of
data in a single image, computing such fields is not considered the best way to go.
An early transition to the notion of physical objects or subjects with continuity
conditions in 3-D space and time has several advantages: (1) it helps cut the amount of
data required for adequate description, and (2) it yields the proper framework for
applying knowledge derived from previous encounters (dynamic models, stereotypical
control maneuvers, etc.). For this reason, the second column in Figure 1.2 is
avoided intentionally in the 4-D approach. This step is replaced by the well-known
observer techniques of systems dynamics (the Kalman filter and its derivatives,
Luenberger observers). These recursive methods reconstruct the time derivatives of
state variables from prediction-error feedback and knowledge about the dynamic
behavior of the object and, for the Kalman filter, about the statistical properties of
the system (dubbed “plant” in systems dynamics) and of the measurement processes. The
stereotypical behavioral capabilities of subjects in different situations form an
important part of the knowledge base.
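
The observer idea can be illustrated with a minimal sketch. It assumes a hypothetical one-dimensional object moving at constant velocity, position-only measurements, a 40 ms cycle, and fixed, hand-picked feedback gains (`alpha`, `beta`); none of these names or values come from the text. With fixed gains this is a simple Luenberger-type observer; a Kalman filter would instead compute the gains from the assumed noise statistics of plant and measurement. The point is only that velocity, which is never measured, is reconstructed from the prediction error.

```python
# Fixed-gain observer sketch (illustrative assumptions, not the book's implementation).
# State: position and velocity. Measurement: position only.

T = 0.04  # assumed cycle time in seconds (40 ms video cycle)


def observe(measurements, T, alpha=0.5, beta=0.2):
    """Reconstruct position and velocity from position measurements.

    alpha and beta are hypothetical, hand-picked feedback gains.
    """
    pos, vel = 0.0, 0.0  # initial estimate, deliberately wrong in velocity
    for z in measurements:
        # Prediction with the dynamic model (constant velocity over one cycle).
        pos_pred = pos + T * vel
        vel_pred = vel
        # Prediction error between measurement and expectation.
        residual = z - pos_pred
        # Feedback of the prediction error corrects both state components.
        pos = pos_pred + alpha * residual
        vel = vel_pred + (beta / T) * residual
    return pos, vel


# Noise-free positions of an object moving at 2.0 units per second.
true_velocity = 2.0
measurements = [true_velocity * k * T for k in range(1, 201)]
pos_est, vel_est = observe(measurements, T)
print(pos_est, vel_est)
```

In this noise-free case the velocity estimate converges to the true value even though no image-to-image differencing is performed; with noisy measurements the gains trade responsiveness against smoothing.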
Two distinctly different types of “local temporal integrals” are used widely:
single-step integrals for video sampling and multiple-step (local) integrals for
maneuver understanding. Through the imaging process, the analog motion process in
the real world is made discrete along the time axis. By forming the (approximate,
since linearized) integrals, the time span of one analog video cycle (33 1/3 ms in
the United States and 40 ms in Europe; half these values for the fields) is bridged
by discrete transition matrices from kT to (k + 1)T, where k is the running index
and T the cycle time.
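
In symbols, the single-step integral over one video cycle can be sketched as follows. The notation is chosen here for illustration and is not taken from the text: x is a generic state vector, F a locally linearized system matrix (the Jacobian of the motion model), and \Phi(T) the resulting transition matrix.

```latex
\dot{x}(t) \approx F\,x(t)
\quad\Longrightarrow\quad
x\big((k+1)T\big) \approx \Phi(T)\,x(kT),
\qquad
\Phi(T) = e^{FT} = I + FT + \tfrac{1}{2}(FT)^2 + \dots
```

For a frame period of T = 40 ms the field period is 20 ms; for 33 1/3 ms it is 16 2/3 ms.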
Even though the intensity value of each pixel is an integral over all or part of
this period, it is interpreted as the intensity actually sampled at the time of
camera readout. Since all basic interpretations of the situation rest on these data,
a new control output is computed only once per period; thus, it is constant over the
basic cycle time. This allows the analytical computation of the corresponding state
transitions, which are evaluated numerically for each cycle in the recursive
estimation process (Chapter 6); these are used for state prediction and intelligent
control of image feature extraction.
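
A constant control output over the cycle is what makes the state transition available in closed form. The sketch below assumes a generic linear model x' = F x + G u with u held constant over T, and evaluates the transition matrix and the input vector by a truncated power series. The function name `discretize`, the double-integrator example, and the 40 ms cycle are illustrative assumptions, not the formulation of Chapter 6.

```python
# State transition over one cycle with control held constant (illustrative sketch).
# Pure Python; small helper functions stand in for a matrix library.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(a, s):
    return [[s * x for x in row] for row in a]


def discretize(F, G, T, terms=20):
    """Return (Phi, Gamma) such that x[k+1] = Phi x[k] + Gamma u[k].

    Phi   = sum_i (F T)^i / i!
    Gamma = (sum_i F^i T^(i+1) / (i+1)!) G
    Both series are truncated after `terms` terms.
    """
    n = len(F)
    phi = identity(n)
    integral = mat_scale(identity(n), T)  # i = 0 term of the integral
    term = identity(n)                    # holds (F T)^i / i!
    for i in range(1, terms):
        term = mat_scale(mat_mul(term, F), T / i)
        phi = mat_add(phi, term)
        integral = mat_add(integral, mat_scale(term, T / (i + 1)))
    return phi, mat_mul(integral, G)


# Assumed example: double integrator (position, velocity) driven by acceleration u.
F = [[0.0, 1.0], [0.0, 0.0]]
G = [[0.0], [1.0]]
T = 0.04  # assumed 40 ms cycle
phi, gamma = discretize(F, G, T)
print(phi)    # transition matrix from kT to (k + 1)T
print(gamma)  # effect of the constant control over one cycle
```

For the double integrator the series terminates after the linear term, so the result matches the closed form: Phi has the entries 1, T, 0, 1 and Gamma the entries T²/2 and T. For models without such structure the same matrices are simply re-evaluated numerically each cycle.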