Looking at 2-D data arrays generated by several hundred thousand sensor elements, come up with a distribution of objects in the real world and of their relative motion. The sensor elements are usually arranged in a uniform array on the chip. On board vehicles, it cannot be assumed that the sensor orientation is known beforehand or even that it is stationary. However, inertial sensors for linear acceleration components and rotational rates are available for sensing ego-motion.
It is immediately clear that knowledge about object classes and the way their
visible features are mapped into the image plane is of great importance for image
sequence understanding. These objects may be grouped into classes with similar functionality and/or appearance. The body of the vehicle carrying the sensors and providing the means for locomotion is, of course, of utmost importance; this lengthy description will be abbreviated by the term the "own" body. To understand its motion directly and independently of vision, signals from other sensors such as odometers, inertial angular rate sensors, and linear accelerometers, as well as GPS (the "Global Positioning System," which provides geographic coordinates), are widely used.
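As a rough illustration of how such non-visual signals yield an ego-motion estimate independently of vision, the following Python sketch dead-reckons a planar pose from odometer speed and an inertial yaw rate; the function name, interface, and sampling values are illustrative assumptions, not taken from the text.

import math

def dead_reckon(pose_init, samples, dt):
    """Integrate planar ego-motion from odometer speed and inertial yaw rate.

    samples: iterable of (speed_m_per_s, yaw_rate_rad_per_s) pairs, one per
    time step of length dt (hypothetical interface, for illustration only).
    Returns the (x, y, heading) pose after each step.
    """
    x, y, psi = pose_init
    trajectory = []
    for v, psi_dot in samples:
        psi += psi_dot * dt            # integrate yaw rate into heading
        x += v * math.cos(psi) * dt    # project speed onto world axes
        y += v * math.sin(psi) * dt
        trajectory.append((x, y, psi))
    return trajectory

# Example: 10 m/s with a gentle 0.05 rad/s turn for 2 s, sampled at 100 Hz
path = dead_reckon((0.0, 0.0, 0.0), [(10.0, 0.05)] * 200, dt=0.01)
print(path[-1])  # final pose after the maneuver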
Image data points carry no direct information on the distance to the light sources in the real world that stimulated the sensor signal; the third dimension (range) is completely lost in a single image (except, perhaps, for intensity attenuation over longer distances). In addition, since perturbations may invalidate the information content of a single pixel almost completely, useful image features consist of signals from groups of sensor elements, where local perturbations tend to be leveled out. In biological systems, these are the receptive fields; in technical systems, these are evaluation masks of various sizes. This now allows a more precise statement of the vision task:
By looking at the responses of feature extraction algorithms, try to find objects and subjects in the real world and their state relative to the own body. When knowledge about motion characteristics or typical behaviors is available, exploit it to achieve better results and deeper understanding by filtering the measurement data over time.
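To make the notion of evaluation masks above concrete, here is a minimal Python sketch of a box-average response computed over a group of pixels, which levels out single-pixel perturbations; the mask size and the test patch are illustrative assumptions, not the specific operators used in the book.

def mask_response(image, cx, cy, half):
    """Average gray values over a (2*half+1)^2 neighborhood around (cx, cy).

    A single corrupted pixel contributes only 1/(2*half+1)^2 of the
    response, so local perturbations tend to be leveled out.
    """
    total, count = 0.0, 0
    for r in range(cy - half, cy + half + 1):
        for c in range(cx - half, cx + half + 1):
            total += image[r][c]
            count += 1
    return total / count

# A 5x5 patch of uniform intensity 100 with one corrupted pixel:
patch = [[100] * 5 for _ in range(5)]
patch[2][2] = 255                      # single-pixel perturbation
print(patch[2][2])                     # raw pixel value: 255 (misleading)
print(mask_response(patch, 2, 2, 2))   # mask response: 106.2 (near 100)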
For simple massive objects (e.g., a stone, or our sun and moon) and for man-made vehicles, good "dynamic models" describing motion constraints are very often known; a minimal sketch of how such a model supports filtering over time is given after the list below. To describe relative or absolute motion of objects precisely, suitable reference coordinate systems have to be introduced. Given the wide range of spatial scales accessible by vision, certain scales of representation are advantageous:
- Sensor elements have dimensions in the micrometer range (µm).
- Humans operate directly in the meter (m) range: reaching space, a single step (body size).
- For projectiles and fast vehicles, the range of immediate reactions extends to several hundred meters or kilometers (km).
- Missions may span several hundred to thousands of kilometers, even one-third to one-half of the way around the globe in direct flight.
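To illustrate how a dynamic model supports filtering measurement data over time, as mentioned above, the following sketch combines a scalar constant-velocity motion model with a Kalman-style predict/update cycle; the noise parameters and the measurement sequence are illustrative assumptions, not values from the text, and the sketch is far simpler than the full recursive estimation treated later.

import numpy as np

def cv_kalman(z_seq, dt, sigma_q=0.5, sigma_r=2.0):
    """Track range and range rate from noisy range measurements.

    The constant-velocity dynamic model x_{k+1} = F x_k encodes the motion
    constraint; the update step blends prediction and measurement.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])            # state transition
    H = np.array([[1.0, 0.0]])                       # measure range only
    Q = sigma_q**2 * np.array([[dt**4 / 4, dt**3 / 2],
                               [dt**3 / 2, dt**2]])  # process noise
    R = np.array([[sigma_r**2]])                     # measurement noise
    x = np.array([[z_seq[0]], [0.0]])                # initial state guess
    P = np.eye(2) * 10.0                             # initial uncertainty
    estimates = []
    for z in z_seq[1:]:
        x = F @ x                                    # predict via dynamic model
        P = F @ P @ F.T + Q
        y = z - (H @ x)[0, 0]                        # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
        x = x + K * y                                # blend in measurement
        P = (np.eye(2) - K @ H) @ P
        estimates.append((x[0, 0], x[1, 0]))
    return estimates

# Example: object receding at 2 m/s, range sampled at 10 Hz
z = [30.0 + 2.0 * 0.1 * k for k in range(50)]
print(cv_kalman(z, dt=0.1)[-1])   # estimated (range, range rate)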