Page 30 - Dynamic Vision for Perception and Control of Motion
of the strong telecamera. Here, letters on the license plate can be read, and it can be
seen from the clearly visible second rearview mirror on the left-hand side that there
is a second car immediately in front of the car ahead. The number of pixels per
area on the same object in this image is one hundred times that of the wide-angle
images.
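The factor of one hundred in pixels per object area follows directly from the lens geometry: the image of an object scales linearly with focal length in each axis, so pixel density on the object scales with the focal length squared. A short sketch (the focal-length values are illustrative, not from the text):

```python
# Pixel density on a target scales with the square of the focal length,
# since doubling f doubles the object's image size in both axes.
# A 100x gain in pixels per object area thus implies a ~10x longer
# focal length for the tele camera relative to the wide-angle lens.

def pixel_area_ratio(f_tele: float, f_wide: float) -> float:
    """Ratio of pixels per object area between two lenses (same pixel pitch)."""
    return (f_tele / f_wide) ** 2

# Example: a 50 mm tele lens vs. a 5 mm wide-angle lens (hypothetical values)
ratio = pixel_area_ratio(50.0, 5.0)
print(ratio)  # 100.0
```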
For inertial stabilization of the viewing direction when driving over a nonsmooth
surface, or for aircraft flying in turbulent air, an active camera suspension is
needed anyway. The simultaneous use of almost delay-free inertial measurements
(time derivatives such as angular rates and linear accelerations) and of images,
whose interpretation introduces several tenths of a second of delay, requires
extended representations along the time axis. There is no single point in time at
which all available data can be interpreted consistently. Only the notion of an
"extended presence" allows arriving at an efficient invariant interpretation (in 4-D!).
For this reason, the multifocal, saccadic vision system is considered to be the pref-
erable solution for autonomous vehicles in general.
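The "extended presence" idea can be illustrated with a minimal sketch: keep a short buffer of states propagated from the nearly delay-free inertial rates, and when a vision result arrives with several tenths of a second of latency, correct the buffered state at the image timestamp and carry that correction forward to the present. All class and parameter names below are illustrative assumptions, not the book's implementation:

```python
from collections import deque

class ExtendedPresence:
    """Illustrative sketch of delay compensation over an 'extended presence':
    inertial rates are integrated with negligible delay; delayed vision
    measurements correct the past state and are re-propagated to 'now'."""

    def __init__(self, horizon=0.5):
        self.horizon = horizon      # seconds of history to retain
        self.buffer = deque()       # (time, angle) pairs

    def propagate(self, t, rate, dt):
        """Integrate an angular rate (rad/s) forward by dt seconds."""
        angle = self.buffer[-1][1] if self.buffer else 0.0
        angle += rate * dt
        self.buffer.append((t, angle))
        while self.buffer and t - self.buffer[0][0] > self.horizon:
            self.buffer.popleft()   # drop states older than the horizon

    def vision_update(self, t_img, measured_angle, gain=0.5):
        """Correct the buffered state nearest the image timestamp and
        apply the same correction to all later states, including 'now'."""
        past = min(self.buffer, key=lambda s: abs(s[0] - t_img))
        correction = gain * (measured_angle - past[1])
        self.buffer = deque((t, a + correction if t >= past[0] else a)
                            for t, a in self.buffer)

    def current(self):
        return self.buffer[-1][1]
```

With this structure, the delayed image never has to be interpreted "at the present time"; it is interpreted where it belongs on the time axis, and only its correction reaches the present.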
1.6 Influence of the Material Substrate on System Design:
Technical vs. Biological Systems
Biological vision systems have evolved over millions of generations through selection
of the fittest for the ecological environment encountered. The resulting neural
substrate (carbon-based) may be characterized by a few numbers. The
electrochemical units have switching times in the millisecond (ms) range; the
traveling speed of signals is in the 10 to 100 m/s range. Cross-connections between
units exist in abundance (1000 to 10,000 per neuron). A single brain consists of up
to 10^11 of these units. The main processing step is summation of the weighted input
signals, which involve as yet unknown (multiple?) feedback loops [Handbook of
Physiology 1984, 1987].
These systems need long learning times and adapt to new situations only slowly.
In contrast, technical substrates for sensors and microprocessors (silicon-based)
have switching times in the nanosecond range (a factor of 10^6 faster than biological
systems). They are easily programmable and have various computational
modes between which they can switch almost instantaneously; however, the direct
cross-connections to other units are limited in number (usually one to six) but may
have very high bandwidth (in the hundreds of MB/s range).
While a biological eye is a very complex unit containing several types and sizes
of sensors and computing elements, technical imaging sensors have so far been rather
simple and mostly homogeneous over the entire array area. However, from television
and computer graphics it is well known that humans can interpret the images
thus generated naturally and without problems if certain standards are maintained.
In developing dynamic machine vision, two schools of thought have formed:
one tries to mimic biological vision systems on the available silicon substrate, and
the other continues to build on the engineering platform developed in systems and
computer science.