Page 139 - Dynamic Vision for Perception and Control of Motion
5 Extraction of Visual Features
In Chapters 2 and 3, several essential relations among features appearing in images
and objects in the real world have been discussed. In addition, basic properties of
members of the classes “objects” and “subjects” have been touched upon to enable
efficient recognition from image sequences. Not only spatial shape but also motion
capabilities have been described as background for understanding image sequences
of high frequency (video rate). This complex task can be broken down into three
consecutive stages (levels), each requiring specialized knowledge with some over-
lap. Since the data streams required for analysis are quite different in these stages,
namely (1) whole images, (2) image regions, and (3) symbolic descriptions, they
should be organized in specific databases.
The first stage is to discover the following items in the entire fields of view (images): (a) which characteristic image parameters influence the interpretation of the image stream, and (b) where are the regions of special interest in the images?
The answer to question (a) has to be determined in order to tap background knowledge that allows a deeper understanding of the answers found under (b). Typical questions to be answered within complex (a) are: (1) What are the lowest and the highest image intensities found in each image? Here, it is not so much the value of a single pixel that is of interest [since it might be an outlier (a data error)], but that of small local groups of pixels, which can be trusted more. (2) What are the lowest and highest intensity gradients (again evaluated over receptive fields containing several pixels)? (3) Are these values drastically different in different parts of the images? Here, an indication of special image regions, such as 'above and below the horizon' or 'near a light source or farther away from it', may be of importance. (4) Are there large regions with approximately homogeneous color or texture distribution (representing areas in the world with specific vegetation, snow cover, etc.)? At what distance are they perceived?
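The robust extrema asked for in questions (1) to (3) can be sketched in a few lines. The following is a minimal pure-Python illustration, not the book's implementation: a 3×3 box mean serves as a stand-in for a small receptive field (so that a single outlier pixel cannot dominate the intensity extrema), and central differences on the smoothed field approximate the intensity gradient.

```python
def box_mean(img, r, c):
    """Mean intensity over the 3x3 neighborhood centered at (r, c).

    Averaging over a small local group of pixels damps isolated
    outliers (data errors), as demanded in question (1).
    """
    vals = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    return sum(vals) / 9.0


def image_statistics(img):
    """Return (min, max) smoothed intensity and (min, max) gradient magnitude.

    `img` is a list of equal-length rows of gray values; borders are
    skipped so every receptive field lies fully inside the image.
    """
    h, w = len(img), len(img[0])
    # Smoothed intensity field (receptive-field responses, not raw pixels).
    smoothed = [[box_mean(img, r, c) for c in range(1, w - 1)]
                for r in range(1, h - 1)]
    flat = [v for row in smoothed for v in row]

    # Central-difference gradient magnitude on the smoothed field.
    grads = []
    for r in range(1, len(smoothed) - 1):
        for c in range(1, len(smoothed[0]) - 1):
            gx = (smoothed[r][c + 1] - smoothed[r][c - 1]) / 2.0
            gy = (smoothed[r + 1][c] - smoothed[r - 1][c]) / 2.0
            grads.append((gx * gx + gy * gy) ** 0.5)

    return min(flat), max(flat), min(grads), max(grads)
```

On a dark test image containing one saturated outlier pixel (value 255), the smoothed maximum comes out near 255/9 rather than 255, showing why small pixel groups are trusted more than single pixels. Question (3), whether the extrema differ drastically across the image, would then be answered by evaluating these statistics per image region rather than globally.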
Usually, the answer to (b) will show up in collections of certain features. Which
features are good indicators for objects of interest is, of course, domain specific.
Therefore, the knowledge base for stage 1 concentrates on types and classes of
image features for certain task domains and environmental conditions; this will be
treated in Section 5.1.
At this level, only feature data are to be computed as background material for
the higher levels, which try to associate environmental aspects with these data sets
by also referring to the mission performed and to knowledge about the environ-
ment, taking time of day and year into account.
In the second stage, the questions asked are: 'What type of object is it that generates the feature set detected?' and 'What is its relative state at the present time?' Of course, this can be answered for only one object/subject at a time by a single in-

