            Local differential quantities such as spatial or temporal change rates, spatial
            gradients, or directions of extreme values such as intensity gradients are typical
            examples.
              These differentials have proven to be powerful concepts for representing
            knowledge about the physical properties of classes of objects. Differential
            equations represent the natural mathematical element for coding knowledge about
            motion processes in the real world. With the advent of the Kalman filter
            [Kalman 1960], they have become the key element for obtaining the best state
            estimate of the variables describing the system, based on recursive methods
            implementing a least-squares model fit. Real-time visual perception of moving
            objects is hardly possible without this very efficient approach.
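
              As a concrete illustration of this recursive least-squares model fit, the
            following minimal Python sketch implements the predict/update cycle of a
            discrete-time Kalman filter for a simple constant-velocity motion model. All
            matrices and numerical values (F, H, Q, R, the sampling interval, the noise
            levels) are hypothetical textbook choices for illustration, not parameters of
            any specific system in this book.

                import numpy as np

                dt = 0.1                                  # sampling interval [s] (assumed)
                F = np.array([[1.0, dt],
                              [0.0, 1.0]])                # state transition: constant velocity
                H = np.array([[1.0, 0.0]])                # measurement: position only
                Q = np.diag([1e-4, 1e-2])                 # process noise covariance (assumed)
                R = np.array([[0.25]])                    # measurement noise covariance (assumed)

                def kalman_step(x, P, z):
                    """One recursive least-squares cycle: predict with the motion
                    model (the discretized differential equation), then correct
                    with the new measurement, weighted by the uncertainties."""
                    x_pred = F @ x                        # state prediction
                    P_pred = F @ P @ F.T + Q              # covariance prediction
                    S = H @ P_pred @ H.T + R              # innovation covariance
                    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
                    x_new = x_pred + K @ (z - H @ x_pred) # corrected state estimate
                    P_new = (np.eye(len(x)) - K @ H) @ P_pred
                    return x_new, P_new

                # Track an object moving at 1 m/s from noisy position readings.
                x, P = np.zeros((2, 1)), np.eye(2)
                rng = np.random.default_rng(0)
                for k in range(50):
                    z = np.array([[k * dt + rng.normal(0.0, 0.5)]])
                    x, P = kalman_step(x, P, z)
                print("estimated position and velocity:", x.ravel())

            The same two-step structure carries over to the higher-dimensional state
            vectors needed for visual perception of moving objects.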


            1.4.2 Local Integrals as Central Elements for Perception

            Note that the precise definition of what is local depends on the problem domain
            investigated and may vary over a wide range. The third column and row in Figure
            1.2 are devoted to “local integrals”; this term again is rather fuzzy and will be
            defined more precisely in the task context. On the timescale, it means the
            transition from analog (continuous, differential) to digital (sampled, discrete)
            representations. In the spatial domain, typical local integrals are rigid bodies,
            which may move as a unit without changing their 3-D shape.
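
              To make this notion concrete, the following small numpy sketch (with
            hypothetical corner points) moves a set of body-fixed points through a rotation
            and a translation; all pairwise distances, i.e., the 3-D shape, remain
            unchanged:

                import numpy as np

                theta = np.deg2rad(30.0)                  # rotation about the z-axis
                R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                              [np.sin(theta),  np.cos(theta), 0.0],
                              [0.0,            0.0,           1.0]])
                t = np.array([2.0, 0.5, 0.0])             # translation of the body

                body = np.array([[0.0, 0.0, 0.0],         # body-fixed points (illustrative)
                                 [1.0, 0.0, 0.0],
                                 [1.0, 2.0, 0.0],
                                 [0.0, 2.0, 1.0]])
                moved = body @ R.T + t                    # the whole body moves as one unit

                # Rigidity check: distances between points are preserved.
                print(np.linalg.norm(body[0] - body[2]),
                      np.linalg.norm(moved[0] - moved[2]))  # identical up to rounding
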
              These elements are defined such that the intersection in field (3, 3) of Figure
            1.2 becomes the central hub for data interpretation and data fusion: it contains
            the individual objects as units to which humans attach most of their knowledge
            about the real world. Abstraction of properties has led to generic classes that
            allow subsuming a large variety of single cases under one generic concept,
            thereby leading to representational efficiency.

            1.4.2.1 Where is the Information in an Image?
            It is well known that the information in an image is contained in local intensity
            changes: a uniformly gray image carries only a few bits of information, namely,
            (1) the gray value and (2) the fact that this value is distributed uniformly over
            the entire image. The image may be described completely by three bytes, even
            though the amount of data may be about 400 000 bytes in a TV frame or even 4 MB
            (2k × 2k pixels). If there are certain areas of uniform gray values, the boundary
            lines of these areas plus the internal gray values contain all the information in
            the image. Such an object in the image plane may be described with far less data
            than the pixel values it encompasses.
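
              This can be made tangible with a small numpy sketch (a hypothetical 512 × 512
            image with two uniform gray areas): the raw data volume is large, yet nonzero
            local intensity changes occur only along the single boundary, and two gray
            values plus that boundary position describe the image completely:

                import numpy as np

                img = np.full((512, 512), 80, dtype=np.uint8)
                img[:, 256:] = 160                        # second uniform gray area

                print("raw data volume:", img.size, "bytes")        # 262144 bytes

                # Local intensity changes: horizontal differences vanish everywhere
                # except along the boundary column between the two areas.
                dx = np.diff(img.astype(int), axis=1)
                print("pixels with nonzero change:", np.count_nonzero(dx))  # 512

                # A complete description: two gray values plus the boundary position.
                description = (80, 160, 256)
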
            In a more general form, image areas defined by a set of properties (shape,
            texture, color, joint motion, etc.) may be considered image objects, which
            originate from 3-D objects by perspective mapping. Because of the numerous
            aspect conditions that such an object may adopt relative to the camera, its
            potential appearances in the image plane are very diverse. An exhaustive
            description of these appearances would require orders of magnitude more data
            than the object's representation in 3-D space plus the laws of perspective
            mapping, which are the same for all objects. Therefore, an object is defined by
            its 3-D shape, which may be considered a local spatial integral.
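
              The economy of this representation can be illustrated with a minimal
            pinhole-projection sketch in numpy (all values hypothetical): one compact 3-D
            description of an object, here the corner points of a unit cube, yields quite
            different image-plane appearances under different aspect conditions, while the
            projection law itself stays the same:

                import numpy as np

                f = 0.008                                 # focal length [m] (assumed)

                def project(points):
                    """Central (perspective) projection of points in the camera frame."""
                    return f * points[:, :2] / points[:, 2:3]

                def rot_y(a):
                    """Rotation about the camera y-axis: a change of aspect conditions."""
                    c, s = np.cos(a), np.sin(a)
                    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

                # The compact 3-D object description: eight corners of a unit cube.
                cube = np.array([[x, y, z] for x in (0.0, 1.0)
                                 for y in (0.0, 1.0) for z in (0.0, 1.0)])

                # Same object, two poses: frontal and near vs. rotated and farther away.
                print(project(cube + [0.0, 0.0, 4.0]))
                print(project(cube @ rot_y(np.deg2rad(40.0)).T + [0.5, 0.0, 10.0]))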