


                            to a verification score only if their orientation is similar to the orientation of the
                            silhouette edge to which they are being compared. The principle here is that the
                            more detailed the description of the edge point, the more likely one is to know
                            whether it came from the object.
                                 It is a bad idea to include invisible silhouette components in the score, so
                            the rendering should be capable of removing hidden lines. The silhouette is used
                            because edges internal to a silhouette may have low contrast under a bad choice of
                            illumination. This means that their absence may be evidence about the illumination
                            rather than the presence or absence of the object.
                                 Edge proximity tests can be quite unreliable. Even orientation information
                            doesn’t really overcome these difficulties. When we project a set of model bound-
                            aries into an image, the absence of edges lying near these boundaries could well be
                            a quite reliable sign that the model isn’t there, but the presence of edges lying near
                            the boundaries is not a particularly reliable sign that the object is there. For exam-
                            ple, in textured regions, there are many edge points grouped together. This means
                            that, in highly textured regions, it is possible to get high verification scores for
                            almost any model at almost any pose (e.g., see Figure 12.7). Notice that counting
                            similarity in edge orientation in the verification score hasn’t made any difference
                            here.
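The proximity-and-orientation test just described can be put in code. The following is an illustrative sketch only: the array layouts, thresholds, and function name are assumptions, not a reference implementation.

```python
import numpy as np

def verification_score(model_points, edge_points, edge_orients,
                       dist_thresh=2.0, orient_thresh=np.deg2rad(15)):
    """Fraction of projected model boundary points lying near an image
    edge point of similar orientation.  A sketch: the thresholds and
    array layouts are illustrative assumptions.

    model_points : (N, 3) array of (x, y, theta) projected silhouette points
    edge_points  : (M, 2) array of detected edge locations
    edge_orients : (M,) array of edge orientations in radians
    """
    hits = 0
    for x, y, theta in model_points:
        d = np.hypot(edge_points[:, 0] - x, edge_points[:, 1] - y)
        near = d < dist_thresh
        if not near.any():
            continue  # no image edge nearby: this model point scores nothing
        # edge orientation is only defined up to pi, so fold the difference
        dtheta = np.abs(edge_orients[near] - theta) % np.pi
        dtheta = np.minimum(dtheta, np.pi - dtheta)
        if (dtheta < orient_thresh).any():
            hits += 1
    return hits / len(model_points)
```

The failure mode in the text shows up directly: in a densely textured region almost every model point finds some nearby edge point whose orientation roughly matches, so the score saturates for almost any pose.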
                                 We can tune the edge detector to smooth texture heavily, in the hope that
                            textured regions will disappear. This is a dodge, and a dangerous one, because it
                            usually affects the contrast sensitivity so that the objects disappear, too. However,
                            it can be made to work acceptably and is widely used.
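The trade-off can be seen in a small one-dimensional sketch (the signal and the scales are made up for illustration): heavy smoothing wipes out the gradient response in the textured region, but it weakens the response at the true object boundary as well.

```python
import numpy as np

def edge_strength(signal, sigma):
    """Gradient magnitude after Gaussian smoothing at scale sigma."""
    radius = int(4 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    return np.abs(np.gradient(np.convolve(signal, kernel, mode='same')))

x = np.arange(100)
signal = (x >= 65).astype(float)                         # the object boundary
signal += 0.4 * ((x // 2) % 2) * ((x >= 15) & (x < 45))  # fine texture

weak = edge_strength(signal, sigma=1)   # texture and boundary both respond
heavy = edge_strength(signal, sigma=4)  # texture gone, boundary much weaker
```

Comparing the maxima of `heavy` and `weak` inside the texture band (columns 25–35) against those around the step (columns 60–70) shows the texture response collapsing while the boundary response also drops by more than half.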

                     12.3 REGISTERING DEFORMABLE OBJECTS
                            There are many applications that require registering deformable objects. For ex-
                            ample, one might wish to register a neutral view of a face to a view displaying
                            some emotion; in this case, the deformation of the face might reveal the emotion
                            (Section 12.3.1). As another example, one might wish to register a medical image
                            of an organ to another image of the same organ (Section 12.3.3). As yet another
                            example, one might encode a family of shapes as one model shape and a family of
                            deformations. Notoriously, D’Arcy Thompson argued that side views of different
                            fish should be seen as deformations of one another (Thompson 1992).
                                 Generally, we have registered objects by a search process that looks for a
                            minimum of a cost function. This applies in rather a general way to deformable
                            objects, but we usually cannot use RANSAC, because we cannot estimate the
                            parameters with a subset of tokens. As a result, registration is usually much slower.
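The point about RANSAC can be made concrete with a toy deformation family (made up for illustration): the target is a template curve warped by an unknown amplitude and shift. No small subset of points pins down these parameters the way three correspondences pin down an affine pose, so instead we must search the whole cost surface, which is slow.

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100)

def deform(a, s):
    """Hypothetical deformation family: amplitude a, shift s."""
    return a * np.sin(x - s)

target = deform(1.7, 0.6)  # "observed" instance; parameters unknown to us

def cost(a, s):
    return np.sum((deform(a, s) - target) ** 2)

# Exhaustive search over a parameter grid -- robust but slow, which is
# exactly the trade-off described in the text.
a_grid = np.linspace(0.5, 3.0, 251)
s_grid = np.linspace(0.0, 1.5, 151)
costs = np.array([[cost(a, s) for s in s_grid] for a in a_grid])
i, j = np.unravel_index(np.argmin(costs), costs.shape)
best_a, best_s = a_grid[i], s_grid[j]
```

In practice the grid search would be replaced by a continuous optimizer, but the structure is the same: one global cost, minimized over all deformation parameters at once.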

                     12.3.1 Deforming Texture with Active Appearance Models
                            An important case is matching face images to one another, despite deformations
                            of the face, changes in head angle, and so on. In this case, the texture on the face
                            is an important cue driving the match. Cootes et al. (2001) model the face as a
                            plane mesh of triangles, as in Figure 12.8. Now assume that this mesh is placed
                             over an image of a face. If we knew the configuration of the mesh vertices in a
                             neutral frontal view
                            of the face, we could generate the intensity field for that view. Call the original
                             image I_o. For the moment, we will assume there is just one triangle in the mesh.
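For a single triangle, the warp the mesh induces can be written with barycentric coordinates. The sketch below (the function names are mine, not from Cootes et al.) carries a point from the neutral triangle to a deformed one by holding its barycentric coordinates fixed.

```python
import numpy as np

def barycentric(p, tri):
    """Barycentric coordinates of point p with respect to the triangle
    whose vertices are the rows of tri (a 3x2 array)."""
    a, b, c = tri
    T = np.column_stack((b - a, c - a))
    u, v = np.linalg.solve(T, np.asarray(p, float) - a)
    return np.array([1.0 - u - v, u, v])

def warp_point(p, tri_src, tri_dst):
    """Map p from the source triangle to the deformed triangle by keeping
    its barycentric coordinates fixed -- a piecewise-affine warp."""
    return barycentric(p, tri_src) @ tri_dst

# A neutral triangle and a deformed (here, uniformly scaled) copy:
tri_src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
tri_dst = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
```

To resample texture one runs this map in reverse: for each pixel inside the deformed triangle, compute its barycentric coordinates there and read the intensity at the corresponding point of the original image.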