Page 325 -
P. 325

Section 10.2  Fitting Lines and Planes  293


                                 the plane, five; for spheres in 3D, four; for axis-aligned ellipsoids in 3D, seven;
                                 and for general ellipsoids in 3D, 10. Even quite simple structures could result
                                 in high-dimensional accumulator arrays, which take unmanageable amounts
                                 of storage.

                               • Quantization errors: An appropriate grid size is difficult to pick. Too
                                 coarse a grid can lead to large values of the vote being obtained falsely because
                                 many quite different structures correspond to a single bucket. Too fine a grid
                                 can lead to structures not being found because votes resulting from tokens
                                 that are not exactly aligned end up in different buckets, and no bucket has a
                                 large vote (Figure 10.1).

                               • Noise: The attraction of the Hough transform is that it connects widely
                                 separated tokens that lie close to some structure. This is also a weakness
                                 because it is usually possible to find many quite good phantom structures
                                 in a large set of reasonably uniformly distributed tokens (Figure 10.2). This
                                 means that regions of texture can generate peaks in the voting array that are
                                 larger than those associated with the lines sought.
                                 These difficulties can be avoided, to some extent, by recognizing the Hough
                            transform as an attempt to find a mode in a distribution. The distribution is
                            represented by the voting array, and some of the problems are created by the cells
                            in that array. But to find a mode, we do not necessarily need to use an accumulator
                            array; instead, we could apply mean shift. The algorithm of Section 9.3.4 can be
                            applied directly. It can also be useful to ensure the minimum of irrelevant tokens
                            by, for example, tuning the edge detector to smooth out texture, using lighting that
                            ensures high-contrast edges, or using tokens with a more complex structure with
                            edge points.
                                 One natural application of the Hough transform is in object recognition. We
                            defer the details to Section 18.4.2, but the general picture is as follows. Imagine
                            an object is made out of known parts. These parts might move around a bit with
                            respect to one another, but are substantial enough to be detected on their own.
                            We can then expect each detected part to have an opinion about the location
                            (and, perhaps, the state) of the object. This means we could detect objects by
                            first detecting parts, then allowing each detected part to vote for location (and
                            maybe state) of object instances, and finally using the Hough transform, most
                            likely in mean shift form, to find instances on which many part detectors agree. This
                            approach has been successfully applied in numerous forms (Maji and Malik 2009).

                     10.2 FITTING LINES AND PLANES
                            In many applications, objects are characterized by the presence of straight lines. For
                            example, we might wish to build models of buildings using pictures of the buildings
                            (as in the application in Chapter 19). This application uses polyhedral models of
                            buildings, meaning that straight lines in the image are important. Similarly, many
                            industrial parts have straight edges of one form or another; if we wish to recognize
                            industrial parts in an image, straight lines could be helpful. In either case, a report
                            of all straight lines in the image is an extremely useful segmentation. All this means
                            that fitting a line to a set of plane tokens is extremely useful. Fitting planes to
   320   321   322   323   324   325   326   327   328   329   330