Page 325 -
P. 325
Section 10.2 Fitting Lines and Planes 293
the plane, five; for spheres in 3D, four; for axis-aligned ellipsoids in 3D, seven;
and for general ellipsoids in 3D, 10. Even quite simple structures could result
in high-dimensional accumulator arrays, which take unmanageable amounts
of storage.
• Quantization errors: An appropriate grid size is difficult to pick. Too
coarse a grid can lead to large values of the vote being obtained falsely because
many quite different structures correspond to a single bucket. Too fine a grid
can lead to structures not being found because votes resulting from tokens
that are not exactly aligned end up in different buckets, and no bucket has a
large vote (Figure 10.1).
• Noise: The attraction of the Hough transform is that it connects widely
separated tokens that lie close to some structure. This is also a weakness
because it is usually possible to find many quite good phantom structures
in a large set of reasonably uniformly distributed tokens (Figure 10.2). This
means that regions of texture can generate peaks in the voting array that are
larger than those associated with the lines sought.
These difficulties can be avoided, to some extent, by recognizing the Hough
transform as an attempt to find a mode in a distribution. The distribution is
represented by the voting array, and some of the problems are created by the cells
in that array. But to find a mode, we do not necessarily need to use an accumulator
array; instead, we could apply mean shift. The algorithm of Section 9.3.4 can be
applied directly. It can also be useful to ensure the minimum of irrelevant tokens
by, for example, tuning the edge detector to smooth out texture, using lighting that
ensures high-contrast edges, or using tokens with a more complex structure with
edge points.
One natural application of the Hough transform is in object recognition. We
defer the details to Section 18.4.2, but the general picture is as follows. Imagine
an object is made out of known parts. These parts might move around a bit with
respect to one another, but are substantial enough to be detected on their own.
We can then expect each detected part to have an opinion about the location
(and, perhaps, the state) of the object. This means we could detect objects by
first detecting parts, then allowing each detected part to vote for location (and
maybe state) of object instances, and finally using the Hough transform, most
likely in mean shift form, to find instances on which many part detectors agree. This
approach has been successfully applied in numerous forms (Maji and Malik 2009).
10.2 FITTING LINES AND PLANES
In many applications, objects are characterized by the presence of straight lines. For
example, we might wish to build models of buildings using pictures of the buildings
(as in the application in Chapter 19). This application uses polyhedral models of
buildings, meaning that straight lines in the image are important. Similarly, many
industrial parts have straight edges of one form or another; if we wish to recognize
industrial parts in an image, straight lines could be helpful. In either case, a report
of all straight lines in the image is an extremely useful segmentation. All this means
that fitting a line to a set of plane tokens is extremely useful. Fitting planes to