which means that τ must satisfy the equation
\[
\frac{dx}{dt}(\tau)\,\{d_x - x(\tau)\} + \frac{dy}{dt}(\tau)\,\{d_y - y(\tau)\} = 0.
\]
Now this is only one equation, rather than two, but the situation is not much better
than that for parametric curves. It is almost always the case that x(t) and y(t) are
polynomials because it is usually easier to do root finding for polynomials. At worst,
x(t) and y(t) are ratios of polynomials because we can rearrange the left-hand side
of our equation to come up with a polynomial in this case, too. However, we are still
faced with a possibly large number of roots. The underlying problem is geometric:
there may be many points on a curve that, locally, appear to be the closest point
to a given data point. This is because the curve is not flat (Figure 10.4). There
is no way to tell which is closest without checking each in turn. In some cases
(for example, circles), one can duck around this problem. This difficulty applies to
fitting curved surfaces to points in 3D as well.
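To make the root-finding procedure concrete, here is a minimal sketch in Python. It assumes x(t) and y(t) are polynomials stored as NumPy coefficient arrays; the function name and the tolerance on imaginary parts are illustrative choices, not part of the text.

```python
import numpy as np

def closest_point_on_poly_curve(x_coeffs, y_coeffs, d):
    """Closest point on the parametric curve (x(t), y(t)) to the data point d.

    x_coeffs, y_coeffs: polynomial coefficients, highest degree first (np.poly1d order).
    d: the data point (d_x, d_y).
    The closest-point condition dx/dt (d_x - x) + dy/dt (d_y - y) = 0 is itself a
    polynomial in t, so we find all its real roots and check each one in turn.
    """
    x, y = np.poly1d(x_coeffs), np.poly1d(y_coeffs)
    dx, dy = x.deriv(), y.deriv()
    # Left-hand side of the closest-point equation as a polynomial in t.
    lhs = dx * (np.poly1d([d[0]]) - x) + dy * (np.poly1d([d[1]]) - y)
    roots = lhs.roots
    real_roots = roots[np.abs(np.imag(roots)) < 1e-9].real
    # Several points on the curve may be locally closest; compare them all.
    candidates = [(np.hypot(x(t) - d[0], y(t) - d[1]), t) for t in real_roots]
    dist, tau = min(candidates)
    return tau, np.array([x(tau), y(tau)]), dist
```

For a cubic curve the left-hand side has degree five, so up to five candidate points may need to be compared, which is exactly the proliferation of roots the text warns about.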
There are two strategies for dealing with this quite general problem. One
is to substitute some approximation for the distance between a point and a curve
(or, in 3D, a point and a surface), which is sometimes effective. The other is to
modify the representation of the curve or of the surface. For example, one might
represent a curve with a set of samples, or with a set of line segments. Similarly, a
surface might be represented with a set of samples, or a mesh. We could then use
the methods of Chapter 12 to register these representations to the data, or even to
deform them to fit the data.
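One simple combination of the two ideas is to replace the curve with a dense set of samples and use the distance to the nearest sample as an approximate point-to-curve distance. The sketch below assumes x and y are vectorized callables (for example, np.poly1d objects); the parameter range and sample count are arbitrary choices for illustration.

```python
import numpy as np

def sampled_curve_distance(x, y, d, t_range=(0.0, 1.0), n=1000):
    """Approximate the distance from point d to the curve (x(t), y(t))
    by replacing the curve with n samples over t_range."""
    t = np.linspace(t_range[0], t_range[1], n)
    pts = np.stack([x(t), y(t)], axis=1)              # n x 2 array of curve samples
    dists = np.linalg.norm(pts - np.asarray(d), axis=1)
    i = np.argmin(dists)
    return dists[i], pts[i]                            # approximate distance, nearest sample
```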
10.4 ROBUSTNESS
All of the line fitting methods described involve squared error terms. This can
lead to poor fits in practice because a single wildly inappropriate data point might
give errors that dominate those due to many good data points; these errors could
result in a substantial bias in the fitting process (Figure 10.5). This effect results
from the squaring of the error. It is difficult to avoid such data points—usually
called outliers—in practice. Errors in collecting or transcribing data points are
one important source of outliers. Another common source is a problem with the
model. Perhaps some rare but important effect has been ignored or the magnitude
of an effect has been badly underestimated. Finally, errors in correspondence are
particularly prone to generating outliers. Practical vision problems usually involve
outliers.
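A small numerical illustration of this bias (the data here are invented for the example): a handful of points near the line y = x are fit well by least squares, but a single wildly wrong point shifts the fit substantially because its squared error dominates the others.

```python
import numpy as np

# Points that lie (almost) on the line y = x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 1.1, 1.9, 3.0, 4.1, 5.0])

# Least-squares fit to the good data alone: slope and intercept close to 1 and 0.
a_good, b_good = np.polyfit(x, y, 1)

# Add one outlier far from the line; its squared error dominates the fit.
x_out = np.append(x, 2.5)
y_out = np.append(y, 30.0)
a_bad, b_bad = np.polyfit(x_out, y_out, 1)

print(f"without outlier: slope {a_good:.2f}, intercept {b_good:.2f}")
print(f"with outlier:    slope {a_bad:.2f}, intercept {b_bad:.2f}")
```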
This problem can be resolved either by reducing the influence of distant points
on the estimate (Section 10.4.1), or by identifying outlying points and ignoring
them. There are two methods to identify outlying points. We could search for good
points. A small set of good points will identify the thing we wish to fit; other good
points will agree, and the points that disagree are bad points. This is the basis of an
extremely important approach, described in Section 10.4.2. Alternatively, we could
regard this as a problem with missing data, and use the EM algorithm described
in Section 10.5.
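The search-for-good-points idea can be sketched for line fitting as below. This is only a minimal sketch of the idea; the trial count and agreement threshold are illustrative assumptions, and the approach developed in Section 10.4.2 treats both more carefully.

```python
import numpy as np

def fit_line_from_good_points(pts, n_iters=100, tol=0.1, rng=None):
    """Fit a line to 2D points (an N x 2 array) by repeatedly choosing a small
    set of points, fitting a line to it, and keeping the fit that the most
    other points agree with; points that disagree are treated as outliers."""
    rng = np.random.default_rng(rng)
    best_line, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)
        p, q = pts[i], pts[j]
        # Line through p and q as (unit normal n, offset c) with n . x + c = 0.
        d = q - p
        n = np.array([-d[1], d[0]])
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue
        n /= norm
        c = -n @ p
        # Points whose perpendicular distance is below tol agree with this line.
        dists = np.abs(pts @ n + c)
        inliers = np.count_nonzero(dists < tol)
        if inliers > best_inliers:
            best_line, best_inliers = (n, c), inliers
    return best_line, best_inliers
```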