


                            which means that τ must satisfy the equation

                            \[
                            \frac{dx}{dt}(\tau)\,\{d_x - x(\tau)\} + \frac{dy}{dt}(\tau)\,\{d_y - y(\tau)\} = 0.
                            \]
                            Now this is only one equation, rather than two, but the situation is not much better
                            than that for parametric curves. It is almost always the case that x(t) and y(t) are
                            polynomials, because it is usually easier to do root finding for polynomials. At worst,
                            x(t) and y(t) are ratios of polynomials, because we can rearrange the left-hand side
                            of our equation to come up with a polynomial in this case, too. However, we are still
                            faced with a possibly large number of roots. The underlying problem is geometric:
                            there may be many points on a curve that, locally, appear to be the closest point
                            to a given data point. This is because the curve is not flat (Figure 10.4). There
                            is no way to tell which is closest without checking each in turn. In some cases
                            (for example, circles), one can duck around this problem. This difficulty applies to
                            fitting curved surfaces to points in 3D as well.
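As a concrete illustration (not from the text), the condition above can be solved directly when x(t) and y(t) are polynomials: assemble the left-hand side as a single polynomial in t, find all of its real roots, and check each candidate in turn. A minimal NumPy sketch, using a hypothetical parabola x(t) = t, y(t) = t² and data point (0, 2):

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical curve: x(t) = t, y(t) = t^2 (coefficients low-to-high),
# and the data point (d_x, d_y) = (0, 2).
cx = np.array([0.0, 1.0])
cy = np.array([0.0, 0.0, 1.0])
d = np.array([0.0, 2.0])

dcx, dcy = P.polyder(cx), P.polyder(cy)

# Left-hand side as one polynomial in t:
# p(t) = dx/dt(t) {d_x - x(t)} + dy/dt(t) {d_y - y(t)}
p = P.polyadd(P.polymul(dcx, P.polysub([d[0]], cx)),
              P.polymul(dcy, P.polysub([d[1]], cy)))

roots = P.polyroots(p)
real = roots[np.abs(roots.imag) < 1e-9].real   # candidate parameter values

def dist2(t):
    """Squared distance from the data point to the curve point at t."""
    return (P.polyval(t, cx) - d[0]) ** 2 + (P.polyval(t, cy) - d[1]) ** 2

t_best = min(real, key=dist2)   # check every candidate in turn
```

For this curve the polynomial works out to 3t − 2t³, which has three real roots; two of them (t = ±√1.5) are equally close to the data point, which is exactly why each root must be checked rather than accepting the first one found.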
                                 There are two strategies for dealing with this quite general problem. One
                            is to substitute some approximation for the distance between a point and a curve
                            (or, in 3D, a point and a surface), which is sometimes effective. The other is to
                            modify the representation of the curve or of the surface. For example, one might
                            represent a curve with a set of samples, or with a set of line segments. Similarly, a
                            surface might be represented with a set of samples, or a mesh. We could then use
                            the methods of Chapter 12 to register these representations to the data, or even to
                            deform them to fit the data.
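The first representation change mentioned above, replacing the curve by a set of samples, is easy to sketch. The code below is an illustrative sketch, not taken from the text (the function name and its parameters are made up): it approximates the closest point by evaluating the curve on a dense grid of parameter values and taking the nearest sample.

```python
import numpy as np

def closest_by_sampling(curve, d, n=2000, t_range=(-3.0, 3.0)):
    """Approximate the closest point on a parametric curve to the point d
    by dense sampling; accuracy is limited by the grid resolution n."""
    ts = np.linspace(t_range[0], t_range[1], n)
    pts = curve(ts)                              # array of shape (n, 2)
    i = np.argmin(((pts - d) ** 2).sum(axis=1))  # nearest sample wins
    return ts[i], pts[i]

# Example: parabola x(t) = t, y(t) = t^2, data point (0, 2).
parabola = lambda t: np.stack([t, t ** 2], axis=-1)
t_hat, p_hat = closest_by_sampling(parabola, np.array([0.0, 2.0]))
```

This trades the root-finding problem for a resolution limit, but unlike a local search it cannot miss a distant branch of the curve, because every sample is compared.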

                     10.4 ROBUSTNESS
                            All of the line fitting methods described involve squared error terms. This can
                            lead to poor fits in practice because a single wildly inappropriate data point might
                            give errors that dominate those due to many good data points; these errors could
                            result in a substantial bias in the fitting process (Figure 10.5). This effect results
                            from the squaring of the error. It is difficult to avoid such data points, usually
                            called outliers, in practice. Errors in collecting or transcribing data points are
                            one important source of outliers. Another common source is a problem with the
                            model. Perhaps some rare but important effect has been ignored or the magnitude
                            of an effect has been badly underestimated. Finally, errors in correspondence are
                            particularly prone to generating outliers. Practical vision problems usually involve
                            outliers.
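The dominance of a single outlier is easy to demonstrate numerically. In the sketch below (illustrative, not from the text), twenty points lie near the line y = 2x + 1; corrupting just one of them drags the least-squares slope far from its true value:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal(20)   # good data near y = 2x + 1

y_bad = y.copy()
y_bad[0] = 50.0          # one wildly inappropriate data point

A = np.column_stack([x, np.ones_like(x)])            # least-squares design matrix
slope_clean = np.linalg.lstsq(A, y, rcond=None)[0][0]
slope_bad = np.linalg.lstsq(A, y_bad, rcond=None)[0][0]
# slope_clean stays near 2; slope_bad is badly biased by the single outlier,
# because its squared error dominates those of the nineteen good points
```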
                                 This problem can be resolved either by reducing the influence of distant points
                            on the estimate (Section 10.4.1), or by identifying outlying points and ignoring
                            them. There are two methods to identify outlying points. We could search for good
                            points. A small set of good points will identify the thing we wish to fit; other good
                            points will agree, and the points that disagree are bad points. This is the basis of an
                            extremely important approach, described in Section 10.4.2. Alternatively, we could
                            regard this as a problem with missing data, and use the EM algorithm described
                            in Section 10.5.
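The search-for-good-points idea can be sketched briefly. The following is an illustrative toy, not the method of Section 10.4.2 (the function name, trial count, and tolerance are made up): repeatedly fit a line to a minimal sample of two points, count how many other points agree with it, and keep the inliers of the best-supported line; points that never agree are the outliers.

```python
import numpy as np

def consensus_inliers(pts, n_trials=200, tol=0.05, seed=0):
    """Toy consensus search: fit lines to random point pairs and return the
    inlier mask of the line with the most agreement."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_trials):
        i, j = rng.choice(len(pts), size=2, replace=False)
        d = pts[j] - pts[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0:
            continue
        n = np.array([-d[1], d[0]]) / norm      # unit normal to the sampled line
        dist = np.abs((pts - pts[i]) @ n)       # perpendicular distances
        inliers = dist < tol
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return best

# Example: 20 good points near y = 2x + 1 plus three gross outliers.
rng = np.random.default_rng(1)
xs = np.linspace(0.0, 1.0, 20)
good = np.column_stack([xs, 2.0 * xs + 1.0 + 0.005 * rng.standard_normal(20)])
outliers = np.array([[0.2, 10.0], [0.5, -8.0], [0.8, 12.0]])
mask = consensus_inliers(np.vstack([good, outliers]))
```

Because the minimal sample is small, some trial is very likely to draw only good points, and the resulting line gathers far more agreement than any line influenced by an outlier.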