




For s = 1 to s = k
    Draw a subset of r distinct points, chosen uniformly at random
    Fit to this set of points using least squares to obtain an initial
        set of parameters θ_s^(0)
    Estimate σ_s^(0) using θ_s^(0)
    Until convergence (usually |θ_s^(n) − θ_s^(n−1)| is small):
        Take a minimizing step using θ_s^(n−1), σ_s^(n−1) to get θ_s^(n)
        Now compute σ_s^(n)
    end
end
Report the best fit of this set of k trials, using the median of the
    residuals as a criterion

Algorithm 10.3: Using an M-Estimator to Fit a Least Squares Model.

                            A common strategy for dealing with this problem is to draw a subsample of the
                            dataset, fit to that subsample using least squares, and use this as a start point for
                            the fitting process. We do this for a large number of different subsamples, enough
                            to ensure that there is a high probability that there is at least one subsample that
                            consists entirely of good data points (Algorithm 10.3).
                                 Second, as Figures 10.7 and 10.8 indicate, the estimators require a sensible
                            estimate of σ, which is often referred to as scale.  Typically, the scale estimate is
                            supplied at each iteration of the solution method; a popular estimate of scale is

σ^(n) = 1.4826 median_i |r_i^(n)(x_i; θ^(n−1))| .
                            We summarize a general M-estimator in Algorithm 10.3.
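To make the procedure concrete, here is a minimal Python sketch of Algorithm 10.3 for fitting a line. The Huber weight function and its tuning constant 1.345 are illustrative assumptions (the text leaves the particular estimator open); the scale σ is re-estimated at every iteration with the 1.4826 × median absolute residual rule above.

    import numpy as np

    def m_estimate_line(x, y, k=50, r=2, n_iters=30, tol=1e-6, seed=None):
        """Fit y ~ a*x + b in the spirit of Algorithm 10.3.

        Huber weights are an illustrative choice, not fixed by the text.
        """
        rng = np.random.default_rng(seed)
        A = np.column_stack([x, np.ones_like(x)])    # design matrix for a line
        best_theta, best_med = None, np.inf

        for _ in range(k):                           # k random starts
            idx = rng.choice(len(x), size=r, replace=False)
            theta, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)

            for _ in range(n_iters):                 # iterate to convergence
                res = y - A @ theta
                sigma = 1.4826 * np.median(np.abs(res))   # scale estimate
                if sigma == 0:
                    break
                u = np.abs(res) / sigma
                # Huber weights: full weight near zero, downweighted tails
                w = np.where(u <= 1.345, 1.0, 1.345 / u)
                sw = np.sqrt(w)
                theta_new, *_ = np.linalg.lstsq(A * sw[:, None], sw * y,
                                                rcond=None)
                converged = np.linalg.norm(theta_new - theta) < tol
                theta = theta_new
                if converged:
                    break

            med = np.median(np.abs(y - A @ theta))   # median-residual criterion
            if med < best_med:
                best_med, best_theta = med, theta

        return best_theta

The weighted least squares solve plays the role of the "minimizing step": reweighting the residuals and re-solving is the standard iteratively reweighted least squares realization of an M-estimator.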


                     10.4.2 RANSAC: Searching for Good Points
                            An alternative to modifying the cost function is to search the collection of data
                            points for good points. This is quite easily done by an iterative process: First, we
                            choose a small subset of points and fit to that subset, and then we see how many
                            other points fit to the resulting object. We continue this process until we have a
                            high probability of finding the structure we are looking for.
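One way to make "high probability" precise: if a fraction w of the points are good and each sample contains n points, a single sample consists entirely of good points with probability w^n, so k independent samples all fail with probability (1 − w^n)^k. Requiring success with probability at least p therefore needs

    k ≥ log(1 − p) / log(1 − w^n)

trials. For instance, with 50% good points and two-point samples, w^n = 1/4, which is exactly the situation in the example that follows.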
                                 For example, assume that we are fitting a line to a dataset that consists of
                            about 50% outliers. We can fit a line to only two points. If we draw pairs of points
                            uniformly and at random, then about a quarter of these pairs will consist entirely
                            of good data points. We can identify these good pairs by noticing that a large
                            collection of other points lie close to the line fitted to such a pair. Of course, a
                            better estimate of the line could then be obtained by fitting a line to the points
                            that lie close to our current line.
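In the same vein, here is a sketch of the two-point line search just described; the inlier threshold t and the trial count are assumptions to be tuned per dataset, and the final least-squares refit implements the improvement mentioned in the last sentence.

    import numpy as np

    def ransac_line(x, y, n_trials=100, t=1.0, seed=None):
        """Fit a line by drawing two-point samples and counting consensus."""
        rng = np.random.default_rng(seed)
        pts = np.column_stack([x, y])
        best_inliers = None

        for _ in range(n_trials):
            p, q = pts[rng.choice(len(pts), size=2, replace=False)]
            d = q - p
            norm = np.hypot(d[0], d[1])
            if norm == 0:
                continue                              # degenerate pair; redraw
            normal = np.array([-d[1], d[0]]) / norm   # unit normal to the line
            dist = np.abs((pts - p) @ normal)         # point-to-line distances
            inliers = dist < t
            if best_inliers is None or inliers.sum() > best_inliers.sum():
                best_inliers = inliers

        # Refit by least squares to the consensus set; the slope-intercept
        # form used here assumes the best line is not vertical
        A = np.column_stack([x[best_inliers], np.ones(best_inliers.sum())])
        theta, *_ = np.linalg.lstsq(A, y[best_inliers], rcond=None)
        return theta, best_inliers

Consensus is measured with the point-normal form of the line, so the search itself handles vertical lines; only the illustrative refit at the end assumes a slope-intercept parameterization.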
                                 Fischler and Bolles (1981) formalized this approach into an algorithm—search
                            for a random sample that leads to a fit on which many of the data points agree.
                            The algorithm is usually called RANSAC, for RANdom SAmple Consensus, and is