Page 337 -
P. 337

Section 10.4  Robustness  305




                            Determine:
                                n—the smallest number of points required (e.g., for lines, n =2,
                                  for circles, n =3)
                                k—the number of iterations required
                                t—the threshold used to identify a point that fits well
                                d—the number of nearby points required
                                  to assert a model fits well
                            Until k iterations have occurred
                                Draw a sample of n points from the data
                                  uniformly and at random
                                Fit to that set of n points
                                For each data point outside the sample
                                  Test the distance from the point to the structure
                                    against t; if the distance from the point to the structure
                                    is less than t, the point is close
                                end
                                If there are d or more points close to the structure
                                  then there is a good fit. Refit the structure using all
                                  these points. Add the result to a collection of good fits.
                            end
                            Use the best fit from this collection, using the
                              fitting error as a criterion

                             Algorithm 10.4: RANSAC: Fitting Structures Using Random Sample Consensus.


                                 Telling Whether a Point Is Close
                                 We need to determine whether a point lies close to a line fitted to a sample.
                            We do this by determining the distance between the point and the fitted line, and
                            testing that distance against a threshold d; if the distance is below the threshold,
                            the point lies close. In general, specifying this parameter is part of the modeling
                            process. Obtaining a value for this parameter is relatively simple. We generally need
                            only an order of magnitude estimate, and the same value applies to many different
                            experiments. The parameter is often determined by trying a few values and seeing
                            what happens; another approach is to look at a few characteristic datasets, fitting
                            a line by eye, and estimating the average size of the deviations.

                                 The Number of Points That Must Agree
                                 Assume that we have fitted a line to some random sample of two data points,
                            and we need to know whether that line is good. We do this by counting the number
                            of points that lie within some distance of the line (the distance was determined in
                            the previous section). In particular, assume that we know the probability that an
                            outlier lies in this collection of points; write this probability as y. We would like to
                            choose some number of points t such that the probability that all points near the
                                           t
                            line are outliers, y ,issmall (say, less than0.05). Notice that y ≤ (1 − w) (because
                            some outliers should be far from the line), so we could choose t such that (1 − w) t
   332   333   334   335   336   337   338   339   340   341   342