Page 337 -
P. 337
Section 10.4 Robustness 305
Determine:
n—the smallest number of points required (e.g., for lines, n =2,
for circles, n =3)
k—the number of iterations required
t—the threshold used to identify a point that fits well
d—the number of nearby points required
to assert a model fits well
Until k iterations have occurred
Draw a sample of n points from the data
uniformly and at random
Fit to that set of n points
For each data point outside the sample
Test the distance from the point to the structure
against t; if the distance from the point to the structure
is less than t, the point is close
end
If there are d or more points close to the structure
then there is a good fit. Refit the structure using all
these points. Add the result to a collection of good fits.
end
Use the best fit from this collection, using the
fitting error as a criterion
Algorithm 10.4: RANSAC: Fitting Structures Using Random Sample Consensus.
Telling Whether a Point Is Close
We need to determine whether a point lies close to a line fitted to a sample.
We do this by determining the distance between the point and the fitted line, and
testing that distance against a threshold d; if the distance is below the threshold,
the point lies close. In general, specifying this parameter is part of the modeling
process. Obtaining a value for this parameter is relatively simple. We generally need
only an order of magnitude estimate, and the same value applies to many different
experiments. The parameter is often determined by trying a few values and seeing
what happens; another approach is to look at a few characteristic datasets, fitting
a line by eye, and estimating the average size of the deviations.
The Number of Points That Must Agree
Assume that we have fitted a line to some random sample of two data points,
and we need to know whether that line is good. We do this by counting the number
of points that lie within some distance of the line (the distance was determined in
the previous section). In particular, assume that we know the probability that an
outlier lies in this collection of points; write this probability as y. We would like to
choose some number of points t such that the probability that all points near the
t
line are outliers, y ,issmall (say, less than0.05). Notice that y ≤ (1 − w) (because
some outliers should be far from the line), so we could choose t such that (1 − w) t