Page 302 -
P. 302
6.1 2D and 3D feature-based alignment 281
6.1.4 Robust least squares and RANSAC
While regular least squares is the method of choice for measurements where the noise follows
a normal (Gaussian) distribution, more robust versions of least squares are required when
there are outliers among the correspondences (as there almost always are). In this case, it is
preferable to use an M-estimator (Huber 1981; Hampel, Ronchetti, Rousseeuw et al. 1986;
Black and Rangarajan 1996; Stewart 1999), which involves applying a robust penalty function
ρ(r) to the residuals
E RLS (Δp)= ρ( r i ) (6.25)
i
instead of squaring them.
We can take the derivative of this function with respect to p and set it to 0,
∂ r i ψ( r i ) T ∂r i
ψ( r i ) = r i =0, (6.26)
∂p r i ∂p
i i
where ψ(r)= ρ (r) is the derivative of ρ and is called the influence function. If we introduce
a weight function, w(r)=Ψ(r)/r, we observe that finding the stationary point of (6.25) using
(6.26) is equivalent to minimizing the iteratively reweighted least squares (IRLS) problem
2
E IRLS = w( r i ) r i , (6.27)
i
where the w( r i ) play the same local weighting role as σ −2 in (6.10). The IRLS algo-
i
rithm alternates between computing the influence functions w( r i ) and solving the result-
ing weighted least squares problem (with fixed w values). Other incremental robust least
squares algorithms can be found in the work of Sawhney and Ayer (1996); Black and Anan-
dan (1996); Black and Rangarajan (1996); Baker, Gross, Ishikawa et al. (2003) and textbooks
and tutorials on robust statistics (Huber 1981; Hampel, Ronchetti, Rousseeuw et al. 1986;
Rousseeuw and Leroy 1987; Stewart 1999).
While M-estimators can definitely help reduce the influence of outliers, in some cases,
starting with too many outliers will prevent IRLS (or other gradient descent algorithms) from
converging to the global optimum. A better approach is often to find a starting set of inlier
correspondences, i.e., points that are consistent with a dominant motion estimate. 7
Two widely used approaches to this problem are called RANdom SAmple Consensus, or
RANSAC for short (Fischler and Bolles 1981), and least median of squares (LMS) (Rousseeuw
1984). Both techniques start by selecting (at random) a subset of k correspondences, which is
then used to compute an initial estimate for p. The residuals of the full set of correspondences
are then computed as
r i = ˜ x (x i ; p) − ˆ x , (6.28)
i
i
where ˜ x are the estimated (mapped) locations and ˆ x are the sensed (detected) feature point
i
i
locations.
The RANSAC technique then counts the number of inliers that are within of their pre-
dicted location, i.e., whose r i ≤ . (The value is application dependent but is often
2
around 1–3 pixels.) Least median of squares finds the median value of the r i values. The
7 For pixel-based alignment methods (Section 8.1.1), hierarchical (coarse-to-fine) techniques are often used to
lock onto the dominant motion in a scene.