6.2 Pose estimation 285

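[Figure 6.4 labels: camera center $c$; 3D points $p_i = (X_i, Y_i, Z_i, W_i)$ and $p_j$; image points $x_i$ and $x_j$; distances $d_i$, $d_j$, and $d_{ij}$; visual angle $\theta_{ij}$.]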

Figure 6.4 Pose estimation by the direct linear transform and by measuring visual angles and distances between
pairs of points.

In most applications, however, we have some prior knowledge about the intrinsic cali-
bration matrix K, e.g., that the pixels are square, the skew is very small, and the optical
center is near the center of the image (2.57–2.59). Such constraints can be incorporated into
a non-linear minimization of the parameters in K and (R, t), as described in Section 6.2.2.
In the case where the camera is already calibrated, i.e., the matrix K is known (Sec-
tion 6.3), we can perform pose estimation using as few as three points (Fischler and Bolles
1981; Haralick, Lee, Ottenberg et al. 1994; Quan and Lan 1999). The basic observation that
these linear PnP (perspective n-point) algorithms employ is that the visual angle between any
pair of 2D points $\hat{x}_i$ and $\hat{x}_j$ must be the same as the angle between their corresponding 3D
points $p_i$ and $p_j$ (Figure 6.4).
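The observation can be checked numerically. The following is a minimal sketch, not the algorithm of the cited papers: it assumes a camera with identity rotation, an upper-triangular calibration matrix, and made-up values for `K`, `c`, `p_i`, and `p_j`. The helper names `project`, `unit_direction`, and `angle` are hypothetical. It projects two 3D points to pixels, back-projects the pixels to unit-norm rays, and compares the visual angle with the angle subtended at the camera center by the 3D points.

```python
import math

def project(K, point, center):
    """Pinhole projection x ~ K (p - c), assuming an identity rotation."""
    X, Y, Z = (pv - cv for pv, cv in zip(point, center))
    (fx, s, cx), (_, fy, cy), _ = K
    return ((fx * X + s * Y) / Z + cx, fy * Y / Z + cy)

def unit_direction(K, pixel):
    """Back-project a pixel to a unit-norm ray, i.e. normalize K^{-1} x."""
    (fx, s, cx), (_, fy, cy), _ = K
    x, y = pixel
    # K is upper triangular with K[2][2] == 1, so solve K r = (x, y, 1)
    # by back-substitution instead of forming K^{-1} explicitly.
    ry = (y - cy) / fy
    rx = (x - s * ry - cx) / fx
    n = math.sqrt(rx * rx + ry * ry + 1.0)
    return (rx / n, ry / n, 1.0 / n)

def angle(u, v):
    """Angle in radians between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

# Hypothetical calibration and scene, chosen only for illustration.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
c = (1.0, -2.0, 0.5)
p_i = (2.0, 1.0, 6.0)
p_j = (-1.0, 0.0, 4.0)

# Visual angle measured from the two pixel observations.
theta_2d = angle(unit_direction(K, project(K, p_i, c)),
                 unit_direction(K, project(K, p_j, c)))
# Angle between the rays from the camera center to the 3D points.
theta_3d = angle(tuple(a - b for a, b in zip(p_i, c)),
                 tuple(a - b for a, b in zip(p_j, c)))
print(theta_2d, theta_3d)  # the two angles agree up to rounding
```

With a non-identity rotation the back-projected rays would be expressed in the camera frame rather than the world frame, but the angle between them would be unchanged, since rotations preserve angles.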
Given a set of corresponding 2D and 3D points $\{(\hat{x}_i, p_i)\}$, where the $\hat{x}_i$ are unit directions
obtained by transforming 2D pixel measurements $x_i$ to unit norm 3D directions $\hat{x}_i$ through
the inverse calibration matrix $K^{-1}$,
$$\hat{x}_i = \mathcal{N}(K^{-1} x_i) = K^{-1} x_i / \|K^{-1} x_i\|, \tag{6.36}$$

the unknowns are the distances $d_i$ from the camera origin $c$ to the 3D points $p_i$, where
$$p_i = d_i \hat{x}_i + c \tag{6.37}$$
(Figure 6.4). The cosine law for triangle $\Delta(c, p_i, p_j)$ gives us
$$f_{ij}(d_i, d_j) = d_i^2 + d_j^2 - 2 d_i d_j c_{ij} - d_{ij}^2 = 0, \tag{6.38}$$
where
$$c_{ij} = \cos\theta_{ij} = \hat{x}_i \cdot \hat{x}_j \tag{6.39}$$
and
$$d_{ij}^2 = \|p_i - p_j\|^2. \tag{6.40}$$
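As a sanity check on (6.38) to (6.40), the sketch below evaluates the cosine-law residual for a synthetic configuration. The camera center and 3D points are invented values, the unit directions are taken directly as normalized rays from the camera center (which assumes they are expressed in the world frame), and the function name `f_ij` simply mirrors the notation in the text.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3D tuples."""
    return tuple(u - v for u, v in zip(a, b))

def dot(a, b):
    """Dot product of two 3D tuples."""
    return sum(u * v for u, v in zip(a, b))

def f_ij(d_i, d_j, c_ij, d_ij_sq):
    """Cosine-law residual of (6.38); zero when the distances are consistent."""
    return d_i ** 2 + d_j ** 2 - 2.0 * d_i * d_j * c_ij - d_ij_sq

# Hypothetical camera center and 3D points, for illustration only.
c = (1.0, -2.0, 0.5)
p_i = (2.0, 1.0, 6.0)
p_j = (-1.0, 0.0, 4.0)

# True distances from the camera center (the unknowns in a real problem).
d_i = math.sqrt(dot(sub(p_i, c), sub(p_i, c)))
d_j = math.sqrt(dot(sub(p_j, c), sub(p_j, c)))

# Unit viewing directions, so that p = d * x_hat + c as in (6.37).
x_hat_i = tuple(v / d_i for v in sub(p_i, c))
x_hat_j = tuple(v / d_j for v in sub(p_j, c))

c_ij = dot(x_hat_i, x_hat_j)                 # (6.39)
d_ij_sq = dot(sub(p_i, p_j), sub(p_i, p_j))  # (6.40)

residual = f_ij(d_i, d_j, c_ij, d_ij_sq)
perturbed = f_ij(d_i + 1.0, d_j, c_ij, d_ij_sq)
print(residual, perturbed)  # near zero for the true distances, not for a wrong d_i
```

In an actual pose estimation problem only `c_ij` (from the image) and `d_ij_sq` (from the known 3D points) are available, and the distances are what the constraint is solved for.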
We can take any triplet of constraints $(f_{ij}, f_{ik}, f_{jk})$ and eliminate the $d_j$ and $d_k$ using
Sylvester resultants (Cox, Little, and O'Shea 2007) to obtain a quartic equation in $d_i^2$,
$$g_{ijk}(d_i^2) = a_4 d_i^8 + a_3 d_i^6 + a_2 d_i^4 + a_1 d_i^2 + a_0 = 0. \tag{6.41}$$
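The elimination can be illustrated numerically without deriving the coefficients $a_0, \ldots, a_4$ in closed form. The sketch below is one possible way to evaluate the eliminated polynomial at a given $d_i$, not the derivation used in the cited references: it uses the closed-form resultant of two monic quadratics to eliminate $d_j$, then multiplies that resultant over the two roots of $f_{ik}$ in $d_k$, which equals the resultant with respect to $d_k$ because $f_{ik}$ is monic in $d_k$. The function names and the test values (distances 2, 3, 4 and all cosines 0.8) are assumptions made for the example.

```python
import cmath

def res_monic_quadratics(a, b, p, q):
    """Resultant of x^2 + a x + b and x^2 + p x + q (zero iff they share a root)."""
    return (q - b) ** 2 + (p - a) * (b * p - a * q)

def g_ijk(d_i, c_ij, c_ik, c_jk, D_ij, D_ik, D_jk):
    """Evaluate the polynomial obtained by eliminating d_j and d_k at one d_i.

    D_ij, D_ik, D_jk are the squared inter-point distances of (6.40).
    """
    # Roots of f_ik viewed as a monic quadratic in d_k (possibly complex).
    half = c_ik * d_i
    disc = cmath.sqrt(half * half - (d_i * d_i - D_ik))
    value = 1.0
    for d_k in (half + disc, half - disc):
        # Eliminate d_j between f_ij and f_jk at this (d_i, d_k).
        value *= res_monic_quadratics(-2.0 * c_ij * d_i, d_i ** 2 - D_ij,
                                      -2.0 * c_jk * d_k, d_k ** 2 - D_jk)
    # The two factors are both real or complex conjugates, so the product is real.
    return value.real

# Hypothetical ground truth used to build consistent constraints.
d_true = {"i": 2.0, "j": 3.0, "k": 4.0}
c_ij = c_ik = c_jk = 0.8

def cosine_law(d_a, d_b, c_ab):
    """Squared distance between two points at ranges d_a, d_b with cosine c_ab."""
    return d_a ** 2 + d_b ** 2 - 2.0 * d_a * d_b * c_ab

D_ij = cosine_law(d_true["i"], d_true["j"], c_ij)
D_ik = cosine_law(d_true["i"], d_true["k"], c_ik)
D_jk = cosine_law(d_true["j"], d_true["k"], c_jk)

args = (c_ij, c_ik, c_jk, D_ij, D_ik, D_jk)
at_truth = g_ijk(d_true["i"], *args)   # vanishes at the true distance
at_other = g_ijk(2.5, *args)           # generally nonzero elsewhere
even_gap = g_ijk(1.3, *args) - g_ijk(-1.3, *args)  # polynomial is even in d_i
print(at_truth, at_other, even_gap)
```

The vanishing difference between the values at $d_i$ and $-d_i$ reflects why the result can be written as a polynomial in $d_i^2$: negating all three distances leaves every constraint unchanged.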
Given five or more correspondences, we can generate $\frac{(n-1)(n-2)}{2}$ triplets to obtain a linear
estimate (using SVD) for the values of $(d_i^8, d_i^6, d_i^4, d_i^2)$ (Quan and Lan 1999). Estimates for
