CHAPTER 7
Stereopsis
Fusing the pictures recorded by our two eyes and exploiting the difference (or disparity)
between them allows us to gain a strong sense of depth. This chapter is
concerned with the design and implementation of algorithms that mimic our ability
to perform this task, known as stereopsis. Reliable computer programs for stereoscopic
perception are of course invaluable in visual robot navigation (Figure 7.1),
cartography, aerial reconnaissance, and close-range photogrammetry. They are also
of great interest in tasks such as image segmentation for object recognition or the
construction of three-dimensional scene models for computer graphics applications.
FIGURE 7.1: Left: The Stanford cart sports a single camera moving in discrete increments
along a straight line and providing multiple snapshots of outdoor scenes. Center: The
INRIA mobile robot uses three cameras to map its environment. Right: The NYU
mobile robot uses two stereo cameras, each capable of delivering an image pair. As shown
by these examples, although two eyes are sufficient for stereo fusion, mobile robots are
sometimes equipped with three (or more) cameras. The bulk of this chapter is concerned
with binocular perception but stereo algorithms using multiple cameras are discussed in
Section 7.6. Photos courtesy of Hans Moravec, Olivier Faugeras, and Yann LeCun.
Stereo vision involves two processes: The fusion of features observed by two
(or more) eyes and the reconstruction of their three-dimensional preimage. The
latter is relatively simple: The preimage of matching points can (in principle) be
found at the intersection of the rays passing through these points and the associated
pupil centers (or pinholes; see Figure 7.2, left). Thus, when a single image
feature is observed at any given time, stereo vision is easy. However, each picture
typically consists of millions of pixels, with tens of thousands of image features
such as edge elements, and some method must be devised to establish the correct
correspondences and avoid erroneous depth measurements (Figure 7.2, right).
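As a rough illustration of the reconstruction step described above, the following Python sketch intersects two viewing rays, each defined by a pinhole and a direction through the matched image point. Rays recovered from noisy measurements rarely meet exactly, so the sketch returns the midpoint of the shortest segment joining them. This is one common choice and not necessarily the method developed later in this chapter. The function name `triangulate_midpoint`, the ray parametrization, and the numerical values are assumptions made for this example.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(o1: Sequence[float], d1: Sequence[float],
                         o2: Sequence[float], d2: Sequence[float],
                         eps: float = 1e-12) -> Vec3:
    """Estimate the preimage of two matched image points.

    Ray i is o_i + s * d_i, where o_i is the pinhole of camera i and d_i
    points from the pinhole through the matched image point (hypothetical
    parametrization chosen for this sketch). Returns the midpoint of the
    shortest segment joining the two rays, which coincides with their
    intersection when the rays actually meet.
    """
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)

    denom = a * c - b * b  # zero when the rays are parallel
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; depth is undefined")

    # Parameters of the closest points on each ray (least-squares solution).
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    x, y, z = ((u + v) / 2.0 for u, v in zip(p1, p2))
    return (x, y, z)


# Usage: two pinholes one unit apart, both observing the point (0.5, 0.2, 4).
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.2, 4.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.2, 4.0))
print(point)
```

With exact data the two rays meet at the scene point, so the call above recovers it up to rounding. The difficulty discussed in the text lies elsewhere: deciding which pairs of image points should be handed to such a routine in the first place.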
We start this chapter by examining in Section 7.1 the geometric epipolar constraint
associated with a pair of cameras, which is a key to controlling the cost
of the binocular fusion process. Next, we stay on the geometric side of things in
Section 7.2 as we present a number of methods for binocular reconstruction. After