1.3 Book overview
topics such as sampling and aliasing, color sensing, and in-camera compression.
Chapter 3 covers image processing, which is needed in almost all computer vision appli-
cations. This includes topics such as linear and non-linear filtering (Section 3.3), the Fourier
transform (Section 3.4), image pyramids and wavelets (Section 3.5), geometric transforma-
tions such as image warping (Section 3.6), and global optimization techniques such as regu-
larization and Markov Random Fields (MRFs) (Section 3.7). While most of this material is
covered in courses and textbooks on image processing, the use of optimization techniques is
more typically associated with computer vision (although MRFs are now being widely used
in image processing as well). The section on MRFs is also the first introduction to the use
of Bayesian inference techniques, which are covered at a more abstract level in Appendix B.
Chapter 3 also presents applications such as seamless image blending and image restoration.
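To give a flavor of the linear filtering covered in Section 3.3, here is a minimal sketch in plain NumPy. The `filter2d` helper and the specific box-filter example are illustrative assumptions, not code from the book:

```python
import numpy as np

def filter2d(image, kernel):
    """Correlate a 2D image with a small kernel (zero padding at the borders)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Each output pixel is a weighted sum of its neighborhood.
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 box filter smooths by averaging each pixel's neighborhood:
box = np.full((3, 3), 1.0 / 9.0)
image = np.zeros((5, 5))
image[2, 2] = 9.0            # a single bright pixel (an impulse)
smoothed = filter2d(image, box)
# The impulse is spread evenly over a 3x3 patch of value 1.0 each.
```

Non-linear filters (e.g., the median filter) follow the same sliding-window pattern but replace the weighted sum with a non-linear operation.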
In Chapter 4, we cover feature detection and matching. A lot of current 3D reconstruction
and recognition techniques are built on extracting and matching feature points (Section 4.1),
so this is a fundamental technique required by many subsequent chapters (Chapters 6, 7, 9
and 14). We also cover edge and straight line detection in Sections 4.2 and 4.3.
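As a sketch of the matching half of Section 4.1, the snippet below does nearest-neighbor matching of feature descriptors with a ratio test (in the style of Lowe's SIFT matching). The `match_features` helper, the descriptor vectors, and the 0.8 threshold are all illustrative assumptions:

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to its nearest neighbor in desc2,
    keeping only matches whose best distance is clearly smaller than
    the second-best distance (the ratio test rejects ambiguous matches)."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches

# Two toy descriptors in the first image, three in the second:
desc1 = np.array([[1.0, 0.0], [0.0, 1.0]])
desc2 = np.array([[0.9, 0.1], [0.1, 0.9], [5.0, 5.0]])
matches = match_features(desc1, desc2)   # [(0, 0), (1, 1)]
```

Real systems use high-dimensional descriptors and approximate nearest-neighbor search, but the ratio-test logic is the same.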
Chapter 5 covers region segmentation techniques, including active contour detection and
tracking (Section 5.1). Segmentation techniques include top-down (split) and bottom-up
(merge) techniques, mean shift techniques that find modes of clusters, and various graph-
based segmentation approaches. All of these techniques are essential building blocks that are
widely used in a variety of applications, including performance-driven animation, interactive
image editing, and recognition.
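The mode-finding idea behind mean shift can be sketched in one dimension: starting from a query point, repeatedly replace it with the average of the samples inside a window until it settles on a density peak. The `mean_shift_mode` helper and the flat (uniform) window are simplifying assumptions; practical implementations work in feature space with smooth kernels:

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=1.0, iters=50):
    """Move a query point uphill toward the nearest density mode by
    repeatedly averaging the samples within one bandwidth of it."""
    x = float(start)
    for _ in range(iters):
        window = points[np.abs(points - x) <= bandwidth]
        if window.size == 0:
            break
        x = window.mean()
    return x

# Two well-separated 1D clusters; starts near each converge to that
# cluster's mode, which is how mean shift groups pixels into segments.
samples = np.array([0.0, 0.2, -0.2, 10.0, 10.2, 9.8])
mode_a = mean_shift_mode(samples, start=0.5)    # ~0.0
mode_b = mean_shift_mode(samples, start=9.5)    # ~10.0
```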
In Chapter 6, we cover geometric alignment and camera calibration. We introduce the
basic techniques of feature-based alignment in Section 6.1 and show how this problem can
be solved using either linear or non-linear least squares, depending on the motion involved.
We also introduce additional concepts, such as uncertainty weighting and robust regression,
which are essential to making real-world systems work. Feature-based alignment is then used
as a building block for 3D pose estimation (extrinsic calibration) in Section 6.2 and camera
(intrinsic) calibration in Section 6.3. Chapter 6 also describes applications of these techniques
to photo alignment for flip-book animations, 3D pose estimation from a hand-held camera,
and single-view reconstruction of building models.
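When the motion model is linear in its parameters (e.g., a 2D affine transform), the feature-based alignment of Section 6.1 reduces to linear least squares. The sketch below assumes noise-free point correspondences and a hypothetical `fit_affine` helper; real systems add uncertainty weighting and robust regression, as noted above:

```python
import numpy as np

def fit_affine(src, dst):
    """Estimate the 2x3 affine transform A mapping src -> dst points,
    i.e. dst ~= A @ [x, y, 1], by linear least squares."""
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])          # n x 3 design matrix
    P, *_ = np.linalg.lstsq(X, dst, rcond=None)    # 3 x 2 solution
    return P.T                                      # 2 x 3 affine matrix

# Four correspondences generated by a known transform:
# uniform scale by 2, then translate by (1, -1).
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = 2.0 * src + np.array([1.0, -1.0])
A = fit_affine(src, dst)
# A recovers [[2, 0, 1], [0, 2, -1]].
```

Non-linear motion models (e.g., homographies parameterized projectively) require the iterative non-linear least squares mentioned in the text.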
Chapter 7 covers the topic of structure from motion, which involves the simultaneous
recovery of 3D camera motion and 3D scene structure from a collection of tracked 2D fea-
tures. This chapter begins with the easier problem of 3D point triangulation (Section 7.1),
which is the 3D reconstruction of points from matched features when the camera positions
are known. It then describes two-frame structure from motion (Section 7.2), for which al-
gebraic techniques exist, as well as robust sampling techniques such as RANSAC that can
discount erroneous feature matches. The second half of Chapter 7 describes techniques for
multi-frame structure from motion, including factorization (Section 7.3), bundle adjustment
(Section 7.4), and constrained motion and structure models (Section 7.5). It also presents
applications in view morphing, sparse 3D model construction, and match move.
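The triangulation problem of Section 7.1 has a standard linear (DLT) solution: stack one pair of constraint rows per view and take the null space via SVD. The camera matrices and point below are illustrative assumptions, not an example from the book:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views with
    known 3x4 camera matrices P1, P2 and image points x1, x2."""
    # Each view contributes two rows of the homogeneous system A X = 0.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # null-space vector (homogeneous 3D point)
    return X[:3] / X[3]

# Two hypothetical cameras: one at the origin, one translated along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.3, 2.0])
x1 = X_true[:2] / X_true[2]                        # projection in view 1
x2 = (X_true[:2] - np.array([1.0, 0.0])) / X_true[2]  # projection in view 2
X_rec = triangulate(P1, P2, x1, x2)                # recovers X_true
```

With noisy matches, this linear estimate is typically refined by the non-linear bundle adjustment of Section 7.4, and gross mismatches are rejected with RANSAC as described above.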
In Chapter 8, we go back to a topic that deals directly with image intensities (as op-
posed to feature tracks), namely dense intensity-based motion estimation (optical flow). We
start with the simplest possible motion models, translational motion (Section 8.1), and cover
topics such as hierarchical (coarse-to-fine) motion estimation, Fourier-based techniques, and
iterative refinement. We then present parametric motion models, which can be used to com-