18                                                                            1 Introduction


                                strictly necessary to read all of the material in sequence.
                                   Figure 1.11 shows a rough layout of the contents of this book. Since computer vision
                                involves going from images to a structural description of the scene (and computer graphics
                                the converse), I have positioned the chapters horizontally in terms of which major component
                                they address, in addition to vertically according to their dependence.
                                   Going from left to right, we see the major column headings as Images (which are 2D
                                in nature), Geometry (which encompasses 3D descriptions), and Photometry (which encom-
                                passes object appearance). (An alternative labeling for these latter two could also be shape
                                and appearance—see, e.g., Chapter 13 and Kang, Szeliski, and Anandan (2000).) Going
                                from top to bottom, we see increasing levels of modeling and abstraction, as well as tech-
                                niques that build on previously developed algorithms. Of course, this taxonomy should be
                                taken with a large grain of salt, as the processing and dependencies in this diagram are not
                                strictly sequential, and subtle additional dependencies and relationships also exist (e.g., some
                                recognition techniques make use of 3D information). The placement of topics along the hor-
                                izontal axis should also be taken lightly, as most vision algorithms involve mapping between
                                at least two different representations. 9
                                   Interspersed throughout the book are sample applications, which relate the algorithms
                                and mathematical material being presented in various chapters to useful, real-world applica-
                                tions. Many of these applications are also presented in the exercises sections, so that students
                                can write their own.
                                   At the end of each section, I provide a set of exercises that students can use to implement,
                                test, and refine the algorithms and techniques presented in that section. Some of the
                                exercises are suitable as written homework assignments, others as shorter one-week projects,
                                and still others as open-ended research problems that make for challenging final projects.
                                Motivated students who implement a reasonable subset of these exercises will, by the end of
                                the book, have a computer vision software library that can be used for a variety of interesting
                                tasks and projects.
                                   Because this book also serves as a reference, I try wherever possible to discuss which
                                techniques and algorithms work well in practice, and to provide up-to-date pointers to the latest research results in
                                the areas that I cover. The exercises can be used to build up your own personal library of self-
                                tested and validated vision algorithms, which is more worthwhile in the long term (assuming
                                you have the time) than simply pulling algorithms out of a library whose performance you do
                                not really understand.
                                   The book begins in Chapter 2 with a review of the image formation processes that create
                                the images that we see and capture. Understanding these processes is fundamental if you want
                                to take a scientific (model-based) approach to computer vision. Students who are eager to
                                just start implementing algorithms (or courses that have limited time) can skip ahead to the
                                next chapter and dip into this material later. In Chapter 2, we break down image formation
                                into three major components. Geometric image formation (Section 2.1) deals with points,
                                lines, and planes, and how these are mapped onto images using projective geometry and other
                                models (including radial lens distortion). Photometric image formation (Section 2.2) covers
                                radiometry, which describes how light interacts with surfaces in the world, and optics, which
                                projects light onto the sensor plane. Finally, Section 2.3 covers how sensors work, including

                                  9  For an interesting comparison with what is known about the human visual system, e.g., the largely parallel what
                                and where pathways, see some textbooks on human perception (Palmer 1999; Livingstone 2008).