problems whose solutions are still active research topics. Wherever possible, I encourage
students to try their algorithms on their own personal photographs, since this better motivates
them, often leads to creative variants on the problems, and better acquaints them with the
variety and complexity of real-world imagery.
In formulating and solving computer vision problems, I have often found it useful to draw
inspiration from three high-level approaches:
• Scientific: build detailed models of the image formation process and develop mathematical techniques to invert these in order to recover the quantities of interest (where necessary, making simplifying assumptions to make the mathematics more tractable).
• Statistical: use probabilistic models to quantify the prior likelihood of your unknowns
and the noisy measurement processes that produce the input images, then infer the best
possible estimates of your desired quantities and analyze their resulting uncertainties.
The inference algorithms used are often closely related to the optimization techniques
used to invert the (scientific) image formation processes.
• Engineering: develop techniques that are simple to describe and implement but that
are also known to work well in practice. Test these techniques to understand their
limitations and failure modes, as well as their expected computational costs (run-time
performance).
These three approaches build on each other and are used throughout the book.
My personal research and development philosophy (and hence the exercises in the book)
has a strong emphasis on testing algorithms. It’s too easy in computer vision to develop an
algorithm that does something plausible on a few images rather than something correct. The
best way to validate your algorithms is to use a three-part strategy.
First, test your algorithm on clean synthetic data, for which the exact results are known.
Second, add noise to the data and evaluate how the performance degrades as a function of
noise level. Finally, test the algorithm on real-world data, preferably drawn from a wide
variety of sources, such as photos found on the Web. Only then can you truly know if your
algorithm can deal with real-world complexity, i.e., images that do not fit some simplified
model or assumptions.
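To make this three-part strategy concrete, here is a minimal sketch in Python (using only NumPy) of such a test harness, with a toy estimator (ordinary least-squares line fitting) standing in for whatever algorithm you are validating; the function and parameter names are illustrative and are not taken from the book's software.

    # Sketch of the three-part validation strategy: clean synthetic data,
    # increasing noise levels, then real data (for which no ground truth exists).
    import numpy as np

    def fit_line(x, y):
        """Estimate slope and intercept by ordinary least squares."""
        A = np.column_stack([x, np.ones_like(x)])
        params, *_ = np.linalg.lstsq(A, y, rcond=None)
        return params  # (slope, intercept)

    def synthetic_data(n=100, slope=2.0, intercept=-1.0, noise_sigma=0.0, rng=None):
        """Generate points on a known line, optionally corrupted by Gaussian noise."""
        rng = rng or np.random.default_rng(0)
        x = np.linspace(0.0, 1.0, n)
        y = slope * x + intercept + rng.normal(0.0, noise_sigma, size=n)
        return x, y, np.array([slope, intercept])

    # 1. Clean synthetic data: the estimate should match the known ground truth
    #    up to floating-point error.
    x, y, truth = synthetic_data(noise_sigma=0.0)
    assert np.allclose(fit_line(x, y), truth, atol=1e-8)

    # 2. Add noise and measure how the error degrades with the noise level.
    for sigma in [0.01, 0.05, 0.1, 0.5]:
        x, y, truth = synthetic_data(noise_sigma=sigma)
        err = np.linalg.norm(fit_line(x, y) - truth)
        print(f"noise sigma = {sigma:4.2f}  ->  parameter error = {err:.4f}")

    # 3. Real-world data has no ground truth, so inspect residuals or compare
    #    against an independent measurement or a widely used benchmark instead.

The same pattern scales up to real vision algorithms: replace the toy estimator with your own routine, the synthetic line with rendered or warped images whose parameters you control, and the noise sweep with whatever degradations (sensor noise, blur, compression) your application must tolerate.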
In order to help students in this process, this book comes with a large amount of supplementary material, which can be found on the book’s Web site http://szeliski.org/Book. This
material, which is described in Appendix C, includes:
• pointers to commonly used data sets for the problems, which can be found on the Web
• pointers to software libraries, which can help students get started with basic tasks such
as reading and writing images or creating and manipulating them
• slide sets corresponding to the material covered in this book
• a BibTeX bibliography of the papers cited in this book.
The latter two resources may be of more interest to instructors and researchers publishing
new papers in this field, but they will probably come in handy for regular students as well.
Some of the software libraries contain implementations of a wide variety of computer vision
algorithms, which can enable you to tackle more ambitious projects (with your instructor’s
consent).