misperception that vision should be easy dates back to the early days of artificial intelligence
(see Section 1.2), when it was initially believed that the cognitive (logic proving and planning)
parts of intelligence were intrinsically more difficult than the perceptual components
(Boden 2006).
The good news is that computer vision is being used today in a wide variety of real-world
applications, which include:
• Optical character recognition (OCR): reading handwritten postal codes on letters
(Figure 1.4a) and automatic number plate recognition (ANPR);
• Machine inspection: rapid parts inspection for quality assurance using stereo vision
with specialized illumination to measure tolerances on aircraft wings or auto body parts
(Figure 1.4b) or looking for defects in steel castings using X-ray vision;
• Retail: object recognition for automated checkout lanes (Figure 1.4c);
• 3D model building (photogrammetry): fully automated construction of 3D models
from aerial photographs used in systems such as Bing Maps;
• Medical imaging: registering pre-operative and intra-operative imagery (Figure 1.4d)
or performing long-term studies of people’s brain morphology as they age;
• Automotive safety: detecting unexpected obstacles such as pedestrians on the street,
under conditions where active vision techniques such as radar or lidar do not work
well (Figure 1.4e; see also Miller, Campbell, Huttenlocher et al. (2008); Montemerlo,
Becker, Bhat et al. (2008); Urmson, Anhalt, Bagnell et al. (2008) for examples of fully
automated driving);
• Match move: merging computer-generated imagery (CGI) with live action footage by
tracking feature points in the source video to estimate the 3D camera motion and shape
of the environment. Such techniques are widely used in Hollywood (e.g., in movies
such as Jurassic Park) (Roble 1999; Roble and Zafar 2009); they also require the use of
precise matting to insert new elements between foreground and background elements
(Chuang, Agarwala, Curless et al. 2002);
• Motion capture (mocap): using retro-reflective markers viewed from multiple cameras
or other vision-based techniques to capture actors for computer animation;
• Surveillance: monitoring for intruders, analyzing highway traffic (Figure 1.4f), and
monitoring pools for drowning victims;
• Fingerprint recognition and biometrics: for automatic access authentication as well
as forensic applications.
David Lowe’s Web site of industrial vision applications (http://www.cs.ubc.ca/spider/lowe/vision.html)
lists many other interesting industrial applications of computer vision. While the
above applications are all extremely important, they mostly pertain to fairly specialized kinds
of imagery and narrow domains.
In this book, we focus more on broader consumer-level applications, such as fun things
you can do with your own personal photographs and video. These include: