               misperception that vision should be easy dates back to the early days of artificial intelligence
               (see Section 1.2), when it was initially believed that the cognitive (logic proving and
               planning) parts of intelligence were intrinsically more difficult than the perceptual components
               (Boden 2006).
                  The good news is that computer vision is being used today in a wide variety of real-world
               applications, which include:

                  • Optical character recognition (OCR): reading handwritten postal codes on letters
                    (Figure 1.4a) and automatic number plate recognition (ANPR);
                  • Machine inspection: rapid parts inspection for quality assurance using stereo vision
                    with specialized illumination to measure tolerances on aircraft wings or auto body parts
                    (Figure 1.4b) or looking for defects in steel castings using X-ray vision;
                  • Retail: object recognition for automated checkout lanes (Figure 1.4c);

                  • 3D model building (photogrammetry): fully automated construction of 3D models
                    from aerial photographs used in systems such as Bing Maps;

                  • Medical imaging: registering pre-operative and intra-operative imagery (Figure 1.4d)
                    or performing long-term studies of people’s brain morphology as they age;

                  • Automotive safety: detecting unexpected obstacles such as pedestrians on the street,
                    under conditions where active vision techniques such as radar or lidar do not work
                    well (Figure 1.4e; see also Miller, Campbell, Huttenlocher et al. (2008); Montemerlo,
                    Becker, Bhat et al. (2008); Urmson, Anhalt, Bagnell et al. (2008) for examples of fully
                    automated driving);

                  • Match move: merging computer-generated imagery (CGI) with live action footage by
                    tracking feature points in the source video to estimate the 3D camera motion and shape
                    of the environment. Such techniques are widely used in Hollywood (e.g., in movies
                    such as Jurassic Park) (Roble 1999; Roble and Zafar 2009); they also require the use of
                    precise matting to insert new elements between foreground and background elements
                    (Chuang, Agarwala, Curless et al. 2002);

                  • Motion capture (mocap): using retro-reflective markers viewed from multiple cameras
                    or other vision-based techniques to capture actors for computer animation;

                  • Surveillance: monitoring for intruders, analyzing highway traffic (Figure 1.4f), and
                    monitoring pools for drowning victims;
                  • Fingerprint recognition and biometrics: for automatic access authentication as well
                    as forensic applications.
               David Lowe’s Web site of industrial vision applications
               (http://www.cs.ubc.ca/spider/lowe/vision.html) lists many other interesting
               industrial applications of computer vision. While the
               above applications are all extremely important, they mostly pertain to fairly specialized kinds
               of imagery and narrow domains.
                  In this book, we focus more on broader consumer-level applications, such as fun things
               you can do with your own personal photographs and video. These include: