
Section 9.6  Notes  288


                            algorithms that solve the relevant two-label Markov random field (see Vogler et al.
                            (2000); Boykov and Jolly (2001); or Boykov and Funka Lea (2006)). There are now
                            many important variants. Grabcut is due to Rother et al. (2004); Objcut uses prior
                            information about object shapes to improve the cut (Kumar et al. 2010); and see
                            also Duchenne et al. (2008). There are numerous matting methods, which Wang
                            and Cohen (2007) survey in detail.
                                 The normalized cuts formalism is due to Shi and Malik (1997, 2000).
                            Variants include applications to motion segmentation (Shi and Malik 1998a) and
                            methods for deducing similarity metrics from outputs (Shi and Malik 1998b). There
                            are numerous alternate criteria (e.g., Cox et al. (1996), Perona and Freeman (1998)).
                                 There is a considerable early literature on the evaluation of segmentation.
                            Useful references include: Zhang (1996a); Zhang (1997); Beauchemin and Thom-
                            son (1997); Zhang and Gerbrands (1994); Correia and Pereira (2003); Lei and
                            Udupa (2003); Warfield et al. (2004); Paglieroni (2004); Cardoso and Corte Real
                            (2005); Cardoso and Corte Real (2006); Cardoso et al. (2009); Carleer et al. (2005);
                            and Crum et al. (2006). Evaluation is easier in the context of a specific task; pa-
                            pers dealing with assorted tasks include Yasnoff et al. (1977), Hartley et al. (1982),
                            Zhang (1996b), and Ranade and Prewitt (1980). Martin et al. (2001) introduced
                            the Berkeley segmentation dataset, which is now a standard for evaluation, but
                            there are a variety of criteria one can use. Unnikrishnan et al. (2007) use the Rand
                            index; Polak et al. (2009) use multiple object boundaries and give a
                            detailed evaluation of four segmentation algorithms; Hanbury and Stottinger (2008)
                            compare metrics; and Zhang et al. (2008) give a recent survey of evaluation meth-
                            ods. Good image segments are most likely internally coherent, but making that
                            idea useful is hard (Bagon et al. 2008).
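As a rough illustration of one such criterion, the plain Rand index between two labelings of the same pixels is the fraction of pixel pairs on which the labelings agree (both place the pair in one segment, or both separate it). The sketch below assumes each segmentation is flattened to a list of integer labels; the name `rand_index` is hypothetical, and this is the basic index, not necessarily the variant used in the papers cited above.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Plain Rand index between two labelings of the same pixels.

    labels_a, labels_b: equal-length sequences of segment labels, one per
    pixel (an assumed input format for this sketch).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same pixels")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two pixels to form a pair")

    total_pairs = comb(n, 2)
    # Pairs that share a segment in both labelings.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs that share a segment in each labeling taken alone.
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Pairs separated in both labelings, by inclusion-exclusion.
    apart_both = total_pairs - same_a - same_b + same_both
    return (same_both + apart_both) / total_pairs
```

The index depends only on how pixels are grouped, not on the label values, so two segmentations that differ only by renaming segments score 1.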
                                 Since it is hard to get a segmentation right, Russell et al. (2006) suggest
                            working with multiple segmentations and then choosing good pieces. This idea is
                            now very influential. Multiple segmentations have been used to improve estimates
                            of support (Malisiewicz and Efros 2007), and to drive recognition (Pantofaru et
                            al. 2008; Malisiewicz and Efros 2008). One could organize the multiple segments
                            into an inclusion hierarchy (Tabb and Ahuja 1997); the hierarchies yield object
                            models (Todorovic and Ahuja 2008b), and can be matched (Todorovic and Ahuja
                            2008a).
                                 We haven’t discussed some aspects of perceptual organization in great detail
                            mainly because our emphasis is on exposition rather than historical accuracy, and
                            these methods follow from the unified view. For example, there is a long thread of
                            literature on clustering image edge points or line segments into configurations that
                            are unlikely to have arisen by accident. We cover some of these ideas in the following
                            chapter, but also draw the reader's attention to Amir and Lindenbaum (1996),
                            Huttenlocher and Wayner (1992), Lowe (1985), Mohan and Nevatia (1992), Sarkar
                            and Boyer (1993), and to Sarkar and Boyer (1994). In building user interfaces, it can
                            (as we hinted before) be helpful to know what is perceptually salient (e.g., Saund
                            and Moran (1995)).