


scanned and final images,

    E_d(i, j) = w δ(f(i, j), d(i, j))                                        (3.110)

and the smoothness penalty to reflect the agreement between neighboring pixels

    E_p(i, j) = E_x(i, j) + E_y(i, j) = s δ(f(i, j), f(i+1, j)) + s δ(f(i, j), f(i, j+1)).   (3.111)
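To make the formulation concrete, here is a short NumPy sketch that evaluates this energy for a binary labeling f of an observed binary image d; the function name mrf_energy and the default weights are illustrative assumptions, not notation from the text:

    import numpy as np

    def mrf_energy(f, d, w=1.0, s=0.5):
        """Evaluate the binary MRF energy of (3.110)-(3.111): a data term
        that charges w for each pixel where f disagrees with d, plus a
        smoothness term that charges s for each unlike neighboring pair."""
        data = w * np.sum(f != d)                     # sum of E_d(i, j)
        smooth = s * np.sum(f[1:, :] != f[:-1, :])    # neighbor pairs, one axis
        smooth += s * np.sum(f[:, 1:] != f[:, :-1])   # neighbor pairs, other axis
        return data + smooth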

   Once we have formulated the energy, how do we minimize it? The simplest approach is to perform gradient descent, flipping one state at a time if it produces a lower energy. This approach is known as contextual classification (Kittler and Föglein 1984), iterated conditional modes (ICM) (Besag 1986), or highest confidence first (HCF) (Chou and Brown 1990) if the pixel with the largest energy decrease is selected first.
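As an illustration of this greedy scheme, the following sketch implements plain raster-scan ICM (without HCF's priority ordering); the function name and parameter defaults are illustrative:

    import numpy as np

    def icm(d, w=1.0, s=0.5, max_sweeps=20):
        """Iterated conditional modes for the energy of (3.110)-(3.111):
        sweep over the pixels, flipping each one whenever the flip
        strictly lowers the energy, until no pixel changes."""
        f = d.copy()
        H, W = f.shape

        def local_energy(i, j, label):
            # data term plus smoothness against the 4-neighborhood
            e = w * (label != d[i, j])
            for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                if 0 <= ni < H and 0 <= nj < W:
                    e += s * (label != f[ni, nj])
            return e

        for _ in range(max_sweeps):
            changed = False
            for i in range(H):
                for j in range(W):
                    new = 1 - f[i, j]
                    # flip only if it strictly lowers the energy
                    if local_energy(i, j, new) < local_energy(i, j, f[i, j]):
                        f[i, j] = new
                        changed = True
            if not changed:
                break  # local minimum reached
        return f

Because every accepted move strictly lowers the energy, the sweeps terminate, but only at a local minimum.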
   Unfortunately, these downhill methods tend to get easily stuck in local minima. An alternative approach is to add some randomness to the process, which is known as stochastic gradient descent (Metropolis, Rosenbluth, Rosenbluth et al. 1953; Geman and Geman 1984). When the amount of noise is decreased over time, this technique is known as simulated annealing (Kirkpatrick, Gelatt, and Vecchi 1983; Carnevali, Coletti, and Patarnello 1985; Wolberg and Pavlidis 1985; Swendsen and Wang 1987) and was first popularized in computer vision by Geman and Geman (1984) and later applied to stereo matching by Barnard (1989), among others.
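A minimal sketch of the annealing variant, again with illustrative names and an illustrative schedule (a Metropolis acceptance rule with geometric cooling), might look as follows:

    import numpy as np

    def simulated_annealing(d, w=1.0, s=0.5, T0=2.0, cooling=0.95,
                            sweeps=200, seed=0):
        """Anneal the energy of (3.110)-(3.111): propose single-pixel
        flips, always accept downhill moves, accept uphill moves with
        probability exp(-dE / T), and shrink T after every sweep."""
        rng = np.random.default_rng(seed)
        f = d.copy()
        H, W = f.shape
        T = T0
        for _ in range(sweeps):
            for i in range(H):
                for j in range(W):
                    new = 1 - f[i, j]
                    # energy change caused by flipping pixel (i, j)
                    dE = w * (int(new != d[i, j]) - int(f[i, j] != d[i, j]))
                    for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                        if 0 <= ni < H and 0 <= nj < W:
                            dE += s * (int(new != f[ni, nj]) -
                                       int(f[i, j] != f[ni, nj]))
                    if dE <= 0 or rng.random() < np.exp(-dE / T):
                        f[i, j] = new
            T *= cooling  # geometric cooling schedule
        return f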
   Even this technique, however, does not perform that well (Boykov, Veksler, and Zabih 2001). For binary images, a much better technique, introduced to the computer vision community by Boykov, Veksler, and Zabih (2001), is to reformulate the energy minimization as a max-flow/min-cut graph optimization problem (Greig, Porteous, and Seheult 1989). This technique has informally come to be known as graph cuts in the computer vision community (Boykov and Kolmogorov 2010). For simple energy functions, e.g., those where the penalty for non-identical neighboring pixels is a constant, this algorithm is guaranteed to produce the global minimum. Kolmogorov and Zabih (2004) formally characterize the class of binary energy potentials (regularity conditions) for which these results hold, while newer work by Komodakis, Tziritas, and Paragios (2008) and Rother, Kolmogorov, Lempitsky et al. (2007) provides good algorithms for the cases when they do not.
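To make the reduction concrete, here is a self-contained sketch in the spirit of the Greig, Porteous, and Seheult (1989) construction for the energy of (3.110)-(3.111): each pixel becomes a graph node, t-links to the source and sink carry the data costs of the two labels, and n-links of weight s between neighbors carry the smoothness cost. For brevity it uses a plain Edmonds-Karp max-flow, which is far slower than the specialized algorithms used in practice; all names and parameters are illustrative:

    from collections import deque
    import numpy as np

    def binary_graphcut(d, w=1.0, s=0.5):
        """Globally minimize the energy of (3.110)-(3.111) via a minimum
        s-t cut: t-link capacities encode the data term, n-link capacities
        the smoothness term, and the capacity of the minimum cut equals
        the energy of the labeling it induces."""
        H, W = d.shape
        S, T = H * W, H * W + 1           # source and sink node ids
        cap = [dict() for _ in range(H * W + 2)]

        def add_edge(u, v, c):
            cap[u][v] = cap[u].get(v, 0.0) + c
            cap[v].setdefault(u, 0.0)     # residual (reverse) arc

        for i in range(H):
            for j in range(W):
                p = i * W + j
                add_edge(S, p, w * (d[i, j] != 0))  # paid if p gets label 0
                add_edge(p, T, w * (d[i, j] != 1))  # paid if p gets label 1
                # n-links: paid when the two neighbors end up with unlike labels
                if i + 1 < H:
                    q = (i + 1) * W + j
                    add_edge(p, q, s); add_edge(q, p, s)
                if j + 1 < W:
                    q = i * W + j + 1
                    add_edge(p, q, s); add_edge(q, p, s)

        while True:  # Edmonds-Karp: repeatedly augment along shortest paths
            parent = {S: None}
            queue = deque([S])
            while queue and T not in parent:
                u = queue.popleft()
                for v, c in cap[u].items():
                    if c > 0 and v not in parent:
                        parent[v] = u
                        queue.append(v)
            if T not in parent:
                break  # no augmenting path: flow is maximal, cut is minimal
            path, v = [], T
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            bottleneck = min(cap[u][v] for u, v in path)
            for u, v in path:
                cap[u][v] -= bottleneck
                cap[v][u] += bottleneck

        # pixels still reachable from the source take label 1
        f = np.zeros((H, W), dtype=int)
        for i in range(H):
            for j in range(W):
                f[i, j] = int((i * W + j) in parent)
        return f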
   In addition to the above-mentioned techniques, a number of other optimization approaches have been developed for MRF energy minimization, such as (loopy) belief propagation and dynamic programming (for one-dimensional problems). These are discussed in more detail in Appendix B.5, as well as in the comparative survey paper by Szeliski, Zabih, Scharstein et al. (2008).


               Ordinal-valued MRFs

In addition to binary images, Markov random fields can be applied to ordinal-valued labels such as grayscale images or depth maps. The term "ordinal" indicates that the labels have an implied ordering, e.g., that higher values are lighter pixels. In the next section, we look at unordered labels, such as source image labels for image compositing.
   It is common to extend the binary data and smoothness prior terms as

    E_d(i, j) = w(i, j) ρ_d(f(i, j) - d(i, j))                               (3.112)