scanned and final images,
E_d(i, j) = w \, \delta(f(i, j), d(i, j))    (3.110)
and the smoothness penalty to reflect the agreement between neighboring pixels
E_p(i, j) = E_x(i, j) + E_y(i, j) = s \, \delta(f(i, j), f(i+1, j)) + s \, \delta(f(i, j), f(i, j+1)).    (3.111)
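To make this concrete, the following minimal sketch (in Python/NumPy; the scalar weights w and s are illustrative, and δ is assumed to charge a unit penalty when its two arguments differ) evaluates the total energy of a binary labeling f against an observed image d:

```python
import numpy as np

def binary_mrf_energy(f, d, w=1.0, s=1.0):
    """Total energy of a binary labeling f given an observed image d.

    Sketch of Equations (3.110)-(3.111): delta(a, b) is assumed to
    charge a unit penalty when a != b; w and s are illustrative
    scalar data and smoothness weights.
    """
    E_d = w * np.sum(f != d)                 # data term, Eq. (3.110)
    E_x = s * np.sum(f[1:, :] != f[:-1, :])  # neighbors along i
    E_y = s * np.sum(f[:, 1:] != f[:, :-1])  # neighbors along j
    return E_d + E_x + E_y                   # smoothness term, Eq. (3.111)
```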
Once we have formulated the energy, how do we minimize it? The simplest approach is to perform gradient descent, flipping one state at a time if it produces a lower energy. This approach is known as contextual classification (Kittler and Föglein 1984), iterated conditional modes (ICM) (Besag 1986), or highest confidence first (HCF) (Chou and Brown 1990) if the pixel with the largest energy decrease is selected first.
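A minimal sketch of one such sweep (Python, same assumed unit-penalty δ and illustrative weights as above; repeating sweeps until no pixel changes yields the local minimum described in the text):

```python
def icm_sweep(f, d, w=1.0, s=1.0):
    """One ICM sweep over a binary labeling f (updated in place).

    Each pixel is set to the label that minimizes its local energy
    given its current neighbors; same assumed delta and weights as
    in the energy sketch above.
    """
    H, W = f.shape
    for i in range(H):
        for j in range(W):
            def local_energy(label):
                e = w * (label != d[i, j])  # data term
                for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                    if 0 <= ni < H and 0 <= nj < W:  # 4-neighborhood
                        e += s * (label != f[ni, nj])
                return e
            f[i, j] = min((0, 1), key=local_energy)
    return f
```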
Unfortunately, these downhill methods tend to get easily stuck in local minima. An alternative approach is to add some randomness to the process, which is known as stochastic gradient descent (Metropolis, Rosenbluth, Rosenbluth et al. 1953; Geman and Geman 1984). When the amount of noise is decreased over time, this technique is known as simulated annealing (Kirkpatrick, Gelatt, and Vecchi 1983; Carnevali, Coletti, and Patarnello 1985; Wolberg and Pavlidis 1985; Swendsen and Wang 1987) and was first popularized in computer vision by Geman and Geman (1984) and later applied to stereo matching by Barnard (1989), among others.
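A Metropolis-style annealing step is equally short to sketch. Here the starting temperature, the geometric cooling factor, and the iteration count are illustrative choices, not tuned values:

```python
import numpy as np

def anneal(f, d, w=1.0, s=1.0, T0=4.0, cooling=0.9999,
           n_iter=200_000, rng=None):
    """Simulated annealing sketch: propose single-pixel flips, accept
    uphill moves with probability exp(-dE / T), and cool T over time."""
    rng = np.random.default_rng() if rng is None else rng
    H, W = f.shape
    T = T0
    for _ in range(n_iter):
        i, j = int(rng.integers(H)), int(rng.integers(W))
        old, new = f[i, j], 1 - f[i, j]
        # Energy change of the flip: data term plus 4-neighbor smoothness.
        dE = w * (int(new != d[i, j]) - int(old != d[i, j]))
        for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
            if 0 <= ni < H and 0 <= nj < W:
                dE += s * (int(new != f[ni, nj]) - int(old != f[ni, nj]))
        # Metropolis rule: always take downhill moves, sometimes uphill.
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            f[i, j] = new
        T *= cooling  # geometric cooling schedule
    return f
```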
Even simulated annealing, however, does not perform that well (Boykov, Veksler, and Zabih 2001). For binary images, a much better technique, introduced to the computer vision community by Boykov, Veksler, and Zabih (2001), is to reformulate the energy minimization as a max-flow/min-cut graph optimization problem (Greig, Porteous, and Seheult 1989). This technique has informally come to be known as graph cuts in the computer vision community (Boykov and Kolmogorov 2010). For simple energy functions, e.g., those where the penalty for non-identical neighboring pixels is a constant, this algorithm is guaranteed to produce the global minimum. Kolmogorov and Zabih (2004) formally characterize the class of binary energy potentials (regularity conditions) for which these results hold, while newer work by Komodakis, Tziritas, and Paragios (2008) and Rother, Kolmogorov, Lempitsky et al. (2007) provides good algorithms for the cases when they do not.
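The construction underlying this result can be sketched compactly: each pixel gets t-links to the source and sink whose capacities encode its two data costs, plus n-links of capacity s to its neighbors, so that every s-t cut pays exactly the energy of the corresponding labeling. The version below uses networkx purely for illustration (an assumed dependency; practical graph-cut codes use specialized max-flow solvers):

```python
import networkx as nx
import numpy as np

def graph_cut_binary(d, w=1.0, s=1.0):
    """Exact minimizer of the binary MRF energy above via min-cut.

    Pixels ending on the source side of the cut get label 0, those
    on the sink side get label 1; cutting a t-link pays the data
    cost of the label the pixel receives.
    """
    H, W = d.shape
    G = nx.DiGraph()
    src, snk = "s", "t"
    for i in range(H):
        for j in range(W):
            p = (i, j)
            # t-links: cutting s->p pays the cost of assigning label 1,
            # cutting p->t pays the cost of assigning label 0.
            G.add_edge(src, p, capacity=w * float(d[i, j] != 1))
            G.add_edge(p, snk, capacity=w * float(d[i, j] != 0))
            # n-links: constant penalty s for disagreeing neighbors.
            for q in ((i + 1, j), (i, j + 1)):
                if q[0] < H and q[1] < W:
                    G.add_edge(p, q, capacity=s)
                    G.add_edge(q, p, capacity=s)
    cut_value, (source_side, _) = nx.minimum_cut(G, src, snk)
    f = np.ones((H, W), dtype=int)
    for node in source_side:
        if node != src:
            f[node] = 0
    return f
```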
In addition to the above-mentioned techniques, a number of other optimization approaches
have been developed for MRF energy minimization, such as (loopy) belief propagation and
dynamic programming (for one-dimensional problems). These are discussed in more detail
in Appendix B.5 as well as the comparative survey paper by Szeliski, Zabih, Scharstein et al.
(2008).
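For the one-dimensional case, dynamic programming is easy to state: sweep along the chain, keeping for each label the cheapest cost of any labeling of the prefix that ends in that label, then backtrack. A minimal sketch (assuming N sites, L discrete labels, an N x L array of data costs D, and an L x L pairwise cost table V):

```python
import numpy as np

def dp_chain(D, V):
    """Exact MAP labeling of a 1-D chain MRF by dynamic programming.

    D: (N, L) array of data costs; V: (L, L) array of pairwise costs.
    """
    N, L = D.shape
    cost = D[0].copy()                   # best prefix cost per end label
    back = np.zeros((N, L), dtype=int)   # argmin pointers for backtracking
    for n in range(1, N):
        # total[a, b]: best cost ending at n-1 with label a, plus V(a, b).
        total = cost[:, None] + V
        back[n] = np.argmin(total, axis=0)
        cost = total[back[n], np.arange(L)] + D[n]
    # Backtrack the optimal labeling from the cheapest final label.
    f = np.empty(N, dtype=int)
    f[-1] = int(np.argmin(cost))
    for n in range(N - 1, 0, -1):
        f[n - 1] = back[n, f[n]]
    return f
```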
Ordinal-valued MRFs
In addition to binary images, Markov random fields can be applied to ordinal-valued labels
such as grayscale images or depth maps. The term "ordinal" indicates that the labels have an
implied ordering, e.g., that higher values are lighter pixels. In the next section, we look at
unordered labels, such as source image labels for image compositing.
In this case, it is common to extend the binary data and smoothness prior terms to
E_d(i, j) = w(i, j) \, \rho_d(f(i, j) - d(i, j))    (3.112)