


scanned and final images,

    E_d(i, j) = w δ(f(i, j), d(i, j))                                        (3.110)

and the smoothness penalty to reflect the agreement between neighboring pixels

    E_p(i, j) = E_x(i, j) + E_y(i, j) = s δ(f(i, j), f(i+1, j)) + s δ(f(i, j), f(i, j+1)).   (3.111)
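To make the formulation concrete, here is a short NumPy sketch that evaluates this energy for a binary labeling f of an observed binary image d; the function name mrf_energy and the default weights are illustrative assumptions, not notation from the text:

    import numpy as np

    def mrf_energy(f, d, w=1.0, s=0.5):
        """Evaluate the binary MRF energy of (3.110)-(3.111): a data term
        that charges w for each pixel where f disagrees with d, plus a
        smoothness term that charges s for each unlike neighboring pair."""
        data = w * np.sum(f != d)                     # sum of E_d(i, j)
        smooth = s * np.sum(f[1:, :] != f[:-1, :])    # neighbor pairs, one axis
        smooth += s * np.sum(f[:, 1:] != f[:, :-1])   # neighbor pairs, other axis
        return data + smooth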

   Once we have formulated the energy, how do we minimize it? The simplest approach is to perform gradient descent, flipping one state at a time if it produces a lower energy. This approach is known as contextual classification (Kittler and Föglein 1984), iterated conditional modes (ICM) (Besag 1986), or highest confidence first (HCF) (Chou and Brown 1990) if the pixel with the largest energy decrease is selected first.
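As an illustration of this greedy scheme, the following sketch implements plain raster-scan ICM (without HCF's priority ordering); the function name and parameter defaults are illustrative:

    import numpy as np

    def icm(d, w=1.0, s=0.5, max_sweeps=20):
        """Iterated conditional modes for the energy of (3.110)-(3.111):
        sweep over the pixels, flipping each one whenever the flip
        strictly lowers the energy, until no pixel changes."""
        f = d.copy()
        H, W = f.shape

        def local_energy(i, j, label):
            # data term plus smoothness against the 4-neighborhood
            e = w * (label != d[i, j])
            for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                if 0 <= ni < H and 0 <= nj < W:
                    e += s * (label != f[ni, nj])
            return e

        for _ in range(max_sweeps):
            changed = False
            for i in range(H):
                for j in range(W):
                    new = 1 - f[i, j]
                    # flip only if it strictly lowers the energy
                    if local_energy(i, j, new) < local_energy(i, j, f[i, j]):
                        f[i, j] = new
                        changed = True
            if not changed:
                break  # local minimum reached
        return f

Because every accepted move strictly lowers the energy, the sweeps terminate, but only at a local minimum.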
   Unfortunately, these downhill methods tend to get easily stuck in local minima. An alternative approach is to add some randomness to the process, which is known as stochastic gradient descent (Metropolis, Rosenbluth, Rosenbluth et al. 1953; Geman and Geman 1984). When the amount of noise is decreased over time, this technique is known as simulated annealing (Kirkpatrick, Gelatt, and Vecchi 1983; Carnevali, Coletti, and Patarnello 1985; Wolberg and Pavlidis 1985; Swendsen and Wang 1987) and was first popularized in computer vision by Geman and Geman (1984) and later applied to stereo matching by Barnard (1989), among others.
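A minimal sketch of the annealing variant, again with illustrative names and an illustrative schedule (a Metropolis acceptance rule with geometric cooling), might look as follows:

    import numpy as np

    def simulated_annealing(d, w=1.0, s=0.5, T0=2.0, cooling=0.95,
                            sweeps=200, seed=0):
        """Anneal the energy of (3.110)-(3.111): propose single-pixel
        flips, always accept downhill moves, accept uphill moves with
        probability exp(-dE / T), and shrink T after every sweep."""
        rng = np.random.default_rng(seed)
        f = d.copy()
        H, W = f.shape
        T = T0
        for _ in range(sweeps):
            for i in range(H):
                for j in range(W):
                    new = 1 - f[i, j]
                    # energy change caused by flipping pixel (i, j)
                    dE = w * (int(new != d[i, j]) - int(f[i, j] != d[i, j]))
                    for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                        if 0 <= ni < H and 0 <= nj < W:
                            dE += s * (int(new != f[ni, nj]) -
                                       int(f[i, j] != f[ni, nj]))
                    if dE <= 0 or rng.random() < np.exp(-dE / T):
                        f[i, j] = new
            T *= cooling  # geometric cooling schedule
        return f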
   Even this technique, however, does not perform that well (Boykov, Veksler, and Zabih 2001). For binary images, a much better technique, introduced to the computer vision community by Boykov, Veksler, and Zabih (2001), is to reformulate the energy minimization as a max-flow/min-cut graph optimization problem (Greig, Porteous, and Seheult 1989). This technique has informally come to be known as graph cuts in the computer vision community (Boykov and Kolmogorov 2010). For simple energy functions, e.g., those where the penalty for non-identical neighboring pixels is a constant, this algorithm is guaranteed to produce the global minimum. Kolmogorov and Zabih (2004) formally characterize the class of binary energy potentials (regularity conditions) for which these results hold, while newer work by Komodakis, Tziritas, and Paragios (2008) and Rother, Kolmogorov, Lempitsky et al. (2007) provides good algorithms for the cases when they do not.
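To make the reduction concrete, here is a self-contained sketch in the spirit of the Greig, Porteous, and Seheult (1989) construction for the energy of (3.110)-(3.111): each pixel becomes a graph node, t-links to the source and sink carry the data costs of the two labels, and n-links of weight s between neighbors carry the smoothness cost. For brevity it uses a plain Edmonds-Karp max-flow, which is far slower than the specialized algorithms used in practice; all names and parameters are illustrative:

    from collections import deque
    import numpy as np

    def binary_graphcut(d, w=1.0, s=0.5):
        """Globally minimize the energy of (3.110)-(3.111) via a minimum
        s-t cut: t-link capacities encode the data term, n-link capacities
        the smoothness term, and the capacity of the minimum cut equals
        the energy of the labeling it induces."""
        H, W = d.shape
        S, T = H * W, H * W + 1           # source and sink node ids
        cap = [dict() for _ in range(H * W + 2)]

        def add_edge(u, v, c):
            cap[u][v] = cap[u].get(v, 0.0) + c
            cap[v].setdefault(u, 0.0)     # residual (reverse) arc

        for i in range(H):
            for j in range(W):
                p = i * W + j
                add_edge(S, p, w * (d[i, j] != 0))  # paid if p gets label 0
                add_edge(p, T, w * (d[i, j] != 1))  # paid if p gets label 1
                # n-links: paid when the two neighbors end up with unlike labels
                if i + 1 < H:
                    q = (i + 1) * W + j
                    add_edge(p, q, s); add_edge(q, p, s)
                if j + 1 < W:
                    q = i * W + j + 1
                    add_edge(p, q, s); add_edge(q, p, s)

        while True:  # Edmonds-Karp: repeatedly augment along shortest paths
            parent = {S: None}
            queue = deque([S])
            while queue and T not in parent:
                u = queue.popleft()
                for v, c in cap[u].items():
                    if c > 0 and v not in parent:
                        parent[v] = u
                        queue.append(v)
            if T not in parent:
                break  # no augmenting path: flow is maximal, cut is minimal
            path, v = [], T
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            bottleneck = min(cap[u][v] for u, v in path)
            for u, v in path:
                cap[u][v] -= bottleneck
                cap[v][u] += bottleneck

        # pixels still reachable from the source take label 1
        f = np.zeros((H, W), dtype=int)
        for i in range(H):
            for j in range(W):
                f[i, j] = int((i * W + j) in parent)
        return f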
   In addition to the above-mentioned techniques, a number of other optimization approaches have been developed for MRF energy minimization, such as (loopy) belief propagation and dynamic programming (for one-dimensional problems). These are discussed in more detail in Appendix B.5, as well as in the comparative survey paper by Szeliski, Zabih, Scharstein et al. (2008).


               Ordinal-valued MRFs

In addition to binary images, Markov random fields can be applied to ordinal-valued labels such as grayscale images or depth maps. The term "ordinal" indicates that the labels have an implied ordering, e.g., that higher values are lighter pixels. In the next section, we look at unordered labels, such as source image labels for image compositing.
   It is common to extend the binary data and smoothness prior terms as

    E_d(i, j) = w(i, j) ρ_d(f(i, j) - d(i, j))                               (3.112)