For example, (3.100) can be generalized to

$$
E_{1r} = \sum_{i,j} s_x(i,j)\,\rho\bigl(f(i+1,j) - f(i,j)\bigr) + s_y(i,j)\,\rho\bigl(f(i,j+1) - f(i,j)\bigr),
\tag{3.104}
$$
where ρ(x) is some monotonically increasing penalty function. For example, the family of norms ρ(x) = |x|^p is called the p-norms. When p < 2, the resulting smoothness terms become more piecewise continuous than totally smooth, which can better model the discontinuous nature of images, flow fields, and 3D surfaces.
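To make this concrete, here is a minimal NumPy sketch (illustrative, not from the book) that evaluates the robust smoothness energy (3.104), assuming the smoothness weights s_x and s_y default to one and the penalty is the p-norm ρ(x) = |x|^p:

```python
import numpy as np

def robust_smoothness_energy(f, p=1.0, sx=None, sy=None):
    """Evaluate the robust first-order smoothness term E_1r of (3.104)
    with the p-norm penalty rho(x) = |x|**p.

    f      : 2D array of values f(i, j)
    sx, sy : optional per-difference smoothness weights (default: all ones)
    """
    dx = f[1:, :] - f[:-1, :]   # f(i+1, j) - f(i, j)
    dy = f[:, 1:] - f[:, :-1]   # f(i, j+1) - f(i, j)
    sx = np.ones_like(dx) if sx is None else sx
    sy = np.ones_like(dy) if sy is None else sy
    return np.sum(sx * np.abs(dx) ** p) + np.sum(sy * np.abs(dy) ** p)

# A sharp step and a gradual ramp of the same total height:
step = np.tile(np.concatenate([np.zeros(8), np.ones(8)]), (16, 1))
ramp = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))
print(robust_smoothness_energy(step, p=2.0),  # 16.0: the quadratic penalizes the jump
      robust_smoothness_energy(ramp, p=2.0))  # ~1.07: and strongly prefers the ramp
print(robust_smoothness_energy(step, p=1.0),  # 16.0: total variation scores the
      robust_smoothness_energy(ramp, p=1.0))  # 16.0: step and the ramp equally
```

The p = 1 case illustrates why lower exponents better model discontinuities: concentrating a height change at a single jump costs no more than spreading it out.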
An early example of robust regularization is the graduated non-convexity (GNC) algorithm introduced by Blake and Zisserman (1987). Here, the norms on the data and derivatives are clamped to a maximum value

$$
\rho(x) = \min(x^2, V).
\tag{3.105}
$$
Because the resulting problem is highly non-convex (it has many local minima), a continuation method is proposed, where a quadratic norm (which is convex) is gradually replaced by the non-convex robust norm (Allgower and Georg 2003). (Around the same time, Terzopoulos (1988) was also using continuation to infer the tear and crease variables in his surface interpolation problems.)
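The continuation idea can be sketched as a one-parameter family of penalties; the simple linear blend below is only illustrative (Blake and Zisserman construct a more careful convex approximation):

```python
import numpy as np

def rho_gnc(x, V, t):
    """Continuation family of penalties: t = 0 gives the convex quadratic
    x**2; t = 1 gives the non-convex truncated quadratic min(x**2, V) of
    (3.105). (A simple linear blend for illustration only.)"""
    return (1.0 - t) * x ** 2 + t * np.minimum(x ** 2, V)
```

A continuation solver then minimizes the energy for a schedule of parameters, e.g., t = 0, 0.25, 0.5, 0.75, 1, warm-starting each minimization from the previous solution, so that the easy convex problem guides the search on the hard non-convex one.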
Today, it is more common to use the L_1 (p = 1) norm, which is often called total variation (Chan, Osher, and Shen 2001; Tschumperlé and Deriche 2005; Tschumperlé 2006; Kaftory, Schechner, and Zeevi 2007). Other norms, for which the influence (derivative) more quickly decays to zero, are presented by Black and Rangarajan (1996); Black, Sapiro, Marimont et al. (1998) and discussed in Appendix B.3.
Even more recently, hyper-Laplacian norms with p < 1 have gained popularity, based on the observation that the log-likelihood distribution of image derivatives follows a p ≈ 0.5–0.8 slope and is therefore a hyper-Laplacian distribution (Simoncelli 1999; Levin and Weiss 2007; Weiss and Freeman 2007; Krishnan and Fergus 2009). Such norms have an even stronger tendency to prefer large discontinuities over small ones. See the related discussion in Section 3.7.2 (3.114).
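One way to see this preference is through the influence function ψ(x) = ρ′(x) = p |x|^(p−1) sign(x): for p = 2 the influence grows with the residual, for p = 1 it is constant, and for p < 1 it decays, so a large discontinuity costs barely more than a moderate one. A small illustrative sketch:

```python
import numpy as np

def influence(x, p):
    """Influence function psi(x) = d/dx |x|**p = p * |x|**(p - 1) * sign(x)."""
    return p * np.abs(x) ** (p - 1) * np.sign(x)

x = np.array([0.5, 1.0, 2.0, 4.0])
print(np.abs(influence(x, 2.0)))  # [1.    2.    4.    8.   ]  grows with |x|
print(np.abs(influence(x, 1.0)))  # [1.    1.    1.    1.   ]  constant (total variation)
print(np.abs(influence(x, 0.5)))  # [0.707 0.5   0.354 0.25 ]  decays (hyper-Laplacian)
```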
While least squares regularized problems using L_2 norms can be solved using linear systems, other p-norms require different iterative techniques, such as iteratively reweighted least squares (IRLS), Levenberg–Marquardt, or alternation between local non-linear subproblems and global quadratic regularization (Krishnan and Fergus 2009). Such techniques are discussed in Section 6.1.3 and Appendices A.3 and B.3.
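To illustrate the IRLS idea (a sketch under simplifying assumptions, not code from any of the cited works), consider 1D denoising with an L_p smoothness penalty. Each iteration replaces |d|^p by the quadratic w d^2 with weight w = |d|^(p−2) evaluated at the current estimate, then solves the resulting linear system:

```python
import numpy as np

def irls_denoise_1d(g, p=1.0, lam=1.0, iters=20, eps=1e-6):
    """IRLS for  min_f  sum (f - g)**2 + lam * sum |f[i+1] - f[i]|**p."""
    n = len(g)
    D = np.diff(np.eye(n), axis=0)            # forward-difference matrix
    f = g.copy()
    for _ in range(iters):
        d = D @ f
        # Choose weights so that w * d**2 matches |d|**p at the current f
        # (eps guards against division by zero in flat regions).
        w = (np.abs(d) + eps) ** (p - 2)
        A = np.eye(n) + lam * D.T @ (w[:, None] * D)
        f = np.linalg.solve(A, g)
    return f

# Example: denoise a step edge; p = 1 preserves the discontinuity better
# than the quadratic p = 2, which blurs it.
g = np.concatenate([np.zeros(20), np.ones(20)]) + 0.1 * np.random.randn(40)
f = irls_denoise_1d(g, p=1.0, lam=2.0)
```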

3.7.2 Markov random fields

As we have just seen, regularization, which involves the minimization of energy functionals defined over (piecewise) continuous functions, can be used to formulate and solve a variety of low-level computer vision problems. An alternative technique is to formulate a Bayesian model, which separately models the noisy image formation (measurement) process and assumes a statistical prior model over the solution space. In this section, we look at priors based on Markov random fields, whose log-likelihood can be described using local neighborhood interaction (or penalty) terms (Kindermann and Snell 1980; Geman and Geman 1984; Marroquin, Mitter, and Poggio 1987; Li 1995; Szeliski, Zabih, Scharstein et al. 2008).
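Up to a constant, the negative log posterior of such a model is a data term derived from the measurement process plus a sum of pairwise prior terms over neighboring pixels. A schematic sketch, assuming Gaussian measurement noise and the truncated quadratic (3.105) as the pairwise penalty (both choices are illustrative):

```python
import numpy as np

def mrf_neg_log_posterior(f, g, lam=1.0, V=1.0, sigma=1.0):
    """-log p(f | g), up to a constant, for a pairwise MRF prior.

    Data term : Gaussian measurement model g = f + n, n ~ N(0, sigma**2).
    Prior term: truncated quadratic over 4-connected neighbor pairs.
    """
    data = np.sum((f - g) ** 2) / (2.0 * sigma ** 2)
    rho = lambda x: np.minimum(x ** 2, V)
    prior = (np.sum(rho(f[1:, :] - f[:-1, :])) +
             np.sum(rho(f[:, 1:] - f[:, :-1])))
    return data + lam * prior
```

Finding the maximum a posteriori (MAP) estimate then amounts to minimizing this energy over all images f.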