
Section 10.5  Fitting Using Probabilistic Models  307


log-likelihood of the data under this model as
\[
L(a, b, c, \sigma) = \sum_{i \in \text{data}} \log P(x_i, y_i \mid a, b, c, \sigma)
                   = \sum_{i \in \text{data}} \left[ \log P(\xi_i \mid \sigma) + \log P(u_i, v_i \mid a, b, c) \right].
\]

But $P(u_i, v_i \mid a, b, c)$ is some constant, because this point is distributed uniformly along the line. Since $\xi_i$ is the perpendicular distance from $(x_i, y_i)$ to the line (which is $|ax_i + by_i + c|$ as long as $a^2 + b^2 = 1$), we must maximize
\[
\sum_{i \in \text{data}} \log P(\xi_i \mid \sigma)
  = \sum_{i \in \text{data}} \left( -\frac{\xi_i^2}{2\sigma^2} - \frac{1}{2}\log 2\pi\sigma^2 \right)
  = \sum_{i \in \text{data}} \left( -\frac{(ax_i + by_i + c)^2}{2\sigma^2} - \frac{1}{2}\log 2\pi\sigma^2 \right)
\]
(again, subject to $a^2 + b^2 = 1$). For fixed (but perhaps unknown) $\sigma$ this yields the
                            problem we were working with in Section 10.2.1. So far, generative models have
                            just reproduced what we know already, but a powerful trick makes them much more
                            interesting.
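Maximizing this log-likelihood for fixed $\sigma$ amounts to minimizing the sum of squared perpendicular distances subject to $a^2 + b^2 = 1$, which is solved by the eigenvector of the point scatter matrix with the smallest eigenvalue. A minimal NumPy sketch (illustrative; the function name and interface are not from the text):

```python
import numpy as np

def fit_line_ml(points):
    """Maximum-likelihood line fit under Gaussian perpendicular noise.

    Minimizes sum_i (a x_i + b y_i + c)^2 subject to a^2 + b^2 = 1.
    Returns (a, b, c) for the line a x + b y + c = 0.
    """
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    centered = pts - mean
    # 2x2 scatter matrix of the centered points
    scatter = centered.T @ centered
    # The optimal (a, b) is the unit eigenvector with the smallest
    # eigenvalue; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(scatter)
    a, b = eigvecs[:, 0]
    # The ML line passes through the centroid of the points.
    c = -(a * mean[0] + b * mean[1])
    return a, b, c

# Noise-free points on y = 2x + 1; the fitted line should satisfy
# a*x + b*y + c = 0 for every point.
a, b, c = fit_line_ml([(0, 1), (1, 3), (2, 5), (3, 7)])
```

Centering the points first is what lets $c$ be recovered separately: at the optimum the residuals sum to zero, so the line must pass through the centroid.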
                     10.5.1 Missing Data Problems
                            A number of important vision problems can be phrased as problems that happen to
                            be missing useful elements of the data. For example, we can think of segmentation
                            as the problem of determining from which of a number of sources a measurement
                            came. This is a general view. More specifically, fitting a line to a set of tokens
                            involves segmenting the tokens into outliers and inliers, then fitting the line to
                            the inliers; segmenting an image into regions involves determining which source of
                            color and texture pixels generated the image pixels; fitting a set of lines to a set
                            of tokens involves determining which tokens lie on which line; and segmenting a
                            motion sequence into moving regions involves allocating moving pixels to motion
                            models. Each of these problems would be easy if we happened to possess some data
                            that is currently missing (respectively, whether a point is an inlier or an outlier,
                            which region a pixel comes from, which line a token comes from, and which motion
                            model a pixel comes from).
                                 A missing data problem is a statistical problem where some data is missing.
                            There are two natural contexts in which missing data are important: In the first,
                            some terms in a data vector are missing for some instances and present for others
                            (perhaps someone responding to a survey was embarrassed by a question). In the
                            second, which is far more common in our applications, an inference problem can be
                            made much simpler by rewriting it using some variables whose values are unknown.
                            Fortunately, there is an effective algorithm for dealing with missing data problems;
                            in essence, we take an expectation over the missing data. We demonstrate this
                            method and appropriate algorithms with two examples.
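To make the "expectation over the missing data" idea concrete before the examples, here is a toy sketch for the simplest such problem: points drawn from one of two 1D Gaussian sources, where the missing data is which source produced each point. The function, its initialization, and the fixed variance are illustrative assumptions, not taken from the text:

```python
import numpy as np

def em_two_means(x, iters=50, sigma=1.0):
    """Toy alternation for a missing data problem.

    The missing data is the source label of each point. E-step:
    soft-assign each point to a source (an expectation over the
    missing labels). M-step: re-estimate each source mean from the
    soft assignments. Variance and mixing weights are held fixed
    for brevity.
    """
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])  # crude initialization
    for _ in range(iters):
        # E-step: responsibility of source 0 for each point
        p0 = np.exp(-(x - mu[0]) ** 2 / (2 * sigma ** 2))
        p1 = np.exp(-(x - mu[1]) ** 2 / (2 * sigma ** 2))
        r0 = p0 / (p0 + p1)
        # M-step: means weighted by the soft assignments
        mu[0] = (r0 * x).sum() / r0.sum()
        mu[1] = ((1 - r0) * x).sum() / (1 - r0).sum()
    return mu

# Two well-separated clumps near 0 and 5; the recovered means
# should land near the clump centers.
mu = em_two_means([0.0, 0.1, -0.1, 5.0, 5.1, 4.9])
```

If the labels were known, each mean would be an ordinary average over its own points; the soft assignments simply replace the unknown labels with their expected values.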