Page 340 -
P. 340

Section 10.5  Fitting Using Probabilistic Models  308


                                 Example: Outliers and Line Fitting
                                 We wish to fit a line to a set of tokens that are at x i =(x i ,y i ). Some tokens
                            might be outliers, but we do not know which ones are. This means we can model the
                            process of generating a token as first, choosing whether it will come from the line
                            or be an outlier, and then, choosing the token conditioned on the original choice.
                            The first choice will be random, and we can write P(token comes from line) = π.
                            We have already given two models of how a point could be generated from a line
                            model. We model outliers as occuring uniformly and at random on the plane. This
                            means that we can write the probability of generating a token as
                               P(x i |a, b, c, π)= P(x i , line|a, b, c, π)+ P(x i , outlier|a, b, c, π)
                                             = P(x i |line,a,b,c)P(line) + P(x i |outlier,a,b,c)P(outlier)
                                             = P(x i |line,a,b,c)π + P(x i |outlier,a,b,c)(1 − π).

                            If we knew for every data item whether it came from the line or was an outlier,
                            then fitting the line would be simple; we would ignore all the outliers, and apply the
                            methods of Section 10.2.1 to the other points. Similarly, if we knew the line, then
                            estimating which point is an outlier and which is not would be straightforward (the
                            outliers are far from the line). The difficulty is that we do not; the key to resolving
                            this difficulty is repeated re-estimation (Section 10.5.3), which provides a standard
                            algorithm for this class of problem. Figure 10.9 shows typical results using the
                            standard algorithm.
                                 By a very small manipulation of the equations above (replace “line” with
                            “background” and “outlier” with “foreground”), we can represent a background
                            subtraction problem, too. We model the image in each frame of video as the
                            same, multiplied by some constant to take account of automatic gain control, but
                            with noise added. We model the noise as coming from some uniform source. Fig-
                            ures 10.10 and 10.11 show results, obtained with the standard algorithm for these
                            problems (Section 10.5.3).
                                 Example: Image Segmentation
                                 At each pixel in an image, we compute a d-dimensional feature vector x,
                            which might contain position, color, and texture information. We believe the image
                            contains g segments, and each pixel is produced by one of these segments. Thus, to
                            produce a pixel, we choose an image segment and then generate the pixel from the
                            model of that segment. We assume that the lth segment is chosen with probability
                            π l , and we model the density associated with the lth segment as a Gaussian, with
                            known covariance Σ and unknown mean θ l =(μ ) that depends on the particular
                                                                      l
                            segment. We encapsulate these parameters into a parameter vector to get Θ =
                            (π 1 ,... ,π g ,θ 1 ,...,θ g ). This means that we can write the probability of generating
                            a pixel vector x as

                                                      p(x|Θ) =   p(x|θ l )π l .
                                                               i
                            Fitting this model would be simple if we knew which segment produced which pixel,
                            because then we could estimate the mean of each segment separately. Similarly, if
                            we knew the means, we could estimate which segment produced the pixel. This is
                            quite a general situation.
   335   336   337   338   339   340   341   342   343   344   345