Section 10.5 Fitting Using Probabilistic Models 308
Example: Outliers and Line Fitting
We wish to fit a line to a set of tokens that are at $\mathbf{x}_i = (x_i, y_i)$. Some tokens
might be outliers, but we do not know which ones are. This means we can model the
process of generating a token as first, choosing whether it will come from the line
or be an outlier, and then, choosing the token conditioned on the original choice.
The first choice will be random, and we can write P(token comes from line) = π.
We have already given two models of how a point could be generated from a line
model. We model outliers as occurring uniformly at random on the plane. This
means that we can write the probability of generating a token as
\[
\begin{aligned}
P(\mathbf{x}_i \mid a, b, c, \pi) &= P(\mathbf{x}_i, \text{line} \mid a, b, c, \pi) + P(\mathbf{x}_i, \text{outlier} \mid a, b, c, \pi)\\
&= P(\mathbf{x}_i \mid \text{line}, a, b, c)P(\text{line}) + P(\mathbf{x}_i \mid \text{outlier}, a, b, c)P(\text{outlier})\\
&= P(\mathbf{x}_i \mid \text{line}, a, b, c)\,\pi + P(\mathbf{x}_i \mid \text{outlier}, a, b, c)\,(1 - \pi).
\end{aligned}
\]
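This mixture can be evaluated directly. The sketch below assumes particular component densities that the text leaves open: a Gaussian over the perpendicular distance to the line $ax + by + c = 0$ for on-line tokens (with an assumed noise scale `sigma`), and a uniform density over a bounded region of the plane of assumed area `area` for outliers.

```python
import math

def point_line_mixture_likelihood(x, y, a, b, c, pi, sigma=1.0, area=100.0 * 100.0):
    """P(x_i | a, b, c, pi) for the line-plus-outlier mixture.

    Assumptions (not fixed by the text): the on-line density is a
    Gaussian over perpendicular distance to ax + by + c = 0 with
    standard deviation `sigma`; outliers are uniform over a region
    of area `area`.
    """
    # Perpendicular distance from (x, y) to the line ax + by + c = 0.
    d = abs(a * x + b * y + c) / math.hypot(a, b)
    p_line = math.exp(-d * d / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
    p_outlier = 1.0 / area
    # P(x_i | line, a, b, c) * pi + P(x_i | outlier, a, b, c) * (1 - pi).
    return p_line * pi + p_outlier * (1.0 - pi)
```

With `pi = 1` the likelihood reduces to the on-line Gaussian; with `pi = 0` it reduces to the uniform outlier density.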
If we knew for every data item whether it came from the line or was an outlier,
then fitting the line would be simple; we would ignore all the outliers, and apply the
methods of Section 10.2.1 to the other points. Similarly, if we knew the line, then
estimating which point is an outlier and which is not would be straightforward (the
outliers are far from the line). The difficulty is that we know neither; the key to resolving
this difficulty is repeated re-estimation (Section 10.5.3), which provides a standard
algorithm for this class of problem. Figure 10.9 shows typical results using the
standard algorithm.
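The repeated re-estimation idea can be sketched as follows. This is a minimal illustration, not the book's exact algorithm: it alternates between estimating, for each point, the probability that it came from the line (using the current line), and refitting the line by weighted total least squares (using those probabilities). The noise scale `sigma` and outlier area `area` are assumed quantities, as is the crude horizontal-line initialization.

```python
import math

def em_line_fit(points, sigma=1.0, area=10000.0, n_iters=50):
    """Repeated re-estimation for a line ax + by + c = 0 with outliers."""
    # Crude initialization: a horizontal line through the mean y value.
    a, b = 0.0, 1.0
    c = -sum(y for _, y in points) / len(points)
    pi = 0.5
    for _ in range(n_iters):
        # E-step: probability that each point came from the line.
        w = []
        for x, y in points:
            d = abs(a * x + b * y + c) / math.hypot(a, b)
            p_line = pi * math.exp(-d * d / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
            p_out = (1.0 - pi) / area
            w.append(p_line / (p_line + p_out))
        # M-step: weighted total least squares for the line; update pi.
        sw = sum(w)
        mx = sum(wi * x for wi, (x, _) in zip(w, points)) / sw
        my = sum(wi * y for wi, (_, y) in zip(w, points)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, points))
        syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, points))
        sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, points))
        # The line direction is the principal axis of the weighted
        # scatter matrix; the normal (a, b) is perpendicular to it.
        theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
        a, b = -math.sin(theta), math.cos(theta)
        c = -(a * mx + b * my)
        pi = sw / len(points)
    return a, b, c, pi
```

On data consisting mostly of points on a line plus a distant outlier, the outlier's weight collapses toward zero within a few iterations and the fitted line ignores it, while `pi` approaches the inlier fraction.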
By a very small manipulation of the equations above (replace “line” with
“background” and “outlier” with “foreground”), we can represent a background
subtraction problem, too. We model the image in each frame of video as the
same, multiplied by some constant to take account of automatic gain control, but
with noise added. We model the noise as coming from some uniform source. Fig-
ures 10.10 and 10.11 show results, obtained with the standard algorithm for these
problems (Section 10.5.3).
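Following the analogy in the text, a per-pixel sketch of the background-subtraction version might look like this. The specific densities are assumptions: a background pixel is the stored background value times the gain constant plus Gaussian noise of assumed scale `sigma`, and a foreground pixel is uniform over an assumed number of grey levels.

```python
import math

def background_posterior(value, background, gain, pi_bg, sigma=5.0, levels=256.0):
    """P(background | pixel value) under the two-component mixture.

    Assumed model: background pixels are gain * background plus
    Gaussian noise (`sigma`); foreground pixels are uniform over
    `levels` grey levels; `pi_bg` is the prior on background.
    """
    d = value - gain * background
    p_bg = pi_bg * math.exp(-d * d / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
    p_fg = (1.0 - pi_bg) / levels
    return p_bg / (p_bg + p_fg)
```

A pixel close to its gain-corrected background value gets a posterior near one; a pixel far from it gets a posterior near zero and would be labeled foreground.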
Example: Image Segmentation
At each pixel in an image, we compute a d-dimensional feature vector x,
which might contain position, color, and texture information. We believe the image
contains g segments, and each pixel is produced by one of these segments. Thus, to
produce a pixel, we choose an image segment and then generate the pixel from the
model of that segment. We assume that the $l$th segment is chosen with probability
$\pi_l$, and we model the density associated with the $l$th segment as a Gaussian, with
known covariance $\Sigma$ and unknown mean $\theta_l = (\mu_l)$ that depends on the particular
segment. We encapsulate these parameters into a parameter vector to get $\Theta =
(\pi_1, \ldots, \pi_g, \theta_1, \ldots, \theta_g)$. This means that we can write the probability of generating
a pixel vector $\mathbf{x}$ as
\[
p(\mathbf{x} \mid \Theta) = \sum_{l} p(\mathbf{x} \mid \theta_l)\,\pi_l.
\]
Fitting this model would be simple if we knew which segment produced which pixel,
because then we could estimate the mean of each segment separately. Similarly, if
we knew the means, we could estimate which segment produced the pixel. This is
quite a general situation.
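The "which segment produced which pixel" question has the same soft answer as in the line case: given the current parameters, compute each segment's posterior responsibility for a pixel. The sketch below assumes a spherical known covariance $\Sigma = \sigma^2 I$ (the text only says the covariance is known) and takes the per-segment means and priors as inputs.

```python
import math

def segment_responsibilities(x, mus, pis, sigma2=1.0):
    """P(segment l | x) for the image-segmentation mixture.

    Assumptions: d-dimensional feature vector `x`, per-segment means
    `mus`, priors `pis`, and known covariance Sigma = sigma2 * I
    (spherical, our simplification of the text's known covariance).
    """
    d = len(x)
    norm = (2.0 * math.pi * sigma2) ** (d / 2.0)
    joint = []
    for mu, pi_l in zip(mus, pis):
        sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        joint.append(pi_l * math.exp(-sq / (2.0 * sigma2)) / norm)
    # The normalizer is p(x | Theta), the sum over segments.
    total = sum(joint)
    return [j / total for j in joint]
```

These responsibilities are exactly the weights one would use to re-estimate each segment's mean, which is how the repeated re-estimation of Section 10.5.3 resolves the chicken-and-egg problem described above.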