Computational Statistics Handbook with MATLAB
Chapter 9: Statistical Pattern Recognition
assumed to be equal. In the piston ring example, we know how many parts
we buy from each manufacturer. So, the prior probability that the part came
from a certain manufacturer would be based on the percentage of parts
obtained from that manufacturer. In other applications, we might know the
prevalence of some class in our population. This might be the case in medical
diagnosis, where we have some idea of the percentage of the population who
are likely to have a certain disease or medical condition. In the case of the
iris data, we could estimate the prior probabilities using the proportion of
each class in our sample. We had 150 observed feature vectors, with 50 com-
ing from each class. Therefore, our estimated prior probabilities would be
$$
\hat{P}(\omega_j) = \frac{n_j}{N} = \frac{50}{150} = 0.33 ; \qquad j = 1, 2, 3 .
$$
Finally, we might use equal priors when we believe each class is equally
likely.
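As a minimal sketch (not from the text), the sample-proportion estimate of the priors can be computed directly from the three iris matrices; the variable names `n`, `N`, and `phat` are illustrative choices, not from the book.

```matlab
load iris
% Each matrix (setosa, versicolor, virginica) has one observation per row.
n = [size(setosa,1); size(versicolor,1); size(virginica,1)];
N = sum(n);            % total sample size, 150
phat = n/N;            % estimated prior for each class, 1/3 each
```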
Now that we have our prior probabilities, $\hat{P}(\omega_j)$, we turn our attention to
the class-conditional probabilities $P(x \mid \omega_j)$. We can use the density estimation
techniques covered in Chapter 8 to obtain these probabilities. In essence, we
take all of the observed feature vectors that are known to come from class ω j
and estimate the density using only those cases. We will cover two
approaches: parametric and nonparametric.
Estimating Class-Conditional Probabilities: Parametric Method
In parametric density estimation, we assume a distribution for the class-con-
ditional probability densities and estimate them by estimating the corre-
sponding distribution parameters. For example, we might assume the
features come from a multivariate normal distribution. To estimate the
density, we have to estimate $\hat{\mu}_j$ and $\hat{\Sigma}_j$ for each class. This procedure is illustrated
in Example 9.1 for the iris data.
Example 9.1
In this example, we estimate our class-conditional probabilities using the
iris data. We assume that the required probabilities are multivariate normal
for each class. The following MATLAB code shows how to get the class-con-
ditional probabilities for each species of iris.
load iris
% This loads up three matrices:
% setosa, virginica and versicolor
% We will assume each class is multivariate normal.
% To get the class-conditional probabilities, we
% get estimates for the parameters for each class.
muset = mean(setosa);
covset = cov(setosa);
muvir = mean(virginica);
covvir = cov(virginica);
muver = mean(versicolor);
covver = cov(versicolor);
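Once the parameters are estimated, the class-conditional density at a feature vector can be evaluated from the multivariate normal formula. The following is a hedged sketch, not part of Example 9.1; the variable names `x`, `xc`, and `px` are illustrative, and the parameters are recomputed here so the snippet is self-contained.

```matlab
load iris
muset = mean(setosa);        % estimated mean for Iris setosa
covset = cov(setosa);        % estimated covariance for Iris setosa
x = setosa(1,:);             % an example feature vector
d = length(x);
xc = x - muset;              % center the observation
% Multivariate normal density evaluated at x:
px = exp(-0.5*xc/covset*xc') / sqrt((2*pi)^d * det(covset));
```

The expression `xc/covset*xc'` computes the quadratic form $(x-\hat{\mu})\hat{\Sigma}^{-1}(x-\hat{\mu})^T$ without forming the matrix inverse explicitly.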
© 2002 by Chapman & Hall/CRC

