Page 30 - Introduction to Statistical Pattern Recognition


where Pr(A) is the probability of an event A. For convenience, we often write
(2.2) as
$$P(X) = \Pr\{\mathbf{X} \le X\} . \tag{2.3}$$

Density function: Another expression for characterizing a random vector
is the density function, which is defined as
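$$p(X) = \lim_{\Delta x_1 \to 0, \ldots, \Delta x_n \to 0} \frac{\Pr\{x_1 < \mathbf{x}_1 \le x_1 + \Delta x_1, \ldots, x_n < \mathbf{x}_n \le x_n + \Delta x_n\}}{\Delta x_1 \cdots \Delta x_n} . \tag{2.4}$$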

Inversely, the distribution function can be expressed in terms of the density
function as follows:
$$P(X) = \int_{-\infty}^{X} p(Y)\, dY = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} p(y_1, \ldots, y_n)\, dy_1 \cdots dy_n , \tag{2.5}$$
where $\int (\cdot)\, dY$ is a shorthand notation for an n-dimensional integral, as
shown. The density function $p(X)$ is not a probability, but must be multiplied
by a certain region $\Delta x_1 \cdots \Delta x_n$ (or $\Delta X$) to obtain a probability.
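
The relation between a density value and a probability can be checked numerically. The sketch below is only an illustration: it assumes a one-dimensional normal distribution (a choice made here, not taken from the text), and the function names are hypothetical. It compares the exact probability of a small interval, computed from the distribution function, with the approximation $p(x)\,\Delta x$.

```python
import math

def normal_pdf(x, mean=0.0, std=1.0):
    """Density p(x) of a 1-D normal distribution (assumed for illustration)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean=0.0, std=1.0):
    """Distribution function P(x) of the same normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def interval_probability(x, dx, mean=0.0, std=1.0):
    """Exact probability that the random variable falls in (x, x + dx]."""
    return normal_cdf(x + dx, mean, std) - normal_cdf(x, mean, std)

x = 0.5
for dx in (0.5, 0.1, 0.01):
    exact = interval_probability(x, dx)
    approx = normal_pdf(x) * dx  # density times the size of the region
    print(f"dx={dx}: exact={exact:.6f}  p(x)*dx={approx:.6f}")

# A density value by itself may exceed 1, so it cannot be a probability.
print("p(0) for std=0.1:", normal_pdf(0.0, std=0.1))
```

As the interval shrinks, the two quantities agree more closely, which is the content of the limit in the density definition. The last line shows a density value greater than 1 for a narrow distribution.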
In pattern recognition, we deal with random vectors drawn from different
classes (or categories), each of which is characterized by its own density func-
tion. This density function is called the class i density or conditional density of
class i, and is expressed as

$$p(X \mid \omega_i) \quad \text{or} \quad p_i(X) \qquad (i = 1, \ldots, L) , \tag{2.6}$$
where $\omega_i$ indicates class i and L is the number of classes. The unconditional
density function of X, which is sometimes called the mixture density function,
is given by
$$p(X) = \sum_{i=1}^{L} P_i\, p_i(X) , \tag{2.7}$$
where $P_i$ is the a priori probability of class i.
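
The mixture density is the sum of the class densities weighted by their a priori probabilities, $p(X) = \sum_i P_i\, p_i(X)$. The following is a minimal sketch under stated assumptions: two one-dimensional normal class densities with made-up priors and parameters, and hypothetical helper names. It evaluates the mixture and checks numerically that it integrates to about 1, as any density must.

```python
import math

def normal_pdf(x, mean, std):
    """1-D normal density, used here as an assumed class density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical values chosen for illustration; the priors must sum to 1.
priors = [0.3, 0.7]                       # P_1, P_2
class_params = [(-1.0, 1.0), (2.0, 0.5)]  # (mean, std) for each class

def class_density(i, x):
    """Class i density p_i(x); i is a zero-based index in this sketch."""
    mean, std = class_params[i]
    return normal_pdf(x, mean, std)

def mixture_density(x):
    """Unconditional density: prior-weighted sum of the class densities."""
    return sum(P * class_density(i, x) for i, P in enumerate(priors))

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

print("p(0) =", mixture_density(0.0))
print("integral of p(x) over [-10, 10] =", integrate(mixture_density, -10.0, 10.0))
```

Because each class density integrates to 1 and the priors sum to 1, the mixture also integrates to 1, which the numerical integral reproduces up to discretization and truncation error.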
A posteriori probability: The a posteriori probability of $\omega_i$ given X,
$P(\omega_i \mid X)$ or $q_i(X)$, can be computed by using the Bayes theorem, as follows: