Page 165 - Introduction to Statistical Pattern Recognition

4 Parametric Classifiers 147

Other Desired Outputs and Search Techniques

In pattern recognition, the classifier should be designed by using samples
near the decision boundary; samples far from the decision boundary are less
important to the design. However, if we fix the desired output $y(X)$ and try to
minimize the mean-square error between $h(X)$ and $y(X)$, larger $h(X)$'s
contribute more to the mean-square error. This has long been recognized as a
disadvantage of the mean-square error approach in pattern recognition. In this
section, we discuss a modification which reduces this effect.
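
The effect described above can be illustrated numerically. The following is a minimal sketch, not taken from the text: the discriminant values and the fixed desired output of -1 are hypothetical numbers chosen only for illustration, and `squared_errors` is a helper name introduced here.

```python
import numpy as np

def squared_errors(h, y):
    """Per-sample squared error between discriminant values h and desired outputs y."""
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    return (h - y) ** 2

# Hypothetical discriminant values for three samples of the same class.
# All are negative, so all three lie on the same side of the boundary h(X) = 0.
h_values = np.array([-0.5, -1.0, -8.0])

# Assumed fixed desired output y(X) = -1 for every sample of this class.
desired = np.full(3, -1.0)

errors = squared_errors(h_values, desired)
share = errors / errors.sum()  # fraction of the total squared error per sample

print(errors)
print(share)
```

With these numbers, the sample at h(X) = -8 lies farthest from the boundary and is classified with the widest margin, yet it accounts for almost all of the summed squared error, while the sample closest to the boundary contributes very little.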

New notation for the discriminant function: Before proceeding, let us
introduce new notation which will simplify the discussion later. Instead of
(4.18), we will write the linear discriminant function as

$$h(X) = -V^T X - v_0 > 0 \quad \text{for } X \in \omega_1 , \tag{4.69}$$

$$h(X) = V^T X + v_0 > 0 \quad \text{for } X \in \omega_2 . \tag{4.70}$$

Furthermore, if we introduce a new vector to express a sample as

$$Z = [-1 \;\; -x_1 \;\; \ldots \;\; -x_n]^T \quad \text{for } X \in \omega_1 , \tag{4.71}$$

$$Z = [+1 \;\; x_1 \;\; \ldots \;\; x_n]^T \quad \text{for } X \in \omega_2 , \tag{4.72}$$

then, the discriminant function becomes simply

$$h(Z) = W^T Z = \sum_{i=0}^{n} w_i z_i > 0 , \tag{4.73}$$

where $z_0$ is either +1 or -1, and $w_i = v_i$ $(i = 0, 1, \ldots, n)$.
Thus, our design procedure is
(1) to generate a new set of vectors $Z$'s from $X$'s, and
(2) to find $W^T$ so as to satisfy (4.73) for as many $Z$'s as possible.
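
The two steps can be sketched in code. This is only an illustration under stated assumptions: the helper names `make_z` and `count_satisfied`, the class labels 1 and 2, the sample values, and the weight vector are all hypothetical, and no search technique for W is implemented. The sketch only builds the Z's by negating the augmented vectors of class 1 and counts how many of them give a positive value of W^T Z.

```python
import numpy as np

def make_z(X, labels):
    """Step (1): build Z from X.

    Each sample is augmented to [1, x_1, ..., x_n]; the whole vector is
    negated for class 1 and left unchanged for class 2.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    augmented = np.hstack([np.ones((X.shape[0], 1)), X])
    sign = np.where(labels == 1, -1.0, 1.0)
    return augmented * sign[:, None]

def count_satisfied(W, Z):
    """Step (2), evaluation only: number of Z's with W^T Z > 0."""
    return int(np.sum(Z @ W > 0))

# Hypothetical two-dimensional samples: two from class 1, two from class 2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 3.0], [4.0, 2.0]])
labels = np.array([1, 1, 2, 2])
Z = make_z(X, labels)

# Hypothetical weights W = [w_0, w_1, w_2], i.e. v_0 = -3 and V = [1, 1].
W = np.array([-3.0, 1.0, 1.0])
print(count_satisfied(W, Z))
```

A single inequality test then covers both classes, which is the simplification the new notation provides: a candidate W can be scored by the count alone, without branching on the class label.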

Desired outputs: Using the notation of (4.73), new desired outputs will
be introduced. Also, the expectation in (4.64) is replaced by the sample mean
to obtain the following mean-square errors:
