Introduction to Statistical Pattern Recognition
mate of the error. Since each test sample is excluded from the design sample
set, the independence between the design and test sets is maintained. Also, all
N samples are tested and N-1 samples are used for design. Thus, the available
samples are, in this method, more effectively utilized. Furthermore, we do not
need to worry about dissimilarity between the design and test distributions.
One of the disadvantages of the L method is that N classifiers must be
designed, one classifier for testing each sample. However, this problem is
easily overcome by a procedure which will be discussed later.
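The L (leave-one-out) procedure described above can be sketched generically; the following Python fragment is an illustration only, with hypothetical `train` and `classify` stand-ins for whatever classifier design rule is in use (here a simple nearest-class-mean rule on synthetic one-dimensional Gaussian data):

```python
import numpy as np

def leave_one_out_error(samples, labels, train, classify):
    """L method: for each of the N samples, design a classifier on the
    other N-1 samples and test it on the one held-out sample."""
    n = len(samples)
    errors = 0
    for i in range(n):
        mask = np.arange(n) != i                  # exclude sample i from design
        clf = train(samples[mask], labels[mask])  # design on N-1 samples
        if classify(clf, samples[i]) != labels[i]:
            errors += 1
    return errors / n

# Toy usage: nearest-class-mean classifier (illustrative, not the book's setup).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])[:, None]
y = np.array([0] * 50 + [1] * 50)

def train(Xd, yd):
    # "Design" = estimate the two class means from the design set.
    return np.array([Xd[yd == c].mean(axis=0) for c in (0, 1)])

def classify(means, x):
    # Assign the class whose mean is nearest in squared distance.
    return int(np.argmin([np.sum((x - m) ** 2) for m in means]))

err_L = leave_one_out_error(X, y, train, classify)
```

Note that the design and test sets stay independent on every trial, exactly as the text requires, at the cost of designing N classifiers.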
The H and L methods are supposed to give very similar, if not identical,
estimates of the classification error, and both provide upper bounds of the
Bayes error. In order to confirm this, an experiment was conducted as follows.
Experiment 6: The H and L errors
Data: I-I (Normal, M^T M = 2.562, ε = 10%)
Dimensionality: n = 4, 8, 16, 32, 64
Classifier: Quadratic classifier of (5.54)
Sample size: N_1 = N_2 = kn (Design),
             N_1 = N_2 = kn (Test) for H
             N_1 = N_2 = kn for L
k = 3, 5, 10, 20, 40
No. of trials: T= 10
Results: Table 5-8
The first and second lines of Table 5-8 show the average and standard devia-
tion of the H error estimate, while the third and fourth lines are the average and
standard deviation of the L error estimate. Both results are very close.
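The closeness of the two estimates can also be checked numerically. The following is a hedged sketch, not a reproduction of Experiment 6: it uses a simple nearest-class-mean classifier on two-dimensional Gaussian data rather than the book's quadratic classifier and Data I-I, but the H and L averages come out close in the same way:

```python
import numpy as np

rng = np.random.default_rng(1)

def nearest_mean_error(X_design, y_design, X_test, y_test):
    """Error of a nearest-class-mean classifier designed on one set
    and tested on another."""
    means = np.array([X_design[y_design == c].mean(axis=0) for c in (0, 1)])
    d = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return np.mean(np.argmin(d, axis=1) != y_test)

def make_data(n):
    # Two Gaussian classes with unit variance and shifted means (illustrative).
    X = np.vstack([rng.normal(0, 1, (n, 2)), rng.normal(2, 1, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return X, y

# H method: independent design and test sets of equal size.
Xd, yd = make_data(100)
Xt, yt = make_data(100)
err_H = nearest_mean_error(Xd, yd, Xt, yt)

# L method: each sample is tested on a classifier designed from the other N-1.
X, y = make_data(100)
N = len(y)
err_L = np.mean([
    nearest_mean_error(np.delete(X, i, axis=0), np.delete(y, i),
                       X[i:i + 1], y[i:i + 1])
    for i in range(N)
])
```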
Operation of the L method: In order to illustrate how the L method
works, let us examine the simplest case in which two covariances are equal and
known as I. Then, the Bayes classifier is
\[
(X - M_1)^T (X - M_1) - (X - M_2)^T (X - M_2)
\;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\; t \, .
\tag{5.117}
\]
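Decision rule (5.117) can be sketched directly in Python; this is an illustration only, with arbitrary class means, and t = 0 corresponding to equal priors:

```python
import numpy as np

def bayes_classify(X, M1, M2, t=0.0):
    """Classifier of (5.117) for equal covariances known as I:
    assign omega_1 when the squared distance to M1 minus the squared
    distance to M2 falls below the threshold t, else omega_2."""
    h = np.sum((X - M1) ** 2, axis=1) - np.sum((X - M2) ** 2, axis=1)
    return np.where(h < t, 1, 2)   # 1 -> omega_1, 2 -> omega_2

# Usage with hypothetical means: points near M1 go to class 1, near M2 to class 2.
M1, M2 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
labels = bayes_classify(np.array([[0.1, -0.2], [1.9, 2.1]]), M1, M2)
```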
Assume that two sample sets, S_1 = {X_1^{(1)}, . . . , X_{N_1}^{(1)}} from ω_1 and S_2 = {X_1^{(2)}, . . . , X_{N_2}^{(2)}} from ω_2, are given. In the R method, all of these samples are used
to design the classifier and also to test the classifier. With the given mathemat-