\[
= \int_{h(X)=0} \frac{1}{N}\, g(X, Y)\, p(X)\, dX = 0 \, . \tag{5.162}
\]

That is, as long as g(X, Y) = 0 at h(X) = 0, the effect of an individual sample is
negligible. Even if the quadratic classifier is not optimal, Δε̄ is dominated by
a 1/N term. Thus, as one would expect, as the number of design samples
becomes larger, the effect of an individual sample diminishes.
                           In order to confirm the above results,  the following  experiment was con-
                      ducted.
Experiment 9: Effect of removing one sample
    Data: I-I, I-4I, I-Λ (Normal, n = 8)
    Classifier: Quadratic classifier of (5.54)
    Design samples: N₁ = N₂ = 24, 40, 80, 160, 320
    Test: Theoretical using (3.119)-(3.128)
    No. of trials: τ = 10
    Results: Table 5-11 [6]
Table 5-11 shows that, even if the squared distance of Y (∈ ω₁) from M₁, d², is
much larger than n, the effect is still negligible. The expected value of d² is n
when X is distributed normally, since the normalized distance
(X − M₁)ᵀ Σ₁⁻¹ (X − M₁) follows a χ² distribution with n degrees of freedom.

                      5.4  Bootstrap Methods
                      Bootstrap Errors

Bootstrap method: So far, we have studied how to bound the Bayes
error based on available sample sets. That is, we draw τ sample sets
S₁, . . . , S_τ from the true distributions, P, as seen in Fig. 5-5, where each
sample set contains N₁ ω₁-samples and N₂ ω₂-samples. For each Sₜ, we can
apply the L and R methods to obtain ε̂_Lt and ε̂_Rt. The averages of these ε̂_Lt's
and ε̂_Rt's over τ sets approximate the upper and lower bounds of the Bayes
error, E{ε̂_L} and E{ε̂_R}. The standard deviations of the τ ε̂_Lt's and ε̂_Rt's indicate
how ε̂_L and ε̂_R vary. However, in many cases in practice, only one sample set
is available.
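As an illustration of this bounding procedure, here is a minimal Python sketch under assumed conditions (Gaussian classes with illustrative means, a plug-in quadratic classifier): it draws τ sample sets, computes the resubstitution (R) and leave-one-out (L) error estimates on each, and averages them to approximate the lower and upper bounds.

```python
# Sketch: average the R (resubstitution) and L (leave-one-out) error
# estimates over tau independently drawn sample sets. Class means and
# set sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, N, tau = 8, 50, 10                     # dimension, samples per class, sample sets
M1, M2 = np.zeros(n), np.full(n, 0.8)     # assumed class means

def loglik(X, mean, cov):
    """Gaussian log-likelihood (up to a constant) used as the discriminant."""
    d = np.atleast_2d(X) - mean
    inv = np.linalg.inv(cov)
    return -0.5 * np.einsum('ij,jk,ik->i', d, inv, d) - 0.5 * np.linalg.slogdet(cov)[1]

def errors(X1, X2):
    """Return (resubstitution, leave-one-out) error estimates for one sample set."""
    m1, c1 = X1.mean(0), np.cov(X1.T)
    m2, c2 = X2.mean(0), np.cov(X2.T)
    # R method: test the design samples themselves (optimistically biased).
    eR = 0.5 * (np.mean(loglik(X1, m1, c1) < loglik(X1, m2, c2)) +
                np.mean(loglik(X2, m2, c2) < loglik(X2, m1, c1)))
    # L method: redesign the own-class statistics without each test sample.
    miss = 0
    for i in range(len(X1)):
        rest = np.delete(X1, i, axis=0)
        miss += loglik(X1[i], rest.mean(0), np.cov(rest.T))[0] < loglik(X1[i], m2, c2)[0]
    for i in range(len(X2)):
        rest = np.delete(X2, i, axis=0)
        miss += loglik(X2[i], rest.mean(0), np.cov(rest.T))[0] < loglik(X2[i], m1, c1)[0]
    return eR, miss / (len(X1) + len(X2))

eRs, eLs = zip(*(errors(rng.standard_normal((N, n)) + M1,
                        rng.standard_normal((N, n)) + M2) for _ in range(tau)))
print(f"mean eR over {tau} sets = {np.mean(eRs):.4f} (lower bound estimate)")
print(f"mean eL over {tau} sets = {np.mean(eLs):.4f} (upper bound estimate)")
```

The averaged ε̂_R should come out below the averaged ε̂_L, with the two bracketing the error in the manner described above.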