Page 344 - Introduction to Statistical Pattern Recognition
normal and uniform kernels respectively. The reason why the first and second
order approximations are used for the variance and bias respectively was discussed
in Chapter 6. If the second order approximation for the variance is
adopted, we can obtain a more accurate but more complex expression for (7.49).
Substituting (7.48) and (7.49) into (7.46) and (7.47),
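$$E\{\Delta h(X)\} \cong \frac{1}{2} r^2(\alpha_2-\alpha_1) + \frac{1}{8} r^4(\alpha_1^2-\alpha_2^2) - \Delta t + \frac{r^{-n}}{2N}\left[\frac{s_1}{p_1} - \frac{s_2}{p_2}\right] \tag{7.50}$$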
$$E\{\Delta h^2(X)\} \cong \left[\frac{1}{2} r^2(\alpha_2-\alpha_1) - \Delta t\right]^2 - \frac{\Delta t}{4} r^4(\alpha_1^2-\alpha_2^2) \tag{7.51}$$
Note that from (6.18) and (6.19) the terms associated with $r^2\alpha_i$ are generated
by the bias of the density estimate, and the terms associated with $r^{-n}/N$ come
from the variance. The threshold adjustment $\Delta t$ is a constant selected independently.
Now, substituting (7.50) and (7.51) into (7.45) and carrying out the
integration, the bias is expressed in terms of $r$ and $N$ as
$$E\{\Delta\varepsilon\} \cong a_1 r^2 + a_2 r^4 + a_3 r^{-n}/N . \tag{7.52}$$
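As a rough numerical illustration of (7.52), the sketch below evaluates the three-term model a1*r^2 + a2*r^4 + a3*r^(-n)/N on a grid of kernel sizes r and picks the r with the smallest modeled bias. The constants `A1`, `A2`, `A3`, the dimensionality `DIM`, and the helper names `bias_model` and `best_r` are all hypothetical and chosen only for illustration. The constants are assumed positive here, which the text does not guarantee.

```python
def bias_model(r, N, n, a1, a2, a3):
    """Three-term model of (7.52): a1*r^2 + a2*r^4 + a3*r^(-n)/N."""
    return a1 * r**2 + a2 * r**4 + a3 * r**(-n) / N


def best_r(N, n, a1, a2, a3, r_lo=0.05, r_hi=5.0, steps=2000):
    """Grid-search the kernel size r that minimizes the modeled bias."""
    grid = [r_lo + (r_hi - r_lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda r: bias_model(r, N, n, a1, a2, a3))


# Hypothetical constants (assumed positive), for illustration only.
A1, A2, A3 = 0.02, 0.005, 0.5
DIM = 8  # dimensionality n of the feature space (assumed)

for N in (100, 1000, 10000):
    r_star = best_r(N, DIM, A1, A2, A3)
    bias = bias_model(r_star, N, DIM, A1, A2, A3)
    print(f"N={N:6d}  r*={r_star:.3f}  modeled bias={bias:.4f}")
```

Under these assumed constants, the minimizing r shrinks as N grows, because a larger sample suppresses the r^(-n)/N term and lets a smaller kernel be used before the variance contribution takes over.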
The constants $a_1$, $a_2$, and $a_3$ in (7.52) are obtained by evaluating the indicated
integral expression in (7.45). We assume, for simplicity, that the decision
threshold $t$ is set to zero. Because of the complexity of the expressions, explicit
evaluation is not possible. However, the constants are functions only of the
distributions and the kernel shapes, $A_i$, and are completely independent of the
sample size $N$ and the smoothing parameter $r$. Hence, (7.52) shows how changes
in $r$ and $N$ affect the error performance of the classifier. The $a_1 r^2$ and $a_2 r^4$
terms indicate how biases in the density estimates influence the performance of
the classifier, while the $a_3 r^{-n}/N$ term reflects the role of the variance of the
density estimates. For small values of $r$, the variance term dominates (7.52),
and the observed error rates are significantly above the Bayes error. As $r$
grows, however, the variance term decreases while the $a_1 r^2$ and $a_2 r^4$ terms
play an increasingly significant role. Thus, for a typical plot of the observed
error rate versus $r$, $\hat{\varepsilon}$ decreases for small values of $r$ until a minimum point is

