Page 339 - Introduction to Statistical Pattern Recognition
7 Nonparametric Classification and Error Estimation
(2) Plot these $\ell$ empirical points, $\hat{\varepsilon}_{NN}$ vs. $\beta_1$. Then, find the
line best fitted to these $\ell$ points. The slope of this line is $E_X\{\cdot\}$ and the
y-intercept is $\varepsilon^*_{NN}$, which we would like to estimate.

The reader must be aware that $\hat{\varepsilon}_{NN}$ varies widely (the standard deviation of
$\hat{\varepsilon}_{NN}$ for each $\beta_1$ is large), and that $\beta_1 = 0$ is far away from the
$\beta_1$-region where the actual experiments are conducted. Therefore, a small
variation in $\hat{\varepsilon}_{NN}$ tends to be amplified and causes a large shift in the
estimate of $\varepsilon^*_{NN}$.
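The line-fitting step and its sensitivity can be sketched as follows. This is a minimal illustration, not the procedure used in the text's experiments. It assumes that the x-axis variable may be replaced by N^(-2/n), to which the bias coefficient is proportional (a constant rescaling of the x-axis changes the slope but not the y-intercept). The sample sizes, slope, and intercept are hypothetical, noise-free values chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def extrapolate_asymptotic_error(sample_sizes, errors, n):
    """Fit the empirical errors against N**(-2/n) and return (slope, intercept).

    Assumption: N**(-2/n) stands in for the bias coefficient; the intercept
    (the extrapolated error at x = 0) is unaffected by this rescaling.
    """
    xs = [N ** (-2.0 / n) for N in sample_sizes]
    return fit_line(xs, errors)


# Hypothetical, noise-free data (not taken from the text).
n = 8
sizes = [100, 200, 400, 800]
true_intercept, true_slope = 0.10, 0.50
errors = [true_intercept + true_slope * N ** (-2.0 / n) for N in sizes]

slope, intercept = extrapolate_asymptotic_error(sizes, errors, n)

# Perturb a single measured error by 0.01 and refit.
perturbed = list(errors)
perturbed[-1] += 0.01
_, shifted_intercept = extrapolate_asymptotic_error(sizes, perturbed, n)
shift = shifted_intercept - intercept
```

In this toy setup the fitted intercept moves by more than the 0.01 perturbation applied to one point, because the fitted line must be extrapolated well outside the range of x-values actually sampled.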
Biases for Other Cases
2NN: The bias of the 2NN error can be obtained from (7.13) in a similar
fashion, resulting in [8]
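$$E\{\hat{\varepsilon}_{2NN}\} \cong \varepsilon^*_{2NN} + \beta_2 E_X\{[\,|A|^{-1/n}\,\mathrm{tr}\{A B_2(X)\}]^2\} \tag{7.38}$$
where
$$B_2(X) = p^{-2/n}(X)\left[\nabla p(X)\nabla^T q_1(X)\,p^{-1}(X) + \tfrac{1}{2}\nabla^2 q_1(X)\right] \tag{7.39}$$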
$$\beta_2 = \left[\frac{\Gamma^{2/n}(1+n/2)}{n\pi}\right]^2 \frac{\Gamma(1+4/n)}{1+2/n}\, N^{-4/n} . \tag{7.40}$$
By comparing (7.40) with (7.37), it can be seen that $\beta_2$ is roughly proportional
to $N^{-4/n}$ while $\beta_1$ is proportional to $N^{-2/n}$. That is, as N increases, the 2NN
error converges to its asymptotic value more quickly than the NN error, as if
the dimensionality, n, were half as large. Also, note that $\beta_2$ is significantly
smaller than $\beta_1$, because $\Gamma^{2/n}/(n\pi)$ (.088 for n = 8 and .068 for n = 32) is
squared. Many experiments also have revealed that the 2NN error is less
biased than the NN error [11]. Since their asymptotic errors are related by
$\varepsilon^*_{NN} = 2\varepsilon^*_{2NN}$ from (7.11) and (7.14), a better estimate of $\varepsilon^*_{NN}$ could be obtained
by estimating $\varepsilon^*_{2NN}$ first and doubling it.
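The two numerical claims in this comparison can be checked with a short sketch. It assumes the Gamma function in the quoted factor is evaluated at 1 + n/2; that argument is not legible in the passage above and is an assumption here, chosen because it reproduces the quoted values .088 and .068 at n = 8 and n = 32. The function names are hypothetical.

```python
import math


def gamma_factor(n):
    """Gamma(1 + n/2)**(2/n) / (n*pi).

    Assumption: the Gamma argument 1 + n/2 is not stated in the passage.
    lgamma is used so that large n does not overflow.
    """
    return math.exp((2.0 / n) * math.lgamma(1.0 + n / 2.0)) / (n * math.pi)


def nn_rate(N, n):
    """Sample-size dependence of the NN bias coefficient, N**(-2/n)."""
    return N ** (-2.0 / n)


def two_nn_rate(N, n):
    """Sample-size dependence of the 2NN bias coefficient, N**(-4/n)."""
    return N ** (-4.0 / n)


factor_8 = gamma_factor(8)
factor_32 = gamma_factor(32)

# N**(-4/n) equals N**(-2/(n/2)): the 2NN rate in n dimensions matches
# the NN rate in n/2 dimensions.
same_rate = math.isclose(two_nn_rate(1000, 8), nn_rate(1000, 4))
```

Because the factor is well below 1, squaring it makes the 2NN coefficient much smaller than the NN coefficient, and the identity N^(-4/n) = N^(-2/(n/2)) is what is meant by the dimensionality appearing half as large.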

