Page 334 - Introduction to Statistical Pattern Recognition
bias is affected by each of the parameters of interest (n, N, A, and p(X)). Each
of these parameters is discussed separately below.
Effect of sample size: Equation (7.37) gives an explicit expression
showing how the sample size affects the size of the bias of the NN error.
Figure 7-5 shows $\beta_1$ vs. N for various values of n [8]. The bias tends to
drop off rather slowly as the sample size increases, particularly when the
dimensionality of the data is high. This is not an encouraging result, since it
indicates that increasing the sample size N is not an effective means of
reducing the bias. For example, with a dimensionality of 64, increasing the
number of samples from 1,000 to 10,000 results in only a 6.9% reduction in the
bias ($\beta_1$ from .0504 to .0469). A further 6.9% reduction would require
increasing the number of samples to over 100,000. Thus it does not appear that
the asymptotic NN error can be estimated simply by "choosing a large enough N,"
as is generally believed, especially when the dimensionality of the data is
high. The required value of N would be prohibitively large.

Fig. 7-5 $\beta_1$ vs. N.
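
The arithmetic behind this example can be checked with a short sketch. The
closed form used below for the sample-size factor plotted in Fig. 7-5 (written
beta1 here) is an assumption, since Eq. (7.37) is not reproduced in this
section; it was chosen because it reproduces the two values quoted above for
n = 64. The function names beta1 and relative_reduction are hypothetical
helpers introduced only for illustration.

```python
import math


def beta1(n: int, N: int) -> float:
    """Sample-size factor of the NN bias for dimensionality n and N samples.

    ASSUMED form (not taken from this section):
        beta1 = Gamma((n+2)/2)^(2/n) * Gamma(1 + 2/n) * N^(-2/n) / (n * pi)
    """
    # Work in log space so Gamma((n+2)/2) cannot overflow for large n.
    log_gamma_power = (2.0 / n) * math.lgamma(n / 2.0 + 1.0)
    return (
        math.exp(log_gamma_power)
        * math.gamma(1.0 + 2.0 / n)
        * N ** (-2.0 / n)
        / (n * math.pi)
    )


def relative_reduction(n: int, N_from: int, N_to: int) -> float:
    """Fractional drop in beta1 when the sample size grows from N_from to N_to."""
    return 1.0 - beta1(n, N_to) / beta1(n, N_from)


if __name__ == "__main__":
    n = 64
    for N in (1_000, 10_000, 100_000):
        print(f"n={n}, N={N}: beta1 = {beta1(n, N):.4f}")
    print(f"reduction 1,000 -> 10,000: {relative_reduction(n, 1_000, 10_000):.1%}")
```

Under this assumed form the only dependence on N is the factor N^(-2/n), so
every tenfold increase in N multiplies the bias by 10^(-2/n). For n = 64 that
factor is about 0.93, which is why each additional 6.9% reduction costs another
order of magnitude in sample size.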

