Page 334 - Introduction to Statistical Pattern Recognition

bias is affected by each of the parameters of interest (n, N, A, and p(X)). Each
of these parameters is discussed separately below.

Effect of sample size: Equation (7.37) gives an explicit expression showing how the sample size affects the size of the bias of the NN error. Figure 7-5 shows β₁ vs. N for various values of n [8]. The bias tends to drop off rather slowly as the sample size increases, particularly when the dimensionality of the data is high. This is not an encouraging result, since it indicates that increasing the sample size N is not an effective means of reducing the bias. For example, with a dimensionality of 64, increasing the number of samples from 1,000 to 10,000 results in only a 6.9% reduction in the bias (β₁ from .0504 to .0469). A further 6.9% reduction would require increasing the number of samples to over 100,000. Thus it does not appear that the asymptotic NN error may be estimated simply by "choosing a large enough N" as is generally believed, especially when the dimensionality of the data is high. The required value of N would be prohibitively large.

Fig. 7-5  β₁ vs. N.
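
The figures quoted above can be checked numerically. The following is a minimal sketch, not the book's own procedure. It assumes that the coefficient plotted in Fig. 7-5 (called beta1 here) has the closed form (1/(n*pi)) * Gamma(n/2 + 1)^(2/n) * Gamma(1 + 2/n) * N^(-2/n). That form is not stated in this passage; it is an assumption that happens to reproduce the quoted values. The function names are hypothetical.

```python
import math


def beta1(n, N):
    """Assumed sample-size coefficient of the NN-error bias.

    n is the dimensionality and N the number of samples. The closed form
    is an assumption (it is not given in this passage). Log-gamma is used
    so that large n does not overflow.
    """
    log_value = (
        (2.0 / n) * math.lgamma(n / 2.0 + 1.0)
        + math.lgamma(1.0 + 2.0 / n)
        - (2.0 / n) * math.log(N)
        - math.log(n * math.pi)
    )
    return math.exp(log_value)


def sample_factor_for_bias_ratio(n, ratio):
    """Factor by which N must grow to scale the bias by `ratio`.

    Follows from the assumed N**(-2/n) dependence: ratio = factor**(-2/n).
    """
    return ratio ** (-n / 2.0)


n = 64
b_1k = beta1(n, 1_000)     # compare with the quoted .0504
b_10k = beta1(n, 10_000)   # compare with the quoted .0469
reduction_pct = 100.0 * (1.0 - b_10k / b_1k)
halving_factor = sample_factor_for_bias_ratio(n, 0.5)

print(f"beta1(n=64, N=1,000)  = {b_1k:.4f}")
print(f"beta1(n=64, N=10,000) = {b_10k:.4f}")
print(f"reduction             = {reduction_pct:.1f}%")
print(f"N must grow by a factor of {halving_factor:.3g} to halve the bias")
```

Under this assumption, each tenfold increase in N multiplies the bias by 10^(-2/64), and halving the bias at n = 64 would take 2^32 times as many samples, which illustrates why the required N is prohibitively large.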