Page 210 - Introduction to Statistical Pattern Recognition


                      Only 16 samples (3.9 times the dimensionality) are needed to achieve
                      $E\{\hat{\mu}\} = 0.223$ in a 4-dimensional space, while 9396 samples (73.4 times
                      the dimensionality) are needed in a 128-dimensional space.  This result
                      contrasts sharply with the common belief that a fixed multiple of the
                      dimensionality, such as 10, could be used to determine the sample size.
                           Since the theoretical results of (5.27) and (5.32) for biases and (5.28) and
                      (5.33)  for  variances are  approximations, we  conducted three  sets  of  experi-
                      ments to verify the closeness of  the approximations.


                           Experiment 1: Computation of $\mu_1$ and $\mu_2$
                                 Data: I-I, I-4I, I-Λ (Normal)
                                 Dimensionality: n = 4, 8, 16, 32, 64 (for I-I, I-4I)
                                                 n = 8 (for I-Λ)
                                 Sample size: $N_1 = N_2 = kn$, k = 3, 5, 10, 20, 40
                                 No. of trials: $\tau = 10$
                                 Results: Tables 5-3, 5-4, 5-5 [4]
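
A Monte Carlo run of this kind can be sketched in a few lines. The sketch below is illustrative only and is not the authors' code. It assumes that the two quantities are the usual two terms of the Bhattacharyya distance for normal distributions (a mean-difference term and a covariance-difference term), that the parameters are replaced by sample means and unbiased sample covariances, that Data I-I means both covariances are I with the second mean shifted by 2.56 along the first axis, and that Data I-4I means equal means with covariances I and 4I. The function names `bhattacharyya_terms` and `run_experiment` and the parameters `scale2` and `shift` are hypothetical.

```python
import numpy as np


def bhattacharyya_terms(x1, x2):
    """Return (mu1_hat, mu2_hat) from two sample matrices of shape (N, n).

    mu1_hat: mean-difference term, (1/8) d^T [(S1+S2)/2]^{-1} d
    mu2_hat: covariance term, (1/2) ln( |(S1+S2)/2| / sqrt(|S1| |S2|) )
    """
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    s1 = np.cov(x1, rowvar=False)  # unbiased sample covariance (assumption)
    s2 = np.cov(x2, rowvar=False)
    s_avg = 0.5 * (s1 + s2)
    d = m2 - m1
    mu1 = 0.125 * d @ np.linalg.solve(s_avg, d)
    # Log-determinants avoid overflow/underflow in high dimensions.
    _, ld_avg = np.linalg.slogdet(s_avg)
    _, ld1 = np.linalg.slogdet(s1)
    _, ld2 = np.linalg.slogdet(s2)
    mu2 = 0.5 * (ld_avg - 0.5 * (ld1 + ld2))
    return mu1, mu2


def run_experiment(n, k, scale2, shift, trials=10, seed=0):
    """Estimate both terms over `trials` independent sample sets.

    Class 1 is N(0, I); class 2 is N(shift * e_1, scale2 * I).
    Returns (means, standard deviations) of (mu1_hat, mu2_hat).
    """
    rng = np.random.default_rng(seed)
    num = k * n  # N1 = N2 = kn
    m2 = np.zeros(n)
    m2[0] = shift
    results = np.empty((trials, 2))
    for t in range(trials):
        x1 = rng.standard_normal((num, n))
        x2 = m2 + np.sqrt(scale2) * rng.standard_normal((num, n))
        results[t] = bhattacharyya_terms(x1, x2)
    return results.mean(axis=0), results.std(axis=0, ddof=1)


if __name__ == "__main__":
    # Assumed stand-in for Data I-I with n = 8.
    for k in (3, 5, 10, 20, 40):
        mean, std = run_experiment(n=8, k=k, scale2=1.0, shift=2.56)
        print(f"k={k:2d}  mean(mu1, mu2)={mean}  std(mu1, mu2)={std}")
```

Under these assumptions the sample means of both estimates should drift toward the true values as k grows, which is the behavior the bias approximations are meant to predict. Passing `scale2=4.0, shift=0.0` gives the assumed stand-in for Data I-4I.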



                       Tables 5-3 and 5-4 present a comparison of the theoretical predictions (first
                       lines) and the means of the 10 trials (second lines) for Data I-I and Data I-4I,
                       respectively.  These tables show that the theoretical predictions of the biases
                       match the experimental results very closely.  The third lines of Tables 5-3 and
                       5-4 show the theoretical predictions of the standard deviations from (5.28) and
                       (5.33).  The fourth lines present the experimental standard deviations from the
                       10 trials.  Again, the theoretical predictions match the experimental results
                       closely.  It should be noted that the variances for $\hat{\mu}_2$ of Data I-I and
                       $\hat{\mu}_1$ of Data I-4I are zero theoretically.  This suggests that the variances
                       for these cases come from the Taylor expansion terms higher than second order, and
                       therefore are expected to be smaller than the variances for the other cases.  This
                       is confirmed by comparing the variances of $\hat{\mu}_1$ and $\hat{\mu}_2$ in each
                       table.  Also, note that the variances of $\hat{\mu}_2$ for Data I-4I are independent
                       of n.  Similar results for Data I-Λ are presented in Table 5-5.  The experimental
                       results are well predicted by the theoretical equations for a wide range of k.

                            Verification of  the estimation procedure: The estimation procedure of
                       (5.6) was tested on Data RADAR as follows.