Page 380 - Introduction to Statistical Pattern Recognition

                                Fig. 7-18  A criterion to eliminate a group of samples.


                      experiment, the tree consists of 3 levels, with each node decomposed into 3
                      nodes. At the bottom of the tree, there are 27 subsets containing 1000 samples.
                      However, for an 8-dimensional uniform distribution, 451 distance computations
                      are needed to find the NN from 3000 samples. In this case the tree is formed
                      with 4 levels and a 4-way decomposition at each node, which yields 256 subsets
                      at the bottom housing the 3000 samples. As discussed in Chapter 6, all pairwise
                      distances among samples become close to one another as the dimensionality gets
                      high. Therefore, the effectiveness of (7.87) in eliminating subsets diminishes,
                      and fewer subsets are rejected by satisfying (7.87).
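The loss of pruning power in high dimensions can be reproduced with a small sketch. It is not the experiment above: it uses a single level of subsets instead of a multi-level tree, a crude random-seed partition instead of a proper clustering, and much smaller sample sizes. Since (7.87) is not reproduced on this page, the rejection test is assumed to be the usual triangle-inequality rule: a subset with mean M and radius r is skipped when d(X, M) - r exceeds the best distance found so far. The names `dist`, `build_subsets`, `nn_search`, and `experiment` are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_subsets(samples, n_subsets, rng):
    """Crude partition (assumption: stands in for a real clustering).

    Each sample joins the nearest of n_subsets randomly chosen seed samples.
    Returns a list of (mean, radius, members), where radius is the largest
    distance from the subset mean to any of its members.
    """
    seeds = rng.sample(samples, n_subsets)
    groups = [[] for _ in seeds]
    for s in samples:
        j = min(range(n_subsets), key=lambda k: dist(s, seeds[k]))
        groups[j].append(s)
    subsets = []
    for members in groups:
        if not members:
            continue
        dim = len(members[0])
        mean = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        radius = max(dist(mean, m) for m in members)
        subsets.append((mean, radius, members))
    return subsets


def nn_search(query, subsets):
    """Return (NN distance, sample-distance computations, rejected subsets)."""
    # Visit subsets in order of increasing distance from the query to the mean.
    order = sorted(
        ((dist(query, mean), radius, members) for mean, radius, members in subsets),
        key=lambda t: t[0],
    )
    best = math.inf
    n_dist = 0
    n_rejected = 0
    for d_mean, radius, members in order:
        # Triangle inequality: every member lies at least d_mean - radius
        # from the query, so the whole subset can be skipped.
        if d_mean - radius > best:
            n_rejected += 1
            continue
        for m in members:
            n_dist += 1
            best = min(best, dist(query, m))
    return best, n_dist, n_rejected


def experiment(dim, n_samples=600, n_subsets=20, n_queries=20, seed=0):
    """Average number of sample-distance computations per query (uniform data)."""
    rng = random.Random(seed)
    samples = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    subsets = build_subsets(samples, n_subsets, rng)
    total = 0
    for _ in range(n_queries):
        query = [rng.random() for _ in range(dim)]
        total += nn_search(query, subsets)[1]
    return total / n_queries


if __name__ == "__main__":
    for dim in (2, 8, 32):
        print(dim, experiment(dim))
```

The count returned by `experiment` excludes the distances to the subset means. With these settings the average should grow with the dimensionality, approaching the full sample size once almost no subset passes the rejection test.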

                           Another problem with this method is how to divide samples into subsets.
                      We will discuss this problem, which is called clustering, in Chapter 11. Again,
                      finding clusters becomes more difficult as the dimensionality goes up.

                      Computer Projects


                      1.   Repeat Experiment 3 for Data I-Λ. Use (a) I and (b) (I + Λ)/2 as the
                           metric.

                      2.   Repeat Experiment 5.

                      3.   Repeat Experiment 6.

                      4.   Repeat Experiment 8.

                      5.   Repeat Experiment 9.

                      6.   Repeat Experiment 11.