Page 368 - Computational Statistics Handbook with MATLAB

Chapter 9: Statistical Pattern Recognition                      357




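                               [Figure: classification tree for subtree $T_3$. Root split: x1 < 0.031. Second-level splits: x2 < 0.51 and x2 < 0.58. Third-level splits: x1 < 0.49 and x1 < 0.5. Terminal nodes are labeled C-1 or C-2.]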

                              FIGURE 9.15
                              Here is the subtree corresponding to k = 3 from Example 9.12. For this tree, α = 0.03.
                             Selecting the Best Tree Using an Independent Test Sample
                             We first describe the independent test sample case, because it is easier to
                             understand. The notation that we use is summarized below.
                             NOTATION - INDEPENDENT TEST SAMPLE METHOD
                                $L_1$ is the subset of the learning sample $L$ that will be used for building the tree.
                                $L_2$ is the subset of the learning sample $L$ that will be used for testing the tree and choosing the best subtree.
                                $n^{(2)}$ is the number of cases in $L_2$.
                                $n_j^{(2)}$ is the number of observations in $L_2$ that belong to class $\omega_j$.
                            © 2002 by Chapman & Hall/CRC