
342                        Computational Statistics Handbook with MATLAB


                             For the given thresholds, we now find the probability of correctly classifying
                             the target cases. This corresponds to steps 8 through 13.

                                % Now find the probability of correctly
                                % classifying targets.
                                mu2 = mean(x2);
                                var2 = cov(x2);
                                % Do cross-validation on target class.
                                 for i = 1:n1
                                     train = x1;
                                     test = x1(i);
                                     train(i) = [];
                                     mu1 = mean(train);
                                     var1 = cov(train);
                                     lr1(i) = csevalnorm(test,mu1,var1)./...
                                         csevalnorm(test,mu2,var2);
                                 end
                                 % Find the actual pcc.
                                 for i = 1:length(pfa)
                                     pcc(i) = length(find(lr1 >= thresh(i)));
                                 end
                                pcc = pcc/n1;
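The leave-one-out computation above can be paralleled outside MATLAB. The following Python/NumPy fragment mirrors the code's logic; note that the normal density is hand-coded here in place of the book's csevalnorm, and the two classes are synthetic normal samples standing in for the example's x1 and x2, so the numbers will differ from the text:

```python
import numpy as np

def normpdf(x, mu, sigma2):
    """Univariate normal density; a stand-in for the book's csevalnorm."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

rng = np.random.default_rng(1234)
# Synthetic stand-ins for the two classes (an assumption for illustration):
# target class x1 and non-target class x2.
x1 = rng.normal(-1.0, 1.0, 100)
x2 = rng.normal(1.0, 1.0, 100)
n1 = len(x1)

# Parameters of the non-target class (no cross-validation needed here).
mu2, var2 = x2.mean(), x2.var(ddof=1)

# Leave-one-out cross-validation on the target class: hold out one
# observation, estimate the class parameters from the rest, and
# evaluate the likelihood ratio at the held-out point.
lr1 = np.empty(n1)
for i in range(n1):
    train = np.delete(x1, i)
    mu1, var1 = train.mean(), train.var(ddof=1)
    lr1[i] = normpdf(x1[i], mu1, var1) / normpdf(x1[i], mu2, var2)

# Probability of correctly classifying a target at each threshold.
# (The book's thresholds come from the non-target class; here we
# simply sweep over the observed range of the ratios.)
thresh = np.linspace(lr1.min(), lr1.max(), 101)
pcc = np.array([(lr1 >= t).mean() for t in thresh])
```

As the threshold rises, fewer likelihood ratios exceed it, so pcc decreases monotonically from 1, just as the estimated P(CC) does along the ROC curve.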
                             The ROC curve is given in Figure 9.9. We estimate the area under the curve
                             as 0.91, using

                                area = sum(pcc)*.01;
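This is a rectangle-rule approximation: each pcc value is weighted by the 0.01 spacing of the pfa grid. A trapezoidal sum over the same grid typically agrees closely and is slightly more accurate. A small Python sketch on a hypothetical concave ROC curve (pcc = sqrt(pfa) is purely illustrative, not the example's data):

```python
import numpy as np

# A hypothetical ROC curve sampled on the same 0.01 grid of
# false-alarm probabilities used in the text.
pfa = np.arange(0.0, 1.0, 0.01)
pcc = np.sqrt(pfa)  # an illustrative concave curve above the diagonal

# Rectangle rule, as in the text: each pcc value times the grid spacing.
area_rect = pcc.sum() * 0.01

# Trapezoidal rule over the same grid.
area_trap = ((pcc[1:] + pcc[:-1]) / 2 * np.diff(pfa)).sum()
```

For a grid this fine the two estimates differ only in the third decimal place, so the simpler rectangle sum used in the text is adequate.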


                             9.4 Classification Trees
                             In this section, we present another technique for pattern recognition called
                             classification trees. Our treatment of classification trees follows that of
                             Classification and Regression Trees by Breiman, Friedman, Olshen, and
                             Stone [1984]. For ease of exposition, we do not include the MATLAB code
                             for the classification tree in the main body of the text; it is given in
                             Appendix D. The main functions we provide for working with trees are
                             summarized in Table 9.1. We will use these functions in the text as we
                             discuss the classification tree methodology.
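Before developing the methodology, it may help to see the elementary operation a classification tree repeats: a binary split of a single feature. The following Python sketch is a toy illustration only (it is not the Appendix D code); it searches the midpoints of one ordered feature for the split minimizing the weighted Gini impurity, the standard CART splitting criterion:

```python
import numpy as np

def gini(y):
    """Gini impurity of a vector of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Exhaustive search over midpoints of one ordered feature for the
    binary split minimizing the weighted impurity of the two children."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_t, best_imp = None, np.inf
    for t in (x[:-1] + x[1:]) / 2:
        left, right = y[x <= t], y[x > t]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t, best_imp

# Two well-separated classes: the best split lands between them
# and yields two pure children (zero impurity).
x = np.array([0.1, 0.2, 0.3, 0.9, 1.0, 1.1])
y = np.array([0, 0, 0, 1, 1, 1])
t, imp = best_split(x, y)
```

A tree is grown by applying this search to every feature at every node and splitting on the best one, which is why trees handle ordered and categorical features so naturally.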
                              While Bayes decision theory yields a classification rule that is intuitively
                             appealing, it provides little insight into the structure of the classification
                             rule, nor does it help us determine which features are important. Classifi-
                             cation trees can yield complex decision boundaries, and they are appropriate
                             for ordered data, categorical data, or a mixture of the two. In this book,
                            © 2002 by Chapman & Hall/CRC