Page 345 - Computational Statistics Handbook with MATLAB
P. 345

334                        Computational Statistics Handbook with MATLAB


                                load iris
                                % This loads up three matrices:
                                % setosa, versicolor and virginica.
                                % We will use the versicolor and virginica.
                                % To make it interesting, we will use only the
                                % first two features.
                                % Get the data for the training and testing set. We
                                % will just pick every other one for the testing set.
                                indtrain = 1:2:50;
                                indtest = 2:2:50;
                                versitest = versicolor(indtest,1:2);
                                versitrain = versicolor(indtrain,1:2);
                                virgitest = virginica(indtest,1:2);
                                virgitrain = virginica(indtrain,1:2);
                             We now build the classifier by estimating the class-conditional probabilities.
                             We use the parametric approach, making the assumption that the class-con-
                             ditional densities are multivariate normal. In this case, the estimated priors
                             are equal.

                                % Get the classifier. We will assume a multivariate
                                % normal model for these data.
                                muver = mean(versitrain);
                                covver = cov(versitrain);
                                muvir = mean(virgitrain);
                                covvir = cov(virgitrain);
                             Note that the classifier is obtained using the training set only. We use the test-
                             ing set to estimate the probability of correctly classifying observations.

                                % Present each test case to the classifier. Note that
                                % we are using equal priors, so the decision is based
                                % only on the class-conditional probabilities.
                                % Put all of the test data into one matrix.
                                X = [versitest;virgitest];
                                % These are the probability of x given versicolor.
                                pxgver = csevalnorm(X,muver,covver);
                                % These are the probability of x given virginica.
                                pxgvir = csevalnorm(X,muvir,covvir);
                                % Check which are correctly classified.
                                % In the first 25, pxgver > pxgvir are correct.
                                ind = find(pxgver(1:25)>pxgvir(1:25));
                                ncc = length(ind);
                                % In the last 25, pxgvir > pxgver are correct.
                                ind = find(pxgvir(26:50) > pxgver(26:50));
                                ncc = ncc + length(ind);
                                pcc = ncc/50;

                            © 2002 by Chapman & Hall/CRC
   340   341   342   343   344   345   346   347   348   349   350