Page 348 - Computational Statistics Handbook with MATLAB

Chapter 9: Statistical Pattern Recognition                      337


                                muver = mean(versicolor);
                                covver = cov(versicolor);
                                % Those remain the same for the following.
                                for i = 1:nvir
                                    % Get the test point and training set.
                                    virtrain = virginica;
                                    x = virtrain(i,:);
                                    virtrain(i,:)=[];
                                    muvir = mean(virtrain);
                                    covvir = cov(virtrain);
                                    pxgver = csevalnorm(x,muver,covver);
                                    pxgvir = csevalnorm(x,muvir,covvir);
                                    if pxgvir > pxgver
                                      % then we correctly classified it
                                      ncc = ncc+1;
                                    end
                                end

                             Finally, the probability of correct classification is estimated using
                                pcc = ncc/n;
                             The estimated probability of correct classification for the iris data using
                             cross-validation is 0.68.
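The leave-one-out loop above can be tried end to end on a small example. The following is a minimal sketch under stated assumptions, not the book's iris computation: the two one-dimensional samples `c1` and `c2` are hypothetical values made up for illustration, and the helper `normpdf1` is an assumed inline stand-in for `csevalnorm` in the univariate case. Comparing the two class-conditional densities directly, as the listing does, corresponds to assuming equal prior probabilities.

```matlab
% Hypothetical 1-D data (NOT the iris data), for illustration only.
c1 = [1.0 1.2 0.8 1.1 0.9 1.3]';   % class 1 sample
c2 = [2.0 2.2 1.8 2.1 1.9 0.7]';   % class 2 sample

% Assumed helper: univariate normal density with mean m and variance v.
normpdf1 = @(x,m,v) exp(-(x-m).^2./(2*v))./sqrt(2*pi*v);

n1 = length(c1);
n2 = length(c2);
n = n1 + n2;
ncc = 0;

% Hold out each class 1 point in turn; class 2 estimates stay fixed.
mu2 = mean(c2);
v2 = var(c2);
for i = 1:n1
    train = c1;
    x = train(i);
    train(i) = [];
    if normpdf1(x,mean(train),var(train)) > normpdf1(x,mu2,v2)
        ncc = ncc + 1;   % correctly classified as class 1
    end
end

% Hold out each class 2 point in turn; class 1 estimates stay fixed.
mu1 = mean(c1);
v1 = var(c1);
for i = 1:n2
    train = c2;
    x = train(i);
    train(i) = [];
    if normpdf1(x,mean(train),var(train)) > normpdf1(x,mu1,v1)
        ncc = ncc + 1;   % correctly classified as class 2
    end
end

% Estimated probability of correct classification.
pcc = ncc/n;
```

Each held-out point is scored against parameters estimated without it, so `pcc` is not inflated by testing on the training data.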




                             Receiver Operating Characteristic (ROC) Curve
                             We now turn our attention to how we can use cross-validation to evaluate a
                             classifier that uses the likelihood approach with varying decision thresholds
                             $\tau_C$. It would be useful to understand how the classifier performs for various
                             thresholds (corresponding to the probability of false alarm) of the likelihood
                             ratio. This will tell us what performance degradation we have (in terms of
                             correctly classifying the target class) if we limit the probability of false alarm
                             to some level.
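The trade-off described here can be illustrated numerically. The following is a simplified sketch, not the cross-validated procedure: it assumes the decision rule "call x a target when the likelihood ratio P(x|ω1)/P(x|ω2) exceeds the threshold", estimates the class parameters from all of the data, and uses hypothetical one-dimensional samples `xt` and `xnt` and an assumed list of thresholds `tau`. The helper `npdf` is an assumed inline univariate normal density.

```matlab
% Hypothetical 1-D samples, for illustration only.
xt  = [2.0 2.3 1.7 2.6 1.9 2.2 1.4 2.8]';   % target class
xnt = [0.9 1.2 0.6 1.5 1.0 0.8 1.8 0.4]';   % non-target class

% Assumed helper: univariate normal density with mean m and variance v.
npdf = @(x,m,v) exp(-(x-m).^2./(2*v))./sqrt(2*pi*v);

% Likelihood ratio P(x|target)/P(x|non-target), using estimates
% from all of the data (no cross-validation in this sketch).
lr = @(x) npdf(x,mean(xt),var(xt))./npdf(x,mean(xnt),var(xnt));
lrt  = lr(xt);    % ratios for the target observations
lrnt = lr(xnt);   % ratios for the non-target observations

tau = [0.1 0.5 1 2 10];    % assumed candidate thresholds
pfa = zeros(size(tau));    % estimated probability of false alarm
pcc = zeros(size(tau));    % estimated probability of correctly
                           % classifying the target class
for k = 1:length(tau)
    pfa(k) = sum(lrnt > tau(k))/length(lrnt);
    pcc(k) = sum(lrt  > tau(k))/length(lrt);
end

% One row per threshold: tau, pfa, pcc.
disp([tau' pfa' pcc'])
```

Raising the threshold can only lower both rates, never raise them. This is why limiting the probability of false alarm costs some ability to correctly classify the target class.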
                              We start by dividing the sample into two sets: one with all of the target
                             observations and one with the non-target patterns. Denote the observations
                             as follows
                             $$x_i^{(1)} \Rightarrow \text{Target pattern } (\omega_1)$$
                             $$x_i^{(2)} \Rightarrow \text{Non-target pattern } (\omega_2).$$
                             Let $n_1$ represent the number of target observations (class $\omega_1$) and $n_2$ denote
                             the number of non-target (class $\omega_2$) patterns. We work first with the non-target
                             observations to determine the threshold we need to get a desired proba-
                            © 2002 by Chapman & Hall/CRC