


Exercises

4.1  Consider the first two classes of the Cork Stoppers dataset, described by features ART
     and PRT.
     a)  Determine the Euclidian and Mahalanobis classifiers using feature ART alone, then
         using both ART and PRT.
     b)  Compute the Bayes error using a pooled covariance estimate as the true covariance
         for both classes.
     c)  Determine whether the Mahalanobis classifiers are expected to be near the optimal
         Bayesian classifier.
     d)  Using PR Size, determine the average deviation of the training set error estimate
         from the Bayes error, and the 95% confidence interval of the error estimate.
     e)  Determine the classification of one cork stopper using the correlation approach.
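
     The following Python sketch (not the book's worked solution) illustrates one way of
     carrying out the computations asked for in 4.1 a) and b). The arrays X1 and X2 are
     placeholders for the class 1 and class 2 samples of the Cork Stoppers data, with one
     column per feature (here ART and PRT); the Bayes error expression assumes equal priors
     and a common (pooled) covariance, as stated in b).

     import numpy as np
     from scipy.stats import norm

     def fit_classifiers(X1, X2):
         """Euclidian (minimum-distance) and Mahalanobis classifiers for two classes."""
         m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
         n1, n2 = len(X1), len(X2)
         # Pooled covariance estimate, assumed common to both classes.
         C = ((n1 - 1) * np.cov(X1, rowvar=False) +
              (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
         Cinv = np.linalg.inv(C)

         def euclid(x):
             # Class of the nearest mean in ordinary (Euclidian) distance.
             return 1 if np.linalg.norm(x - m1) <= np.linalg.norm(x - m2) else 2

         def mahal(x):
             # Class of the nearest mean in Mahalanobis distance.
             d1 = (x - m1) @ Cinv @ (x - m1)
             d2 = (x - m2) @ Cinv @ (x - m2)
             return 1 if d1 <= d2 else 2

         # Bayes error of two Gaussian classes with equal priors and covariance C:
         # Pe = Phi(-delta/2), delta being the Mahalanobis distance between the means.
         delta = np.sqrt((m1 - m2) @ Cinv @ (m1 - m2))
         bayes_error = norm.cdf(-delta / 2)
         return euclid, mahal, bayes_error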

4.2  Consider the first two classes of the Cork Stoppers dataset, described by features ART
     and PRT. Compute the linear discriminant corresponding to the Euclidian classifier
     using formula 4-3c.
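
     Formula 4-3c itself is not reproduced on this page. The sketch below relies on the
     standard equivalence that minimizing the Euclidian distance to a class mean m is the
     same as maximizing the linear discriminant g(x) = m'x - 0.5 m'm, which should coincide
     with the book's formula up to notation; the mean vectors are assumed to have been
     estimated from the ART and PRT samples as in Exercise 4.1.

     import numpy as np

     def euclidian_discriminant(m):
         """Linear discriminant g(x) = m'x - 0.5 m'm of the minimum-distance classifier."""
         w, w0 = m, -0.5 * (m @ m)
         return lambda x: w @ x + w0

     # With class means m1 and m2, decide class 1 when g1(x) >= g2(x), i.e. when
     # (m1 - m2)'x - 0.5 (m1'm1 - m2'm2) >= 0, a linear decision boundary.
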
4.3  Repeat the previous exercises for the three classes of the Cork Stoppers dataset, using
     features N, PRM and ARTG. Compute the pooled covariance matrix and determine the
     influence of small changes in its values on the classifier performance.
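
     As a possible starting point for 4.3, the sketch below computes the pooled covariance
     over the three classes (the per-class sample matrices, with columns N, PRM and ARTG,
     are assumed to be supplied in a list) and applies a small symmetric perturbation to its
     entries, so that the Mahalanobis classifier can be re-evaluated with the perturbed
     matrix; the perturbation scheme is only one of several reasonable choices.

     import numpy as np

     def pooled_cov(class_samples):
         """Pooled covariance of c classes: sum_k (n_k - 1) C_k / (n - c)."""
         n = sum(len(X) for X in class_samples)
         c = len(class_samples)
         S = sum((len(X) - 1) * np.cov(X, rowvar=False) for X in class_samples)
         return S / (n - c)

     def perturbed(C, rel=0.01, seed=0):
         """Return C with small, symmetric, relative perturbations of its entries."""
         rng = np.random.default_rng(seed)
         E = rng.normal(0.0, rel, size=C.shape) * C
         return C + (E + E.T) / 2

     # Train and score the Mahalanobis classifier with pooled_cov(...) and again with
     # perturbed(pooled_cov(...)); the change in error rate indicates the sensitivity.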

4.4  Consider the problem of classifying cardiotocograms (CTG dataset) into three classes:
     N (normal), S (suspect) and P (pathological).
     a)  Determine which features are most discriminative and appropriate for a
         Mahalanobis classifier approach for this problem.