
80     4 Statistical Classification


   As seen in previous chapters, it seems reasonable to take those sample means as class prototypes and assign each cork stopper to its nearest prototype. This is the essence of what is called the minimum distance or template matching classification. The classification rule is then:


   If  ‖x − [55.28]‖ < ‖x − [79.74]‖  then  x ∈ ω₁  else  x ∈ ω₂.          (4-1)

   We assume, for the moment, that a Euclidean metric is used. Using the value at half distance from the means, we can rewrite (4-1) as:

   If  x < 67.51  then  x ∈ ω₁  else  x ∈ ω₂.                              (4-1a)

   The separating "hyperplane" is simply the point 67.51. Note that in the equality case (x = 67.51) the class assignment is arbitrary (ω₂ is a possibility, as in 4-1a).
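This rule can be sketched in a few lines of Python. The sample means 55.28 and 79.74 are the values quoted above; both the nearest-prototype form (4-1) and the threshold form (4-1a) are shown, and ties at the threshold go to ω₂ as in (4-1a):

```python
# Sample means of feature N for the two cork-stopper classes (from the text)
mean_w1, mean_w2 = 55.28, 79.74

# Decision threshold at half distance between the means
threshold = (mean_w1 + mean_w2) / 2  # = 67.51

def classify_threshold(x):
    """Rule (4-1a): x < 67.51 -> class omega_1, otherwise omega_2."""
    return 1 if x < threshold else 2

def classify_min_dist(x):
    """Rule (4-1): assign x to the class of the nearest prototype
    (Euclidean metric; in one dimension this is just |x - mean|)."""
    return 1 if abs(x - mean_w1) < abs(x - mean_w2) else 2
```

For any x the two forms agree, since comparing distances to the two means in one dimension is equivalent to comparing x with their midpoint.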
   Let us now evaluate the performance of this simple classifier by computing the error rate in the training set of n = 50 cases per class. Figure 4.2 shows the classification matrix (obtained with STATISTICA) with the predicted classifications along the columns and the true (observed) classifications along the rows.

Figure 4.2. Classification matrix of two classes of cork stoppers using only one feature, N. Both classes have equal probability of occurrence ("p = .5" in the listed classification matrix).



   We see that for this simple classifier the overall percentage of correct classification in the training set is 77%, or equivalently, the overall training set error is 23% (18% for ω₁ and 28% for ω₂). For the moment we will not assess how the classifier performs with independent patterns, i.e., we will not assess its test set error.
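Since both classes have equal prior probability (p = .5), the overall error is simply the average of the per-class errors. A quick check of the figures quoted above:

```python
# Per-class training-set errors reported in the text
err_w1, err_w2 = 0.18, 0.28

# Equal prior probabilities for the two classes
p_w1 = p_w2 = 0.5

# Overall error is the prior-weighted average of the class errors
overall_error = p_w1 * err_w1 + p_w2 * err_w2  # = 0.23
overall_correct = 1 - overall_error            # = 0.77
```

With unequal priors, the same weighted average would apply with the corresponding p values.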
   Let us now use one more feature: PRT10 = PRT/10. The feature vector is:



¹ We assume an underlying real domain for the ordinal feature N. Conversion to an ordinal will be performed when needed. For instance, the practical threshold of the classifier in (4-1a) would be 68.