
108    4 Statistical Classification
Figure 4.26. Two-class linear discriminant E[Ped(n)] and E[Pet(n)] curves, for d = 7 and δ² = 3, below and above the dotted line, respectively. The dotted line represents the Bayes error (0.193).





For precise criteria concerning the deviation of the expected values of Ped(n) and Pet(n) from Pe, the magnitude of the standard deviations, and therefore the 95% confidence interval of the estimates, it is advisable to use the PRSize program. If the patterns are not equally distributed among the classes, it is advisable to use the smaller number of patterns per class as the value of n. Notice also that a multi-class problem with absolute separation of the classes can be seen as a generalization of a two-class problem (see section 2.1.2). Therefore, the total number of training samples needed for a given deviation of the expected error estimates from the Bayes error can be estimated as cn*, where n* is the particular value of n that achieves such a deviation in the most unfavourable two-class dichotomy of the multi-class problem. If a hierarchical approach is followed, one can use the estimate (c-1)n* instead.



                                 4.3  Model-Free Techniques

The classifiers presented in the previous sections assumed particular shapes of the pattern clusters and sometimes also particular distributions of the feature vectors. Briefly, a certain model of the distribution of the feature vectors in the feature space was assumed. In this section we present three important model-free techniques for designing classifiers. These methods do not make any assumptions about the underlying pattern distributions. They are often called non-parametric methods, although at least some of them would be better called semi-parametric. Although all of these methods are model-free, their tuning to the particular distributions of the feature vectors is still based on statistical considerations.