
18     1 Basic Notions


        there is interdependence of the solutions adopted at each unit level. For instance,
        the type of pattern acquisition used may influence the choice of features, and
        therefore the other units as well. Other influences are more subtle: for instance, the
        type of pre-processing performed on the features input to a neural net may
        influence the overall performance in a way that is difficult to foresee.
          A PR project has to consider all the tasks mentioned and evolves, schematically,
        through the phases shown in Figure 1.13.



        1.5.2 Training and Testing

        As mentioned in the previous section, the development of a PR application starts
        with the evaluation of the type of features to be used and of the PR approach
        appropriate for the problem at hand. For this purpose an initial set of patterns is
        usually available. In the supervised approaches this initial set, represented by n
        d-dimensional feature vectors or n strings built with d primitives, is used for
        developing the PR kernel. It constitutes the training set.
          The performance of a PR system is usually evaluated in terms of error rates for
        all classes and an overall error rate. When this performance evaluation is based on
        the patterns of the training set we obtain, on average, optimistic figures. This
        somewhat intuitive result will be further clarified in later chapters. In order to
        obtain better estimates of a PR system's performance it is indispensable to evaluate it
        using an independent set of patterns, i.e., patterns not used in its design. This
        independent set of patterns is called a test set. Test set estimates of a PR system's
        performance give us an idea of how well the system is capable of generalizing its
        recognition abilities to new patterns.
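
        The difference between training set and test set estimates can be illustrated
        with a minimal sketch. Everything in it is an assumption made for illustration
        only: the two overlapping Gaussian classes, the nearest-neighbour rule standing
        in for the PR kernel, and the hypothetical helper names make_patterns,
        classify_1nn and error_rates. It reports the error rate for each class and the
        overall error rate, first on the training set and then on an independent test set.

```python
import random

def make_patterns(n_per_class, rng):
    """Draw 2-dimensional feature vectors for two overlapping classes (synthetic data)."""
    patterns = []
    for label, centre in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            x = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            patterns.append((x, label))
    return patterns

def classify_1nn(x, training_set):
    """Assign x the label of its nearest training pattern (squared Euclidean distance)."""
    nearest = min(training_set,
                  key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])))
    return nearest[1]

def error_rates(patterns, training_set):
    """Return ({class: error rate}, overall error rate) for the given patterns."""
    errors, counts = {}, {}
    for x, label in patterns:
        counts[label] = counts.get(label, 0) + 1
        if classify_1nn(x, training_set) != label:
            errors[label] = errors.get(label, 0) + 1
    per_class = {c: errors.get(c, 0) / counts[c] for c in counts}
    overall = sum(errors.values()) / len(patterns)
    return per_class, overall

rng = random.Random(0)                 # fixed seed so the run is repeatable
training_set = make_patterns(50, rng)  # patterns used to design the classifier
test_set = make_patterns(200, rng)     # independent patterns, not used in the design

train_per_class, train_overall = error_rates(training_set, training_set)
test_per_class, test_overall = error_rates(test_set, training_set)
print("training set:", train_per_class, train_overall)
print("test set:    ", test_per_class, test_overall)
```

        With the nearest-neighbour rule the training set error is zero by construction,
        since every training pattern is its own nearest neighbour. This is an extreme
        case of an optimistic figure; only the test set estimate says something about
        generalization to new patterns.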
          For classification and regression systems the degree of confidence we may have
        in estimates of a PR system's performance, as well as in its capability of
        generalization, depends strongly on the n/d ratio (number of patterns over number
        of features), the dimensionality ratio.
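
        One way to get a feeling for the role of the dimensionality ratio is a small
        simulation. The sketch below is not taken from the text and rests on several
        assumptions: synthetic Gaussian data in which only the first feature carries
        class information, a nearest-class-mean rule as the classifier, and the
        hypothetical helper names draw, class_means, error_rate and mean_optimism. For
        a fixed number of training patterns n it measures how much, on average, the
        training set error underestimates the test set error as d grows.

```python
import random

def draw(n_per_class, d, rng, separation=2.0):
    """n_per_class patterns per class; only feature 0 carries class information."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(d)]
            x[0] += separation * label
            data.append((x, label))
    return data

def class_means(training_set, d):
    """Estimate the mean feature vector of each class from the training set."""
    means = {}
    for label in (0, 1):
        members = [x for x, lab in training_set if lab == label]
        means[label] = [sum(x[j] for x in members) / len(members) for j in range(d)]
    return means

def error_rate(patterns, means):
    """Overall error rate of the nearest-class-mean rule."""
    def dist2(x, m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    wrong = sum(1 for x, label in patterns
                if min(means, key=lambda c: dist2(x, means[c])) != label)
    return wrong / len(patterns)

def mean_optimism(d, n_per_class=10, n_test_per_class=100, trials=100, seed=1):
    """Average (test error - training error) over repeated experiments."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        training_set = draw(n_per_class, d, rng)
        test_set = draw(n_test_per_class, d, rng)
        means = class_means(training_set, d)
        total += error_rate(test_set, means) - error_rate(training_set, means)
    return total / trials

n = 20  # total number of training patterns (10 per class)
for d in (1, 20):
    print(f"d = {d:2d}  n/d = {n / d:5.1f}  mean optimism = {mean_optimism(d):.3f}")
```

        Under these assumptions the average gap between test set and training set error
        should be noticeably larger for d = 20 (n/d = 1) than for d = 1 (n/d = 20),
        which is one sense in which a low dimensionality ratio reduces the confidence
        we may place in performance estimates.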


        1.5.3 PR Software

        There are many software products for developing PR applications, which can guide
        the design of a PR system from the early specification stages until the final
        evaluation. A simple search of the Internet will reveal many of these
        products and tools, whether freeware, shareware or commercial. Many of these are
        method-specific, for instance in the neural networks area. Generally speaking, the
        following types of software products can be found:

        1. Tool libraries (e.g. in C) for use in the development of application software.
        2. Tools running under other software products (e.g. Microsoft Excel or The
          MathWorks MATLAB).
        3. Products for didactic purposes.
        4. Products for the design of PR applications using a specific method.
        5. Products for the design of PR applications using a panoply of different methods.