there is interdependence of the solutions adopted at each unit level. For instance,
the type of pattern acquisition used may influence the choice of features, and
therefore the other units as well. Other influences are more subtle: for instance, the
type of pre-processing performed on the features input to a neural net may
influence the overall performance in a way that is difficult to foresee.
A PR project has to consider all the mentioned tasks and evolves in a schematic
way through the phases shown in Figure 1.13.
1.5.2 Training and Testing
As mentioned in the previous section, the development of a PR application starts
with the evaluation of the type of features to be used and of the appropriate PR
approach for the problem at hand. For this purpose, an initial set of patterns is usually
available. In the supervised approaches this initial set, represented by n
d-dimensional feature vectors or n strings built with d primitives, is used for
developing the PR kernel. It constitutes the training set.
The performance of a PR system is usually evaluated in terms of error rates for
all classes and an overall error rate. When this performance evaluation is based on
the patterns of the training set we obtain, on average, optimistic figures. This
somewhat intuitive result will be further clarified in later chapters. In order to
obtain better estimates of a PR system's performance it is indispensable to evaluate it
using an independent set of patterns, i.e., patterns not used in its design. This
independent set of patterns is called a test set. Test set estimates of a PR system's
performance give us an idea of how well the system is capable of generalizing its
recognition abilities to new patterns.
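
The distinction between training set and test set estimates can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the text: the synthetic two-class Gaussian data, the nearest-class-mean classifier standing in for the "PR kernel", the half-and-half split, and all function names. It reports the error rate for each class and the overall error rate, first on the patterns used for design and then on independent patterns.

```python
import random


def make_patterns(n_per_class, d, seed):
    """Synthetic data (assumption): class k has Gaussian features centred at k."""
    rng = random.Random(seed)
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(float(label), 1.0) for _ in range(d)]
            data.append((x, label))
    rng.shuffle(data)
    return data


def train_nearest_mean(training_set):
    """Design step: compute one mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(means, x):
    """Assign x to the class whose mean is closest (squared Euclidean distance)."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(x, mean))
    return min(means, key=lambda label: sq_dist(means[label]))


def error_rates(means, patterns):
    """Return (error rate per class, overall error rate) on the given patterns."""
    errors, totals = {}, {}
    for x, label in patterns:
        totals[label] = totals.get(label, 0) + 1
        if classify(means, x) != label:
            errors[label] = errors.get(label, 0) + 1
    per_class = {c: errors.get(c, 0) / totals[c] for c in totals}
    overall = sum(errors.values()) / len(patterns)
    return per_class, overall


# n = 200 patterns with d = 2 features; half for design, half kept independent.
patterns = make_patterns(n_per_class=100, d=2, seed=0)
split = len(patterns) // 2
training_set, test_set = patterns[:split], patterns[split:]

means = train_nearest_mean(training_set)
train_per_class, train_overall = error_rates(means, training_set)
test_per_class, test_overall = error_rates(means, test_set)

print("training set:", train_per_class, train_overall)
print("test set:    ", test_per_class, test_overall)
```

For a single random split the training set figure is not guaranteed to be the lower one; the optimism holds on average, so repeating the experiment with different seeds gives a fairer picture.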
For classification and regression systems the degree of confidence we may have
in estimates of a PR system's performance, as well as in its capability of
generalization, depends strongly on the dimensionality ratio n/d, i.e., the number
of available patterns divided by the number of features.
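
As a small worked example, the dimensionality ratio of a data set stored as a list of feature vectors could be computed as below. The function name and the sample sizes (150 patterns, 4 features) are hypothetical and chosen only for illustration.

```python
def dimensionality_ratio(feature_vectors):
    """Return n/d for a list of n feature vectors that all have d features."""
    n = len(feature_vectors)
    if n == 0:
        raise ValueError("at least one pattern is required")
    d = len(feature_vectors[0])
    if d == 0 or any(len(x) != d for x in feature_vectors):
        raise ValueError("all patterns must share the same non-zero dimension")
    return n / d


# Hypothetical training set: n = 150 patterns, d = 4 features each.
training_vectors = [[0.0] * 4 for _ in range(150)]
ratio = dimensionality_ratio(training_vectors)
print(ratio)  # 37.5
```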
1.5.3 PR Software
There are many software products for developing PR applications, which can guide
the design of a PR system from the early stages of the specifications until the final
evaluation. A simple Internet search will reveal many of these
products and tools, whether freeware, shareware or commercial. Many of these are
method-specific, for instance in the neural networks area. Generally speaking, the
following types of software products can be found:
1. Tool libraries (e.g. in C) for use in the development of application software.
2. Tools running under other software products (e.g. Microsoft Excel or
MathWorks MATLAB).
3. Products for didactic purposes.
4. Products for the design of PR applications using a specific method.
5. Products for the design of PR applications using a panoply of different methods.