

Let us illustrate this issue using the Norm2c2d dataset (see Appendix A). The theoretical error for this two-class, two-dimensional dataset is:

    $P_e = 1 - N_{0,1}(\delta/2)$,  with  $\delta^2 = (\mu_1 - \mu_2)^T \Sigma^{-1} (\mu_1 - \mu_2)$.
The training set error estimate for this dataset is 5%. By introducing deviations of ±0.1 into the values of the transforming matrix A of this dataset, with corresponding deviations of between 15% and 42% in the covariance values, training set errors of 6% were obtained, a mild deviation from the previous 5% error rate for the equal covariance situation (see Exercise 4.9).
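This sensitivity experiment is easy to reproduce numerically. The Python sketch below computes the theoretical error $P_e = 1 - N_{0,1}(\delta/2)$ for two Gaussian classes sharing the covariance induced by a transforming matrix A, and then perturbs A by ±0.1. The means and the matrix A used here are illustrative placeholders only; the actual Norm2c2d parameters are those of Appendix A.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative placeholders (the actual Norm2c2d parameters are in Appendix A).
mu1 = np.array([0.0, 0.0])              # class 1 mean (assumed)
mu2 = np.array([2.0, 2.0])              # class 2 mean (assumed)
A = np.array([[1.0, 0.5],
              [0.0, 1.0]])              # transforming matrix (assumed)

def theoretical_error(mu1, mu2, A):
    # For x = A z + mu with z ~ N(0, I), the common covariance is C = A A'.
    C = A @ A.T
    d = mu1 - mu2
    delta = np.sqrt(d @ np.linalg.solve(C, d))   # Bayes (Mahalanobis) distance
    return 1.0 - norm.cdf(delta / 2.0)           # Pe = 1 - N_{0,1}(delta/2)

print(f"theoretical error: {theoretical_error(mu1, mu2, A):.4f}")

# Perturb every element of A by +/-0.1 and observe how the error moves.
for _ in range(5):
    A_pert = A + rng.choice([-0.1, 0.1], size=A.shape)
    print(f"perturbed error:   {theoretical_error(mu1, mu2, A_pert):.4f}")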

Figure 4.22. Partial listing of the posterior probabilities for two classes of cork stoppers.



Let us go back to the cork stoppers classification problem using two features, N and PRT, with equal prevalences. The classification matrix is shown in Figure 4.8. Note that statistical classifiers are, apart from numerical considerations, invariant to scaling operations; therefore, the same results are obtained using either PRT or PRT/10.
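This invariance is easy to check numerically. The sketch below uses synthetic stand-ins for the N and PRT features (the real cork stoppers data are described in Appendix A) and scikit-learn's LinearDiscriminantAnalysis as a stand-in classifier, training once on (N, PRT) and once on (N, PRT/10) and comparing the class assignments.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
n = 100
N = rng.normal(60.0, 15.0, n)                  # stand-in for feature N (assumed)
PRT = rng.normal(350.0, 80.0, n)               # stand-in for feature PRT (assumed)
y = (PRT + rng.normal(0.0, 40.0, n) > 350.0).astype(int)    # stand-in labels

X = np.column_stack([N, PRT])
X_scaled = np.column_stack([N, PRT / 10.0])    # PRT replaced by PRT/10

pred = LinearDiscriminantAnalysis().fit(X, y).predict(X)
pred_scaled = LinearDiscriminantAnalysis().fit(X_scaled, y).predict(X_scaled)
print("identical assignments:", np.array_equal(pred, pred_scaled))
# Expected: True, apart from possible floating-point borderline cases.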
A partial listing of the posterior probabilities, useful for spotting classification errors, is shown in Figure 4.22.
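Such a listing is straightforward to produce with a fitted discriminant. The sketch below, again on synthetic stand-in data, prints the posterior probabilities case by case and flags any case whose assigned class disagrees with the true one.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
N = rng.normal(60.0, 15.0, 100)                # stand-in features, as before
PRT = rng.normal(350.0, 80.0, 100)
y = (PRT + rng.normal(0.0, 40.0, 100) > 350.0).astype(int)
X = np.column_stack([N, PRT])

clf = LinearDiscriminantAnalysis().fit(X, y)
posteriors = clf.predict_proba(X)              # P(class | x) for every case
assigned = clf.predict(X)
for i in range(10):                            # partial listing, as in Figure 4.22
    flag = "*" if assigned[i] != y[i] else " " # '*' marks a classification error
    print(f"{i:3d}  P(w1|x) = {posteriors[i, 0]:.3f}  "
          f"P(w2|x) = {posteriors[i, 1]:.3f}  {flag}")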
The covariance matrices are shown in Table 4.2. The deviations of the covariance matrix elements from the central values of the pooled matrix lie between 5% and 30%. The cluster shapes are also similar. There are, therefore, good reasons to believe that the designed classifier is close to the optimal one.
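The comparison behind this argument can be scripted along the following lines: compute the per-class covariance matrices and the pooled matrix, then inspect the element-wise relative deviations (the counterpart of the 5% to 30% figures quoted above). The data below are synthetic stand-ins, not the cork stoppers measurements.

import numpy as np

def covariance_deviations(X, y):
    """Compare per-class covariance matrices with the pooled matrix."""
    classes = np.unique(y)
    covs = {c: np.cov(X[y == c], rowvar=False) for c in classes}
    counts = {c: int(np.sum(y == c)) for c in classes}
    # Pooled covariance: weighted average of the per-class matrices.
    pooled = sum((counts[c] - 1) * covs[c] for c in classes)
    pooled = pooled / (len(y) - len(classes))
    for c in classes:
        rel = np.abs(covs[c] - pooled) / np.abs(pooled)  # element-wise deviation
        print(f"class {c}: deviations "
              f"{100 * rel.min():.0f}% .. {100 * rel.max():.0f}%")
    return covs, pooled

# Synthetic stand-in data (not the cork stoppers measurements).
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2)) @ np.array([[15.0, 5.0],
                                          [0.0, 80.0]])
y = rng.integers(0, 2, 100)
covariance_deviations(X, y)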