4.5 Classifier Evaluation 129
the expected value of this error estimate is similar to the one obtained with the
leave-one-out method, with a variance similar to the one obtained with the
resubstitution method. The bootstrap method combines, therefore, the best qualities
of both methods.
Statistical software products such as SPSS and Statistica allow the selection of
the cases used for training and for testing linear discriminant classifiers. With SPSS
it is possible to use a selection variable, easing the task of specifying randomly
selected samples. With Statistica, one can initially select the cases used for training
(Select option in the toolbar Options menu), and once the classifier is designed,
specify test cases (Select Cases button in the results form).
For the two-class cork stoppers classifier, with two features, presented in section
4.1.3 (classification matrix shown in Figure 4.9), using a partition method with
k = 3, a test set estimate of Pet = 9.9% was obtained, which is near the training set
error estimate of 10%. The leave-one-out method also produces Pet = 10%. The
closeness of these figures is an indication of reliable error estimation.
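The partition and leave-one-out estimates mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the nearest-class-mean rule stands in for the linear discriminant of the text, the function names are hypothetical, and the two-feature data are synthetic, not the cork stoppers data. Leave-one-out is treated as the partition method with k equal to the number of cases.

```python
import random


def fit_class_means(X, y):
    """Stand-in classifier (assumption): store the mean feature vector of each class."""
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, c in zip(X, y) if c == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict(means, x):
    """Assign x to the class whose mean is nearest (squared Euclidean distance)."""
    return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, means[c])))


def partition_error(X, y, k, seed=0):
    """Use each of the k parts once as test set; return the mean test error and the per-part errors."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    parts = [idx[i::k] for i in range(k)]
    errors = []
    for test_idx in parts:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        means = fit_class_means([X[i] for i in train_idx], [y[i] for i in train_idx])
        wrong = sum(predict(means, X[i]) != y[i] for i in test_idx)
        errors.append(wrong / len(test_idx))
    return sum(errors) / k, errors


def make_two_class_data(n_per_class, separation, seed=1):
    """Synthetic two-class, two-feature data (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((1, 0.0), (2, separation)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


X, y = make_two_class_data(50, separation=2.0)
pe_partition, per_part = partition_error(X, y, k=3)   # 100 cases: about 67 design, 33 test
pe_loo, _ = partition_error(X, y, k=len(X))           # leave-one-out: k = n
print(f"partition (k=3): {pe_partition:.3f}, leave-one-out: {pe_loo:.3f}")
```

When the two printed figures are close to each other and to the training set error, the situation is the one described in the text as an indication of reliable error estimation.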
It is also possible to assess whether there is a significant difference between test
set and design set estimates of the class errors by using a standard statistical test
based on 2x2 contingency tables.
For this purpose let us denote:
nd: number of design patterns;
nt: number of test patterns;
kd: number of wrongly classified patterns in the design set;
kt: number of wrongly classified patterns in the test set.
Let us now compute the following quantity:
Then, provided that nd, nt, nd - kd, nt - kt are all greater than 5, the quantity a has
a chi-square distribution with one degree of freedom. The test must be applied to
the classes individually, unless the same number of patterns and error rates occur.
Let us see how this works for the cork stoppers classification with errors estimated
by the previous partition method, with nd = 67 patterns for design and nt = 33 patterns
for testing. For class ω1, in one run of the partition method, kd = 6 patterns of the
design set were misclassified and kt = 3 patterns of the test set were misclassified.
The value of a=0.00017 is therefore obtained, which, looking at the chi-square
tables, indicates a non-significant difference at a 95% confidence level.
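A test of this kind can be sketched in a few lines. The sketch below uses the standard Pearson chi-square statistic for a 2x2 table (rows: design set, test set; columns: misclassified, correctly classified). That choice is an assumption: it may differ in detail from the quantity a used in the text, so its numerical value need not coincide with the one quoted above. The function name is hypothetical; 3.841 is the 95% critical value of the chi-square distribution with one degree of freedom.

```python
def chi_square_2x2(n_d, k_d, n_t, k_t):
    """Pearson chi-square statistic for the table [[k_d, n_d - k_d], [k_t, n_t - k_t]]."""
    n = n_d + n_t
    errors = k_d + k_t          # column total: misclassified patterns
    correct = n - errors        # column total: correctly classified patterns
    if errors == 0 or correct == 0:
        raise ValueError("statistic undefined when a column total is zero")
    cross = k_d * (n_t - k_t) - k_t * (n_d - k_d)
    return n * cross ** 2 / (n_d * n_t * errors * correct)


CHI2_CRIT_95_1DF = 3.841  # 95% critical value, one degree of freedom

# Figures from the cork stoppers example in the text.
stat = chi_square_2x2(n_d=67, k_d=6, n_t=33, k_t=3)
significant = stat > CHI2_CRIT_95_1DF
print(f"statistic = {stat:.6f}, significant at 95%: {significant}")
```

The statistic is far below the critical value, so under this assumed form the conclusion is the same as in the text: the difference between design set and test set errors is not significant.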
When presenting error estimates, it is convenient to also present the respective
confidence intervals. For the two-class cork stoppers classifier, a 95 % confidence
interval of [4%, 16%] is obtained using formula (4-30). As already discussed in
section 4.2.4, this formula usually yields intervals that are too large. More realistic
intervals can be obtained using the variance of the Pet, computed by a partition
method for a reasonable number of partitions (say, above 5).
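Both kinds of interval can be sketched as follows. Formula (4-30) is not reproduced on this page, so the first function assumes it is the usual normal approximation to the binomial; with Pe = 10% and n = 100 patterns (an assumption, taken as 67 + 33) it gives roughly the [4%, 16%] quoted above. The second function is one possible reading of the partition-based interval, namely mean plus or minus 1.96 sample standard deviations of the per-partition estimates. The function names and the per-partition values are hypothetical.

```python
import math
import statistics


def normal_approx_interval(pe, n, z=1.96):
    """Interval pe +/- z * sqrt(pe * (1 - pe) / n), clipped to [0, 1]."""
    half = z * math.sqrt(pe * (1.0 - pe) / n)
    return max(0.0, pe - half), min(1.0, pe + half)


def partition_based_interval(estimates, z=1.96):
    """Interval from the spread of per-partition test set estimates (assumed form)."""
    mean = statistics.mean(estimates)
    sd = statistics.stdev(estimates)  # sample standard deviation
    return max(0.0, mean - z * sd), min(1.0, mean + z * sd)


low, high = normal_approx_interval(0.10, 100)
print(f"normal approximation: [{low:.3f}, {high:.3f}]")

# Hypothetical test set estimates from six partitions (not data from the text).
pet_runs = [0.09, 0.12, 0.10, 0.08, 0.11, 0.10]
p_low, p_high = partition_based_interval(pet_runs)
print(f"partition based: [{p_low:.3f}, {p_high:.3f}]")
```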