6.6 Classifier Evaluation
Resubstitution method
The whole set S is used both for designing and for testing the classifier. As a consequence of the non-independence of the design and test sets, the method yields, on average, an optimistic estimate of the error, $E[\hat{P}e_d(n)]$, mentioned in section 6.3.3. For the two-class linear discriminant with normal distributions, an example of such an estimate for various values of n is plotted in Figure 6.15 (lower curve).
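As a rough illustration, the following R sketch computes a resubstitution error estimate for a linear discriminant; the data frame d, its factor column class and the function name resub.error are illustrative assumptions, not taken from the text (lda is supplied by the MASS package).

library(MASS)                              # lda: linear discriminant analysis

# Resubstitution estimate: design and test on the whole set S
resub.error <- function(d) {
  fit  <- lda(class ~ ., data = d)         # design on all n cases
  pred <- predict(fit, d)$class            # test on the same n cases
  mean(pred != d$class)                    # optimistically biased error estimate
}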
Holdout method
The available n samples of S are randomly divided into two disjoint sets (traditionally with 50% of the samples each), $S_d$ and $S_t$, used for design and test, respectively. The error estimate is obtained from the test set and, therefore, suffers from the bias and variance effects previously described. By taking the average over many partitions of the same size, a reliable estimate of the test set error, $E[\hat{P}e_t(n)]$, is obtained (see section 6.3.3). For the two-class linear discriminant with normal distributions, an example of such an estimate for various values of n is plotted in Figure 6.15 (upper curve).
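A possible R sketch of this procedure, under the same illustrative assumptions as before (data frame d with factor column class, MASS::lda as the classifier), averages the test set error over a number of random 50%/50% partitions; the name holdout.error and the default of 100 repetitions are arbitrary choices.

library(MASS)

# Holdout estimate averaged over n.rep random partitions
# (fraction frac of the cases for design, the remainder for test)
holdout.error <- function(d, n.rep = 100, frac = 0.5) {
  n  <- nrow(d)
  pe <- numeric(n.rep)
  for (r in 1:n.rep) {
    idx   <- sample(n, round(frac * n))    # random design set Sd
    fit   <- lda(class ~ ., data = d[idx, ])
    pred  <- predict(fit, d[-idx, ])$class # test on the disjoint set St
    pe[r] <- mean(pred != d$class[-idx])
  }
  mean(pe)                                 # average test set error estimate
}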
Partition methods
Partition methods, also called cross-validation methods, divide the available set S into a certain number of subsets, which rotate in their roles as design and test sets, as follows (an R sketch of the procedure is given after the list):
1. Divide S into k > 1 subsets of randomly chosen cases, with each subset
having n/k cases.
2. Design the classifier using the cases of k – 1 subsets and test it on the remaining one. A test set estimate $\hat{P}e_{ti}$ is thereby obtained.
3. Repeat the previous step, rotating the position of the test subset, thereby obtaining k estimates $\hat{P}e_{ti}$.
4. Compute the average test set estimate $\overline{Pe}_t = \frac{1}{k}\sum_{i=1}^{k}\hat{P}e_{ti}$ and the variance of the $\hat{P}e_{ti}$.
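The R sketch announced above implements steps 1 to 4 for the linear discriminant, again under the illustrative assumptions of a data frame d with a factor column class; the name kfold.error is hypothetical.

library(MASS)

# k-fold cross-validation of the linear discriminant (steps 1 to 4)
kfold.error <- function(d, k = 10) {
  n    <- nrow(d)
  fold <- sample(rep(1:k, length.out = n)) # step 1: k subsets of about n/k cases
  pe   <- numeric(k)
  for (i in 1:k) {                         # steps 2 and 3: rotate the test subset
    fit   <- lda(class ~ ., data = d[fold != i, ])
    pred  <- predict(fit, d[fold == i, ])$class
    pe[i] <- mean(pred != d$class[fold == i])  # test set estimate for fold i
  }
  c(mean = mean(pe), var = var(pe))        # step 4: average estimate and variance
}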
This is the so-called k-fold cross-validation. For k = 2, the method is similar to
the traditional holdout method. For k = n, the method is called the leave-one-out
method, with the classifier designed with n – 1 samples and tested on the one
remaining sample. Since only one sample is used for testing in each rotation, the variance of the error estimate is large. However, the samples are used for design in the best possible way (each classifier is designed with n – 1 of them), independently of the test sample. Therefore, the average test set error estimate will be a good estimate of the classifier error for sufficiently large n, since the bias contributed by the finiteness of the design set will be low. For other values of k, there is a compromise between the high bias and low variance of the holdout method and the low bias and high variance of the leave-one-out method, obtained with less computational effort than leave-one-out requires.
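Under the same illustrative assumptions, the leave-one-out estimate can be obtained from the k-fold sketch above simply by setting k equal to the number of available cases:

kfold.error(d, k = nrow(d))                # leave-one-out: each case is tested exactly once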