
4.5 Classifier Evaluation

Influence of finite design set

The estimate of the design set error will depend on the particular sample
distributions in both classes. For normal distributions, the design set error is
influenced by the deviation of the sample means and covariances, computed with n
design samples, from the true values, resulting in:
$$E[\hat{P}e_d(n)] \approx Pe + \frac{v}{n}; \qquad \text{(4-46a)}$$

Therefore, the variance is zero, but there is a bias $v/n$, where $v$ is constant for
the same classifier and $n$ is the number of design samples used. For the linear
normal classifier the bias is approximately proportional to $d/n$. For the quadratic
normal classifier the bias is approximately proportional to $d^2/n$, so it grows
quite fast with $d$. This makes the quadratic classifier more sensitive to parameter
estimation errors than the linear one.
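
The difference between the two growth rates can be made concrete with a small sketch. It only tabulates the two quantities the bias is said to be proportional to; the proportionality constants are not given in the text, so they are assumed to be 1 here, and the function name `bias_scale` and the value of `n` are hypothetical choices made for illustration.

```python
def bias_scale(d, n, classifier="linear"):
    """Quantity the design-set bias is roughly proportional to.

    Assumption: the unknown proportionality constant is taken as 1,
    so only the relative growth with d and n is meaningful.
    """
    if classifier == "linear":
        return d / n          # linear normal classifier: ~ d/n
    if classifier == "quadratic":
        return d ** 2 / n     # quadratic normal classifier: ~ d^2/n
    raise ValueError(f"unknown classifier type: {classifier!r}")


if __name__ == "__main__":
    n = 200  # hypothetical number of design samples
    for d in (2, 4, 8, 16):
        lin = bias_scale(d, n, "linear")
        quad = bias_scale(d, n, "quadratic")
        print(f"d={d:2d}  linear ~ {lin:.3f}  quadratic ~ {quad:.3f}")
```

Under these assumptions, doubling the dimension doubles the linear term but quadruples the quadratic one, so the quadratic classifier needs roughly d times more design samples to keep the same bias scale.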

When influences from both the finite design set and the finite test set are taken
into account, it is verified that the bias is only influenced by the design set as stated
in (4-46a), and the variance is given by:

Pe2 (4 - Pe2 (4 + v[~e, (n)] . (4-47)
n2

The last term on the right-hand side is nearly zero for the linear classifier. The
variance is thus dominated by the first two terms. These are influenced by the bias
of the design set. However, this influence is minimal and can be neglected.
Briefly:

- The bias is predominantly influenced by the finiteness of the design set;
- The variance is predominantly influenced by the finiteness of the test set.
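
The second point can be checked with a short simulation sketch. It assumes a fixed, already designed classifier whose true error probability is `pe`, so that each test pattern is misclassified independently with that probability; the number of test errors is then binomial and the estimate has variance `pe * (1 - pe) / n_test`. That binomial expression is the standard result for a fixed classifier and is used here as an assumption rather than taken from the formulas of this passage. The function names, the value `pe = 0.1` and the test set sizes are hypothetical.

```python
import random
import statistics


def simulate_test_error_estimates(pe, n_test, trials, seed=0):
    """Simulate error estimates of a fixed classifier on many test sets.

    Assumption: each of the n_test patterns is misclassified
    independently with probability pe (the true error).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        errors = sum(1 for _ in range(n_test) if rng.random() < pe)
        estimates.append(errors / n_test)
    return estimates


def summarize(pe=0.1, trials=2000, sizes=(25, 100, 400)):
    """Return (n_test, mean, empirical variance, binomial variance) rows."""
    rows = []
    for n_test in sizes:
        est = simulate_test_error_estimates(pe, n_test, trials)
        rows.append((
            n_test,
            statistics.mean(est),
            statistics.pvariance(est),
            pe * (1 - pe) / n_test,
        ))
    return rows


if __name__ == "__main__":
    for n_test, mean, var, expected in summarize():
        print(f"n_test={n_test:4d}  mean={mean:.4f}  "
              f"variance={var:.6f}  binomial={expected:.6f}")
```

In this sketch the mean of the estimates stays close to `pe` for every test set size, while the spread shrinks as the test set grows, which is the behaviour summarized in the two points above.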

In normal practice we only have a pattern set X with n samples available. The
problem arises of how to divide the available patterns into design set and test set.
The following alternatives are possible:

Resubstitution method

The whole set X is used for design, and also for testing the classifier. As a
consequence of the non-independence of design and test sets, the method yields, on
average, an optimistic estimate of the error, corresponding to the estimate
$E[\hat{P}e_d(n)]$ mentioned in section 4.2.4. For the two-class linear discriminant with
