withheld and a classifier is trained using all the data from the remaining $L-1$ subsets. This classifier is tested using $T^{(1)}$ as validation data, yielding an estimate $\hat{E}^{(1)}$. This procedure is repeated for all other subsets, thus resulting in $L$ estimates $\hat{E}^{(\ell)}$. The last step is the training of the final classifier using all available data. Its error rate is estimated as the average over the $\hat{E}^{(\ell)}$. This estimate is a little pessimistic, especially if $L$ is small.
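The scheme is straightforward to implement. Below is a minimal MATLAB sketch; the function handles trainclf and testclf are hypothetical placeholders (not PRTools routines): trainclf(X, y) is assumed to return a trained classifier, and testclf(clf, X, y) its error rate on the given validation data.

% L-fold cross-validation error estimate (sketch; trainclf and testclf
% are hypothetical user-supplied function handles, see the text above)
function [E_hat, clf] = crossval_error(X, y, L, trainclf, testclf)
N_S = size(X, 1);                  % number of available samples
fold = mod(randperm(N_S), L) + 1;  % random partition into L subsets T(1)..T(L)
E = zeros(L, 1);
for l = 1:L
    val = (fold == l);             % subset T(l) is withheld as validation data
    clf_l = trainclf(X(~val,:), y(~val));      % train on the other L-1 subsets
    E(l) = testclf(clf_l, X(val,:), y(val));   % yields the estimate E(l)
end
E_hat = mean(E);                   % error rate: average over the L estimates
clf = trainclf(X, y);              % final classifier trained on all data
end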
The leave-one-out method is essentially the same as the cross-validation method, except that now $L$ is as large as possible, i.e. equal to $N_S$ (the number of available samples). The bias of the leave-one-out method is negligible, especially if the training set is large. However, it requires the training of $N_S + 1$ classifiers, and it is computationally very expensive if $N_S$ is large.
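In terms of the sketch above, leave-one-out is simply the special case $L = N_S$ (again using the hypothetical handles trainclf and testclf):

% leave-one-out: every subset T(l) contains exactly one sample
[E_loo, clf] = crossval_error(X, y, size(X, 1), trainclf, testclf);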
5.6 EXERCISES
1. Prove that if $C$ is the covariance matrix of a random vector $z$, then $\frac{1}{N}C$ is the covariance matrix of the average of $N$ realizations of $z$. (0)
2. Show that (5.16) and (5.17) are equivalent. ( )
3. Prove that, for the two-class case, (5.50) is equivalent to the Fisher linear discriminant
(6.52). ( )
4. Investigate the behaviour (bias and variance) of the estimators for the conditional probabilities of binary measurements, i.e. (5.20) and (5.22), at the extreme ends. That is, if $N_k \ll 2^N$ and $N_k \gg 2^N$. ( )