
            withheld and a classifier is trained using all the data from the remaining
            $L-1$ subsets. This classifier is tested using $T^{(1)}$ as validation data,
            yielding an estimate $\hat{E}^{(1)}$. This procedure is repeated for all other sub-
            sets, thus resulting in $L$ estimates $\hat{E}^{(\ell)}$. The last step is the training of the
            final classifier using all available data. Its error rate is estimated as the
            average over the $\hat{E}^{(\ell)}$. This estimate is a little pessimistic, especially if $L$ is small.
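            A minimal MATLAB sketch of this procedure is given below. The
            function handles trainfun and classifyfun are hypothetical stand-ins
            for any training and classification routine (they are not part of the
            book's own code); the random partitioning is likewise an assumption.

              function [Ehat, finalClassifier] = crossvalError(X, y, L, trainfun, classifyfun)
              % CROSSVALERROR  L-fold cross-validation estimate of the error rate.
              %   X   Ns-by-D matrix of measurement vectors (one sample per row)
              %   y   Ns-by-1 vector of class labels
              %   L   number of subsets T(1)..T(L)
              Ns   = size(X, 1);
              fold = mod(randperm(Ns) - 1, L) + 1;   % random partition into L subsets
              E    = zeros(L, 1);
              for ell = 1:L
                  test  = (fold == ell);             % withhold subset T(ell)
                  train = ~test;
                  w     = trainfun(X(train,:), y(train));   % train on remaining L-1 subsets
                  yhat  = classifyfun(w, X(test,:));
                  E(ell) = mean(yhat ~= y(test));    % error estimate E^(ell) on T(ell)
              end
              Ehat = mean(E);                        % average over the E^(ell)
              finalClassifier = trainfun(X, y);      % final classifier uses all data
              end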
              The leave-one-out method is essentially the same as the cross-
            validation method except that now $L$ is as large as possible, i.e. equal
            to $N_S$ (the number of available samples). The bias of the leave-one-out
            method is negligible, especially if the training set is large. However, it
            requires the training of $N_S + 1$ classifiers and it is computationally very
            expensive if $N_S$ is large.
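              Since leave-one-out is simply the special case $L = N_S$, the sketch
            above can be reused directly; the assumed interface is unchanged:

              % leave-one-out: one subset per sample, hence Ns+1 trainings in total
              [Ehat, w] = crossvalError(X, y, size(X,1), trainfun, classifyfun);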




            5.6   EXERCISES


            1. Prove that if $\mathbf{C}$ is the covariance matrix of a random vector $\mathbf{z}$, then $\frac{1}{N}\mathbf{C}$ is the
              covariance matrix of the average of $N$ realizations of $\mathbf{z}$. (0)
            2. Show that (5.16) and (5.17) are equivalent. ( )
            3. Prove that, for the two-class case, (5.50) is equivalent to the Fisher linear discriminant
              (6.52). (  )
            4. Investigate the behaviour (bias and variance) of the estimators for the conditional
              probabilities of binary measurements, i.e. (5.20) and (5.22), at the extreme ends. That
              is, if $N_k \ll \frac{N}{2}$ and $N_k \gg \frac{N}{2}$. (  )