Page 202 - Introduction to Statistical Pattern Recognition




                                 $E\{\hat{f}\} \cong f + v\,g(N)$ ,                                  (5.6)
                       where $v = \mathrm{tr}\{(\partial^2 f/\partial Y^2)\,K(Y)\}/2$ is independent of N and is treated as a
                       constant determined by the underlying problem.  This leads to the following
                       procedure to estimate f.

                            Change the sample size N as $N_1, N_2, \ldots, N_t$.  For each $N_i$, compute $\hat{Y}$
                            and subsequently $\hat{f}$ empirically.  Repeat the experiment $\tau$ times, and
                            approximate $E\{\hat{f}\}$ by the sample mean of the $\tau$ experimental results.

                            Plot these empirical points $E\{\hat{f}\}$ vs. g(N).  Then, find the line best fitted
                            to these points.  The slope of this line is v and the y-intercept is the
                            improved estimate of f.  There are many possible ways of selecting a
                            line.  The standard procedure would be the minimum mean-square error
                            approach.
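
The procedure can be illustrated with a minimal sketch on a toy problem that is not taken from the text. It assumes a one-dimensional normal distribution with mean mu and standard deviation sigma, takes f = mu^2, and uses the squared sample mean as the estimate of f, for which the expected value is f + sigma^2/N, so that g(N) = 1/N and v = sigma^2. The function names, sample sizes, and number of trials are hypothetical choices, and the line is fitted by ordinary least squares as one possible minimum mean-square error approach.

```python
import numpy as np

def mean_estimate(n, n_trials, rng, mu=1.0, sigma=2.0):
    """Approximate E{f_hat} for f_hat = (sample mean)^2 using n_trials experiments."""
    # Each row is one experiment of n samples (toy distribution, assumed here).
    draws = rng.normal(mu, sigma, size=(n_trials, n))
    f_hat = draws.mean(axis=1) ** 2      # plug-in estimate of f = mu^2 per experiment
    return f_hat.mean()                  # sample mean over the experiments

def extrapolate(sample_sizes, mean_estimates, g):
    """Fit a least-squares line to (g(N_i), mean estimate); return (intercept, slope)."""
    x = np.array([g(n) for n in sample_sizes], dtype=float)
    y = np.asarray(mean_estimates, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 fit: [slope, intercept]
    return intercept, slope

rng = np.random.default_rng(0)
sample_sizes = [10, 20, 40, 80, 160]
means = [mean_estimate(n, 4000, rng) for n in sample_sizes]

# The intercept is the improved estimate of f; the slope estimates v.
f_improved, v = extrapolate(sample_sizes, means, g=lambda n: 1.0 / n)
print(f"improved estimate of f: {f_improved:.3f}, slope v: {v:.3f}")
```

In this toy setting the true values are f = 1 and v = 4, so the fitted intercept and slope can be compared against them directly. For other problems the form of g(N) has to be known or assumed before the line is fitted.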




                       Parametric Formulation

                            Moments  of  parameters:  In  the  parametric  approach,  most  of  the
                       expressions we  would  like  to estimate are functions of  expected vectors and
                       covariance matrices.  In this section, we will  show how the general discussion
                       of the previous section can be applied to this particular family of parameters.
                            Assume that N samples are drawn from each of  two n-dimensional nor-
                       mal distributions with expected vectors and covariance matrices given by

                                 $M_1 = 0, \quad M_2 = M, \quad \Sigma_1 = I, \quad \Sigma_2 = \Lambda$ .              (5.7)

                       Without loss of generality, any two covariance matrices can be simultaneously
                       diagonalized to I and $\Lambda$, and a coordinate shift can bring the expected vector of
                       one class to zero.  Normality is assumed here for simplicity.  However, the
                       discussion can be extended to non-normal cases easily.  The extension will be
                       presented at the end of this section.  Also, $N_1 = N_2 = N$ is assumed here.  For
                       $N_1 \neq N_2$, similar discussion could be developed, although the results are a