Page 94 - Introduction to Statistical Pattern Recognition





                      $$E\{\zeta_i\} = \frac{n}{N-1} \tag{3.75}$$

                      $$\mathrm{Var}\{\zeta_i\} = \frac{2n}{(N-1)^2}\,\frac{1-(n+1)/N}{1+1/N} \cong \frac{2n}{(N-1)^2}. \tag{3.76}$$
                      Because $\zeta$ of (3.71) is $1/(N-1)$ times the distance, (3.75) and the right-most
                      term of (3.76) correspond to (3.60) and (3.61), respectively.
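
The correspondence can be checked with a short worked calculation. The sketch below assumes that (3.60) and (3.61), which are not reproduced on this page, give the normal-case moments $E\{d^2\} = n$ and $\mathrm{Var}\{d^2\} = 2n$ of the distance; the symbol $d^2$ is introduced here only for illustration.

```latex
\zeta = \frac{d^2}{N-1}
\;\Longrightarrow\;
E\{\zeta\} = \frac{E\{d^2\}}{N-1} = \frac{n}{N-1},
\qquad
\mathrm{Var}\{\zeta\} = \frac{\mathrm{Var}\{d^2\}}{(N-1)^2} = \frac{2n}{(N-1)^2}.
```

The remaining factor in (3.76) equals $(N-n-1)/(N+1)$, which tends to 1 when $N \gg n$.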
                           Thus, the test of normality may be performed at the following two
                      levels.
                      (1)  Compute the sample variance of $\zeta$ of (3.71), and check whether it is
                           close to (3.76) or not.  When $N \gg n$, $2n/(N-1)^2$ may be used to
                           approximate (3.76).

                      (2)  Plot the empirical distribution function of $\zeta$ by using
                           $\zeta_1(X_1), \ldots, \zeta_N(X_N)$ and the theoretical distribution
                           function from (3.73), and apply the Kolmogorov-Smirnov test [10].
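
The two levels above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's procedure:

- It is restricted to n = 2 so that the standard library suffices.
- It assumes the sample covariance in (3.71) is normalized by 1/N.
- It assumes (3.73), which is not reproduced on this page, is the beta distribution with parameters n/2 and (N-n-1)/2. For n = 2 its distribution function has the closed form 1 - (1 - ζ)^((N-3)/2).
- The function names `zeta_values`, `var_zeta_theory`, and `ks_statistic_n2` are hypothetical.

```python
import math
import random


def zeta_values(samples):
    """Return zeta_i = d_i^2 / (N - 1) for 2-D samples.

    Assumption: the sample covariance is normalized by 1/N.
    """
    N = len(samples)
    m0 = sum(x[0] for x in samples) / N
    m1 = sum(x[1] for x in samples) / N
    s00 = sum((x[0] - m0) ** 2 for x in samples) / N
    s11 = sum((x[1] - m1) ** 2 for x in samples) / N
    s01 = sum((x[0] - m0) * (x[1] - m1) for x in samples) / N
    det = s00 * s11 - s01 ** 2
    zetas = []
    for x in samples:
        u, v = x[0] - m0, x[1] - m1
        # Quadratic form with the explicit inverse of the 2x2 covariance.
        d2 = (s11 * u * u - 2.0 * s01 * u * v + s00 * v * v) / det
        zetas.append(d2 / (N - 1))
    return zetas


def var_zeta_theory(n, N):
    """Return (exact value of (3.76), its approximation for N >> n)."""
    approx = 2.0 * n / (N - 1) ** 2
    exact = approx * (1.0 - (n + 1) / N) / (1.0 + 1.0 / N)
    return exact, approx


def ks_statistic_n2(zetas):
    """Kolmogorov-Smirnov distance to the assumed beta CDF for n = 2."""
    N = len(zetas)
    b = (N - 3) / 2.0
    D = 0.0
    for i, z in enumerate(sorted(zetas), start=1):
        F = 1.0 - max(0.0, 1.0 - z) ** b  # closed-form Beta(1, b) CDF
        D = max(D, i / N - F, F - (i - 1) / N)
    return D


# Illustrative data: a correlated bivariate normal sample.
rng = random.Random(0)
data = []
for _ in range(500):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    data.append((g1, 0.5 * g1 + g2))

zetas = zeta_values(data)
N = len(zetas)
mean_z = sum(zetas) / N
sample_var = sum((z - mean_z) ** 2 for z in zetas) / (N - 1)
exact, approx = var_zeta_theory(2, N)
print("level 1: sample variance", sample_var, "vs (3.76)", exact, "approx", approx)
print("level 2: D =", ks_statistic_n2(zetas),
      "rough 5% threshold", 1.36 / math.sqrt(N))
```

The threshold 1.36/sqrt(N) is the usual large-sample 5% value for independent observations. Because the ζ values share the same estimated mean and covariance, it should be read only as a rough guide here.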


                           Variable transformation: When  variables are causal (i.e.  positive), the
                      distribution of  each variable may be approximated by a gamma density.  In this
                      case,  it  is  advantageous to  convert  the  distribution  to  a  normal-like  one  by
                      applying a transformation such as


                      $$y = x^{\nu} \quad (0 < \nu < 1), \tag{3.77}$$

                      which is called the power transformation.  A normal-like distribution is achieved by
                      making $\gamma$ of (3.54), $E\{(y-\bar{y})^4\} - E^2\{(y-\bar{y})^2\}$, close to 2 under the condition
                      that $E\{(y-\bar{y})^2\} = 1$, where $\bar{y} = E\{y\}$.
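
The criterion can be tried numerically. The sketch below draws gamma-distributed samples, applies the power transformation of (3.77) for several exponents, and estimates the criterion from sample moments after scaling to unit variance. The gamma shape 2.0 and scale 1.0, the candidate exponents, and the function names `gamma_criterion` and `best_exponent` are illustrative assumptions. It uses sample estimates rather than analytic moments.

```python
import random


def gamma_criterion(ys):
    """Estimate E{(y-ybar)^4} - E^2{(y-ybar)^2} under unit variance.

    Dividing the fourth central moment by var^2 is equivalent to scaling
    y so that E{(y-ybar)^2} = 1; the second term is then exactly 1.
    """
    N = len(ys)
    mean = sum(ys) / N
    var = sum((y - mean) ** 2 for y in ys) / N
    m4 = sum((y - mean) ** 4 for y in ys) / N
    return m4 / var ** 2 - 1.0


def best_exponent(xs, candidates):
    """Return the candidate exponent whose criterion is closest to 2."""
    return min(
        candidates,
        key=lambda nu: abs(gamma_criterion([x ** nu for x in xs]) - 2.0),
    )


# Illustrative data: gamma samples with shape 2.0 and scale 1.0 (assumed).
rng = random.Random(0)
xs = [rng.gammavariate(2.0, 1.0) for _ in range(20000)]

candidates = [0.1 * k for k in range(1, 11)]  # 0.1, 0.2, ..., 1.0
for nu in candidates:
    print(round(nu, 1), gamma_criterion([x ** nu for x in xs]))
print("closest to 2 at exponent", round(best_exponent(xs, candidates), 1))
```

An exponent of 1 leaves the data unchanged, so its criterion value shows how far the original gamma samples are from the normal value of 2.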
                           Assuming a gamma density function of  (2.54) for x,  let us compute the
                      moments of y as







                                                                                  (3.78)


                      Therefore,