Page 94 - Introduction to Statistical Pattern Recognition
(3.75)
$$\mathrm{Var}\{\zeta\} = \frac{2n}{(N-1)^2}\,\frac{1-(n+1)/N}{1+1/N} \cong \frac{2n}{(N-1)^2} . \qquad (3.76)$$
Because $\zeta$ of (3.71) is $1/(N-1)$ times the distance, (3.75) and the right-most
term of (3.76) correspond to (3.60) and (3.61) respectively.
Thus, the test of normality may be performed in the following two levels.
(1) Compute the sample variance of $\zeta$ of (3.71), and check whether it is
close to (3.76) or not. When $N \gg n$, $2n/(N-1)^2$ may be used to approximate
(3.76).
(2) Plot the empirical distribution function of $\zeta$ by using $\zeta(X_1), \ldots, \zeta(X_N)$
and the theoretical distribution function from (3.73), and apply the
Kolmogorov-Smirnov test [10].
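The two levels above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the text's own procedure: the data are two-dimensional ($n = 2$), $\zeta_i$ is taken as $1/(N-1)$ times the squared Mahalanobis distance computed with the sample mean and the divisor-$N$ sample covariance, and the theoretical distribution of (3.73) is assumed to be a beta distribution with parameters $n/2$ and $(N-n-1)/2$, whose variance agrees with (3.76). For $n = 2$ that distribution function has the closed form $1-(1-\zeta)^{(N-3)/2}$. The function names are hypothetical.

```python
import math
import random


def zeta_variance(n, N):
    """Return (exact, approx): the full expression (3.76) and 2n/(N-1)^2."""
    approx = 2.0 * n / (N - 1) ** 2
    exact = approx * (1.0 - (n + 1) / N) / (1.0 + 1.0 / N)
    return exact, approx


def zeta_values_2d(points):
    """zeta_i = d_i^2 / (N-1) for 2-D samples.

    Assumption: d_i^2 is the squared Mahalanobis distance computed with the
    sample mean and the divisor-N sample covariance.
    """
    N = len(points)
    mx = sum(x for x, _ in points) / N
    my = sum(y for _, y in points) / N
    sxx = sum((x - mx) ** 2 for x, _ in points) / N
    syy = sum((y - my) ** 2 for _, y in points) / N
    sxy = sum((x - mx) * (y - my) for x, y in points) / N
    det = sxx * syy - sxy ** 2
    zetas = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the closed-form inverse of the 2x2 covariance.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        zetas.append(d2 / (N - 1))
    return zetas


def ks_statistic(values, cdf):
    """Largest gap between the empirical and theoretical distribution functions."""
    xs = sorted(values)
    N = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / N - f, f - i / N)
    return d


random.seed(0)
n, N = 2, 2000
points = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(N)]
zetas = zeta_values_2d(points)

# Level 1: compare the sample variance of zeta with (3.76).
mean_zeta = sum(zetas) / N
sample_var = sum((z - mean_zeta) ** 2 for z in zetas) / (N - 1)
exact_var, approx_var = zeta_variance(n, N)
print("sample variance:", sample_var)
print("(3.76) exact:", exact_var, "approximation:", approx_var)


# Level 2: Kolmogorov-Smirnov statistic against the assumed beta distribution.
def beta_cdf_n2(z):
    """Distribution function of Beta(1, (N-3)/2), valid for n = 2 only."""
    return 1.0 - (1.0 - z) ** ((N - 3) / 2.0)


ks = ks_statistic(zetas, beta_cdf_n2)
critical_5pct = 1.36 / math.sqrt(N)  # asymptotic 5% critical value
print("KS statistic:", ks, "5% critical value:", critical_5pct)
```

The critical value $1.36/\sqrt{N}$ is the usual large-sample figure for independent observations; the $\zeta_i$ share the same estimated mean and covariance, so it should be read as a rough guide here.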
Variable transformation: When variables are causal (i.e. positive), the
distribution of each variable may be approximated by a gamma density. In this
case, it is advantageous to convert the distribution to a normal-like one by
applying a transformation such as
$$y = x^{\nu} \quad (0 < \nu < 1), \qquad (3.77)$$
which is called the power transformation. The normal-like shape is achieved by
making $\gamma$ of (3.54), $E\{(y-\bar{y})^4\} - E^2\{(y-\bar{y})^2\}$, close to 2 under the condition
that $E\{(y-\bar{y})^2\} = 1$, where $\bar{y} = E\{y\}$.
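As a rough numerical illustration of this criterion, the sketch below searches a grid of exponents for the one that brings the measure closest to 2. It assumes a gamma density with shape parameter `shape` and unit scale, a parameterization that may differ from that of (2.54); for this form the moments of the power-transformed variable are $E\{y^m\} = \Gamma(\text{shape} + \nu m)/\Gamma(\text{shape})$. The scale does not matter because the measure is evaluated after normalizing $y$ to unit variance. The function names are hypothetical.

```python
import math


def power_moment(m, nu, shape):
    """E{y^m} for y = x**nu with x gamma-distributed (given shape, unit scale)."""
    return math.exp(math.lgamma(shape + nu * m) - math.lgamma(shape))


def normality_measure(nu, shape):
    """E{(y-ybar)^4} - E^2{(y-ybar)^2} after scaling y to unit variance."""
    m1, m2, m3, m4 = (power_moment(m, nu, shape) for m in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    # Fourth central moment expressed through the raw moments.
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    # With unit variance the second term of the measure equals 1.
    return mu4 / var ** 2 - 1.0


def best_exponent(shape, steps=96):
    """Grid search over nu = 0.05, 0.06, ..., 1.00 for the measure closest to 2."""
    grid = [0.05 + 0.01 * i for i in range(steps)]
    return min(grid, key=lambda nu: abs(normality_measure(nu, shape) - 2.0))


shape = 2.0
nu = best_exponent(shape)
print("selected exponent:", nu)
print("measure at selected exponent:", normality_measure(nu, shape))
print("measure without transformation:", normality_measure(1.0, shape))
```

A grid search is used only for simplicity; any one-dimensional root finder applied to `normality_measure(nu, shape) - 2` would serve the same purpose.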
Assuming a gamma density function of (2.54) for x, let us compute the
moments of y as
(3.78)
Therefore,