Page 283 - Introduction to Statistical Pattern Recognition
6 Nonparametric Density Estimation 265
various functions. Proofs are not given but can be easily obtained by the
reader.
(1) $p_z(Z) = |\Phi|^{-1} p_x(X)$ [Jacobian], (6.45)
(2) $\nabla^2 p_z(Z) = |\Phi|^{-1} \Phi^{-1} \nabla^2 p_x(X) (\Phi^T)^{-1}$ [from (6.10), (6.42), and (6.45)], (6.46)
(3) $r(Z) = r(X)$ [from (6.44)], (6.47)
(4) $v(Z) = |\Phi| v(X)$ [from (6.25), (6.43), and (6.47)], (6.48)
(5) $\mathrm{MSE}\{\hat{p}_z(Z)\} = |\Phi|^{-2} \mathrm{MSE}\{\hat{p}_x(X)\}$ [from (6.32) and (6.45)], (6.49)
(6) $\mathrm{IMSE}_z = |\Phi|^{-1} \mathrm{IMSE}_x$ [from (6.33) and (6.42)]. (6.50)
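As an illustration of how these proofs go, property (6) can be sketched from property (5). The sketch assumes that the IMSE of (6.33) is the integral of the MSE over the whole space and that $Z = \Phi^T X$ as in (6.42), so that $dZ = |\Phi| dX$ with $|\Phi|$ taken as the magnitude of the determinant:

```latex
\begin{align*}
\mathrm{IMSE}_z
  &= \int \mathrm{MSE}\{\hat{p}_z(Z)\}\, dZ \\
  &= \int |\Phi|^{-2}\, \mathrm{MSE}\{\hat{p}_x(X)\}\, |\Phi|\, dX \\
  &= |\Phi|^{-1} \int \mathrm{MSE}\{\hat{p}_x(X)\}\, dX
   = |\Phi|^{-1}\, \mathrm{IMSE}_x .
\end{align*}
```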
Note that both MSE and IMSE depend on $\Phi$. The mean-square error is a
coordinate-dependent criterion.
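Property (1) can be checked numerically. The following is a minimal sketch, assuming a zero-mean normal density for $p_x(X)$ and the transformation $Z = \Phi^T X$; the helper `gaussian_pdf` and the particular matrices are hypothetical choices made only for illustration, and $|\Phi|$ is taken as the absolute value of the determinant.

```python
import numpy as np

def gaussian_pdf(x, cov):
    """Zero-mean normal density evaluated at the vector x (illustrative helper)."""
    n = x.shape[0]
    quad = x @ np.linalg.solve(cov, x)            # x^T cov^{-1} x
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

# Covariance of X (symmetric positive definite by construction).
rng = np.random.default_rng(0)
n = 3
G = rng.standard_normal((n, n))
Sigma = G @ G.T + n * np.eye(n)

# A fixed nonsingular Phi (upper triangular, determinant 2 * 1.5 * 0.5 = 1.5).
Phi = np.array([[2.0, 1.0, 0.0],
                [0.0, 1.5, 0.5],
                [0.0, 0.0, 0.5]])
abs_det = abs(np.linalg.det(Phi))

# A test point X and its image Z = Phi^T X.
x = rng.standard_normal(n)
z = Phi.T @ x

p_x = gaussian_pdf(x, Sigma)
p_z = gaussian_pdf(z, Phi.T @ Sigma @ Phi)        # Z has covariance Phi^T Sigma Phi

# Property (1): p_z(Z) = |Phi|^{-1} p_x(X)
print(np.isclose(p_z, p_x / abs_det))
```

Because the density value itself is rescaled by $|\Phi|^{-1}$, any squared-error criterion built on it changes with the choice of coordinates.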
Minimization of IMSE: We will now use the above results to optimize
the integral mean-square error criterion with respect to the matrix A. However,
it is impossible to discuss the optimization for a general p(X). We need to
limit the functional form of p(X). Here, we choose the following form for
$p(X)$:
$p(X) = |B|^{-1/2} \chi((X-M)^T B^{-1} (X-M))$, (6.51)
where $\chi(\cdot)$ does not involve B or M. The p(X) of (6.51) covers a large family
of density functions including the ones in (6.3). The expected vector, M, can
be assumed to be zero, since all results should be independent of a mean shift.
Now, we still have the freedom to choose the matrix A in some optimum
manner. We will manipulate the two matrices B and A to simultaneously diag-
onalize each, thus making the analysis easier. That is,
$\Phi^T B \Phi = I$ and $\Phi^T A \Phi = \Lambda$ (6.52)
and
$p_z(Z) = \chi(Z^T Z)$, (6.53)
where $\Lambda$ is a diagonal matrix with components $\lambda_1, \ldots, \lambda_n$.
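The simultaneous diagonalization of (6.52) can be sketched numerically. This is one standard construction, not necessarily the one the text has in mind: first whiten B, then rotate with the eigenvectors of the whitened A. It assumes B is symmetric positive definite and A is symmetric; the function name `simultaneous_diagonalize` and the test matrices are hypothetical.

```python
import numpy as np

def simultaneous_diagonalize(B, A):
    """Return (Phi, lam) with Phi^T B Phi = I and Phi^T A Phi = diag(lam).

    Assumes B is symmetric positive definite and A is symmetric.
    """
    # Step 1: whiten B. With B = U diag(d) U^T, W = U diag(d)^{-1/2}
    # satisfies W^T B W = I.
    d, U = np.linalg.eigh(B)
    W = U / np.sqrt(d)                 # divides column j of U by sqrt(d[j])

    # Step 2: diagonalize the whitened A with an orthonormal rotation V.
    K = W.T @ A @ W
    lam, V = np.linalg.eigh(K)

    # V is orthonormal, so the rotation leaves W^T B W = I unchanged.
    Phi = W @ V
    return Phi, lam

# Illustrative symmetric positive definite matrices.
rng = np.random.default_rng(0)
n = 3
G = rng.standard_normal((n, n))
B = G @ G.T + n * np.eye(n)
H = rng.standard_normal((n, n))
A = H @ H.T + n * np.eye(n)

Phi, lam = simultaneous_diagonalize(B, A)
print(np.allclose(Phi.T @ B @ Phi, np.eye(n)))
print(np.allclose(Phi.T @ A @ Phi, np.diag(lam)))
```

The entries of `lam` play the role of $\lambda_1, \ldots, \lambda_n$ in (6.52).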