Page 283 - Introduction to Statistical Pattern Recognition


6 Nonparametric Density Estimation 265

various functions. Proofs are not given but can be easily obtained by the
reader.

(1) $p_z(Z) = |\Phi|^{-1} p_x(X)$ [Jacobian] , (6.45)

(2) $\nabla^2 p_z(Z) = |\Phi|^{-1} \Phi^{-1} \nabla^2 p_x(X) (\Phi^T)^{-1}$ [from (6.10), (6.42), and (6.45)] , (6.46)

(3) $r(Z) = r(X)$ [from (6.44)] , (6.47)

(4) $v(Z) = |\Phi| v(X)$ [from (6.25), (6.43), and (6.47)] , (6.48)

(5) $\mathrm{MSE}\{\hat{p}_z(Z)\} = |\Phi|^{-2} \mathrm{MSE}\{\hat{p}_x(X)\}$ [from (6.32) and (6.45)] , (6.49)

(6) $\mathrm{IMSE}_z = |\Phi|^{-1} \mathrm{IMSE}_x$ [from (6.33) and (6.42)] . (6.50)

Note that both MSE and IMSE depend on $\Phi$. The mean-square error is a coordinate-dependent criterion.
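Properties (1) and (2) can be checked numerically. The sketch below is only an illustration: it assumes the transformation is $Z = \Phi^T X$ (the definition in (6.42), not reproduced on this page), reads $|\Phi|$ as the absolute value of the determinant, and takes a zero-mean Gaussian for $p_x(X)$, whose matrix of second derivatives is known in closed form. All function and variable names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, cov):
    """Zero-mean Gaussian density evaluated at the vector x."""
    n = x.size
    quad = x @ np.linalg.solve(cov, x)
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def gaussian_hessian(x, cov):
    """Matrix of second derivatives of the zero-mean Gaussian density at x."""
    g = np.linalg.solve(cov, x)  # cov^{-1} x
    return gaussian_pdf(x, cov) * (np.outer(g, g) - np.linalg.inv(cov))

rng = np.random.default_rng(0)
n = 3
G = rng.normal(size=(n, n))
cov_x = G @ G.T + n * np.eye(n)                  # arbitrary positive definite covariance
Phi = rng.normal(size=(n, n)) + 2.0 * np.eye(n)  # arbitrary matrix, assumed nonsingular
x = rng.normal(size=n)

# Assumed transformation: Z = Phi^T X, so the covariance becomes Phi^T cov_x Phi.
z = Phi.T @ x
cov_z = Phi.T @ cov_x @ Phi
abs_det = abs(np.linalg.det(Phi))

# Property (1): p_z(Z) = |Phi|^{-1} p_x(X)
lhs_45 = gaussian_pdf(z, cov_z)
rhs_45 = gaussian_pdf(x, cov_x) / abs_det

# Property (2): Hessian of p_z equals |Phi|^{-1} Phi^{-1} (Hessian of p_x) Phi^{-T}
Phi_inv = np.linalg.inv(Phi)
lhs_46 = gaussian_hessian(z, cov_z)
rhs_46 = Phi_inv @ gaussian_hessian(x, cov_x) @ Phi_inv.T / abs_det

print(np.isclose(lhs_45, rhs_45), np.allclose(lhs_46, rhs_46))
```

The same factor $|\Phi|^{-1}$ multiplies the density and its second derivatives, which is why the error criteria built from them scale with $\Phi$.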

Minimization of IMSE: We will now use the above results to optimize
the integral mean-square error criterion with respect to the matrix A. However,
it is impossible to discuss the optimization for a general p(X). We need to
limit the functional form of p(X). Here, we choose the following form for
p(X):

$p(X) = |B|^{-1/2} x((X-M)^T B^{-1} (X-M))$ , (6.51)
where x(.) does not involve B or M. The p(X) of (6.51) covers a large family
of density functions including the ones in (6.3). The expected vector, M, can
be assumed to be zero, since all results should be independent of a mean shift.
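As a concrete instance of the form (6.51), the sketch below evaluates the density with a Gaussian profile, $x(u) = (2\pi)^{-n/2} e^{-u/2}$, chosen here purely for illustration. It compares the result with the usual Gaussian formula and checks that shifting $X$ and $M$ by the same vector leaves the value unchanged. The function names are hypothetical.

```python
import numpy as np

def gaussian_profile(u, n):
    """Profile x(u) of the n-dimensional Gaussian; it involves neither B nor M."""
    return (2.0 * np.pi) ** (-n / 2.0) * np.exp(-0.5 * u)

def elliptical_density(X, M, B, profile):
    """Evaluate p(X) = |B|^{-1/2} x((X-M)^T B^{-1} (X-M)), the form of (6.51)."""
    d = X - M
    u = d @ np.linalg.solve(B, d)  # quadratic form (X-M)^T B^{-1} (X-M)
    return np.linalg.det(B) ** -0.5 * profile(u, X.size)

rng = np.random.default_rng(1)
n = 2
G = rng.normal(size=(n, n))
B = G @ G.T + n * np.eye(n)        # arbitrary positive definite matrix
M = rng.normal(size=n)
X = rng.normal(size=n)

p_family = elliptical_density(X, M, B, gaussian_profile)

# Direct Gaussian formula with mean M and covariance B, for comparison.
d = X - M
p_direct = np.exp(-0.5 * d @ np.linalg.inv(B) @ d) / np.sqrt(
    (2.0 * np.pi) ** n * np.linalg.det(B)
)

# Shifting X and M by the same vector c does not change the density value.
c = np.array([5.0, -3.0])
p_shifted = elliptical_density(X + c, M + c, B, gaussian_profile)

print(np.isclose(p_family, p_direct), np.isclose(p_family, p_shifted))
```

Other members of the family are obtained by passing a different profile function, with B and M entering only through the quadratic form and the factor $|B|^{-1/2}$.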
Now, we still have the freedom to choose the matrix A in some optimum
manner. We will manipulate the two matrices B and A to simultaneously diagonalize each, thus making the analysis easier. That is,
$\Phi^T B \Phi = I$ and $\Phi^T A \Phi = \Lambda$ (6.52)
and

$p(Z) = x(Z^T Z)$ , (6.53)

where $\Lambda$ is a diagonal matrix with components $\lambda_1, \ldots, \lambda_n$,
