Page 249 - Introduction to Statistical Pattern Recognition


5 Parameter Estimation 231

Example 1: Let $f$ be

$$f(X, M, \Sigma) = \frac{1}{2}(X-M)^T \Sigma^{-1}(X-M) + \frac{1}{2}\ln|\Sigma| . \tag{5.143}$$
Then,

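$$\frac{\partial f}{\partial M} = -\Sigma^{-1}(X-M) , \tag{5.144}$$

$$\frac{\partial f}{\partial \Sigma} = \frac{1}{2}\left[\Sigma^{-1} - \Sigma^{-1}(X-M)(X-M)^T\Sigma^{-1}\right] \quad \text{[from (A.41)-(A.46)]} . \tag{5.145}$$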
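The derivatives (5.144) and (5.145) can be checked numerically against central finite differences of (5.143). The following is a minimal sketch, not the book's procedure: the function names, the test point, and the step size are illustrative assumptions, and every element of $\Sigma$ is perturbed independently (the symmetry constraint is ignored), which is the convention under which (5.145) is compared here.

```python
import numpy as np

def f(x, m, sigma):
    """f(X, M, Sigma) of (5.143): half the quadratic form plus half the log-determinant."""
    diff = x - m
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * diff @ np.linalg.solve(sigma, diff) + 0.5 * logdet

def grad_m(x, m, sigma):
    """Analytic derivative with respect to M, as in (5.144)."""
    return -np.linalg.solve(sigma, x - m)

def grad_sigma(x, m, sigma):
    """Analytic derivative with respect to Sigma, as in (5.145)."""
    inv = np.linalg.inv(sigma)
    diff = x - m
    return 0.5 * (inv - inv @ np.outer(diff, diff) @ inv)

def numeric_grad_m(x, m, sigma, eps=1e-6):
    """Central differences over the elements of M."""
    g = np.zeros_like(m)
    for i in range(m.size):
        step = np.zeros_like(m)
        step[i] = eps
        g[i] = (f(x, m + step, sigma) - f(x, m - step, sigma)) / (2 * eps)
    return g

def numeric_grad_sigma(x, m, sigma, eps=1e-6):
    """Central differences over the elements of Sigma, each treated as independent."""
    g = np.zeros_like(sigma)
    for i in range(sigma.shape[0]):
        for j in range(sigma.shape[1]):
            step = np.zeros_like(sigma)
            step[i, j] = eps
            g[i, j] = (f(x, m, sigma + step) - f(x, m, sigma - step)) / (2 * eps)
    return g

# Hypothetical test point; sigma is symmetric and diagonally dominant, hence positive definite.
x = np.array([1.0, -0.5, 2.0])
m = np.array([0.2, 0.1, -0.3])
sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.5, -0.2],
                  [0.1, -0.2, 1.0]])

print(np.max(np.abs(grad_m(x, m, sigma) - numeric_grad_m(x, m, sigma))))
print(np.max(np.abs(grad_sigma(x, m, sigma) - numeric_grad_sigma(x, m, sigma))))
```

Both printed discrepancies should be small relative to the gradient entries if the analytic forms are right. A sign error in either (5.144) or (5.145) would show up as a discrepancy of the same order as the gradient itself.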

If a sample $Y$ is excluded, $h$ of (5.142) becomes

$$h(X,Y) = \frac{1}{2N}\Big[\{(X-M)^T\Sigma^{-1}(Y-M)\}^2 + n + 2(X-M)^T\Sigma^{-1}(Y-M) - (X-M)^T\Sigma^{-1}(X-M) - (Y-M)^T\Sigma^{-1}(Y-M)\Big] . \tag{5.146}$$
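For concreteness, (5.146) can be evaluated directly. The sketch below is only an illustration under stated assumptions: the function name `h_excluded`, the argument names, and the sample values are hypothetical, $n$ is read as the dimension of $X$, and $N$ is read as the number of design samples implied by the factor $1/(2N)$.

```python
import numpy as np

def h_excluded(x, y, m, sigma, n_samples):
    """Evaluate h(X, Y) of (5.146) for mean m, covariance sigma, and N = n_samples."""
    dx = x - m
    dy = y - m
    sx = np.linalg.solve(sigma, dx)      # Sigma^{-1} (X - M)
    sy = np.linalg.solve(sigma, dy)      # Sigma^{-1} (Y - M)
    cross = dx @ sy                      # (X - M)^T Sigma^{-1} (Y - M)
    d2_x = dx @ sx                       # (X - M)^T Sigma^{-1} (X - M)
    d2_y = dy @ sy                       # (Y - M)^T Sigma^{-1} (Y - M)
    n = m.size                           # dimension of X (assumed meaning of n)
    return (cross**2 + n + 2.0 * cross - d2_x - d2_y) / (2.0 * n_samples)

# Hypothetical values chosen only to exercise the formula.
m = np.array([0.0, 1.0])
sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.5, 0.0])
y = np.array([-1.0, 2.0])
N = 100

print(h_excluded(x, y, m, sigma, N))
```

Two quick sanity checks follow from the formula itself: at $X = M$ the cross term and the $X$ quadratic form vanish, leaving $[n - (Y-M)^T\Sigma^{-1}(Y-M)]/(2N)$, and $h(X,Y)$ is symmetric in $X$ and $Y$.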

Example 2: If $f$ is evaluated at $X = Y$, $h$ of (5.146) becomes

$$h(Y,Y) = \frac{1}{2N}\left[d^4(Y) + n\right] , \tag{5.147}$$
where $d^2(Y) = (Y-M)^T\Sigma^{-1}(Y-M)$. Equation (5.147) is the same as (5.135)
except that the true parameters $M$ and $\Sigma$ are used this time instead of the
estimates $\hat{M}_i$ and $\hat{\Sigma}_i$ used for (5.135).
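The step from (5.146) to (5.147) can be written out term by term. With $X = Y$, every bilinear form in (5.146) reduces to $d^2(Y)$, and the linear terms cancel:

```latex
\begin{align*}
h(Y,Y) &= \frac{1}{2N}\Big[\{(Y-M)^T\Sigma^{-1}(Y-M)\}^2 + n
          + 2(Y-M)^T\Sigma^{-1}(Y-M) \\
       &\qquad\quad - (Y-M)^T\Sigma^{-1}(Y-M) - (Y-M)^T\Sigma^{-1}(Y-M)\Big] \\
       &= \frac{1}{2N}\left[d^4(Y) + n + 2d^2(Y) - 2d^2(Y)\right] \\
       &= \frac{1}{2N}\left[d^4(Y) + n\right] .
\end{align*}
```

Since $d^4(Y) \ge 0$ and $n > 0$, $h(Y,Y)$ is always positive in this form.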

Resubstitution Error for the Quadratic Classifier

Error expression: When the L method is used, design and test samples
are independent. Therefore, when the expectation is taken on the classification
error of (5.98), we can isolate two expectations: one with respect to design
samples and the other with respect to test samples. In (5.101), the randomness
of $h$ comes from design samples, and $X$ is the test sample, independent of the
design samples. Therefore, $E_d\{e^{j\omega h(X)}\}$ can be computed for a given $X$. The
expectation with respect to test samples is obtained by computing $\int [\cdot]\, p_i(X)\, dX$.
On the other hand, when the R method is used, design and test samples are no
longer independent. Thus, we cannot isolate two expectations. Since $X$ is a
