Page 200 - Introduction to Statistical Pattern Recognition


the classification error, and so on. Therefore, we need to know how the out-
puts of these functions are affected by the random variations of parameters.
More specifically, we are interested in the biases and variances of these func-
tions. They depend on the functional form as well as the number of samples
used to estimate the parameters. We will discuss this subject in this chapter.
First, the problem will be addressed in a general form, and then the Bhatta-
charyya distance will be studied.

A more important quantity in pattern recognition is the probability of
error, which is expressed as a complicated function of two sets of parameters:
one is the set of parameters which specify a classifier, and the other is the set
of parameters which specify the distributions to be tested. Because these two
sets are involved, the estimation of the error is complex and difficult to discuss.
In this chapter, we will show how the estimated error is affected by the design
and test samples. Also, the discussion is extended to include several error esti-
mation techniques such as the holdout, leave-one-out, and resubstitution
methods as well as the bootstrap method.

5.1. Effect of Sample Size in Estimation

General Formulation

Expected value and variance: Let us consider the problem of estimating
$f(y_1, \ldots, y_q)$ by $f(\hat{y}_1, \ldots, \hat{y}_q)$, where $f$ is a
given function, the $y_i$'s are the true parameter values, and the
$\hat{y}_i$'s are their estimates. In this section, we will derive expressions
for the expected value and variance of $f(\hat{y}_1, \ldots, \hat{y}_q)$, and
discuss a method to estimate $f(y_1, \ldots, y_q)$.
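As a rough illustration of why the expected value and variance of a function of estimated parameters matter, the sketch below simulates a single-parameter case. The choices here are assumptions made only for this example and do not come from the text: the function is $f(y) = y^2$, the true parameter $y$ is the mean of a Gaussian, and $\hat{y}$ is the sample mean of `n_samples` draws. For this case $E\{\hat{y}^2\} = y^2 + \sigma^2/N$, so the bias should shrink roughly as $1/N$.

```python
import random


def f(y):
    """Illustrative function of the parameter (an assumption, not from the text)."""
    return y * y


def simulate(n_samples, n_trials=20000, mu=1.0, sigma=1.0, seed=0):
    """Monte Carlo estimate of the bias and variance of f(y_hat).

    y_hat is the sample mean of n_samples Gaussian draws with true mean mu.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_trials):
        y_hat = sum(rng.gauss(mu, sigma) for _ in range(n_samples)) / n_samples
        values.append(f(y_hat))
    mean_value = sum(values) / len(values)
    bias = mean_value - f(mu)  # E{f(y_hat)} - f(y)
    variance = sum((v - mean_value) ** 2 for v in values) / len(values)
    return bias, variance


if __name__ == "__main__":
    for n in (10, 40):
        bias, variance = simulate(n)
        # Theory for this toy case: bias = sigma^2 / n
        print(f"N={n:3d}  bias={bias:.4f}  (theory {1.0 / n:.4f})  variance={variance:.4f}")
```

Both the bias and the variance of $f(\hat{y})$ depend on the functional form and on the number of samples, which is the dependence this section sets out to quantify.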
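Assuming that the deviation of $\hat{y}_i$ from $y_i$ is small, $f(\hat{Y})$
can be expanded by a Taylor series up to the second order terms as

$$f(\hat{Y}) \cong f(Y) + \sum_{i=1}^{q} \frac{\partial f}{\partial y_i}\,\Delta y_i + \frac{1}{2} \sum_{i=1}^{q} \sum_{j=1}^{q} \frac{\partial^2 f}{\partial y_i\,\partial y_j}\,\Delta y_i\,\Delta y_j$$

where $Y = [y_1 \ldots y_q]^T$, $\hat{Y} = [\hat{y}_1 \ldots \hat{y}_q]^T$, and
$\Delta Y = \hat{Y} - Y$.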
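The quality of a second-order Taylor approximation around the true parameter vector can be checked numerically. The following is a minimal sketch under stated assumptions: the test function $f(y_1, y_2) = y_1 e^{y_2}$, its hand-derived gradient and Hessian, and the helper names (`taylor2`, `taylor_error`) are all hypothetical choices made for illustration and are not taken from the text.

```python
import math


def f(y):
    """Hypothetical test function f(y1, y2) = y1 * exp(y2)."""
    return y[0] * math.exp(y[1])


def grad_f(y):
    """First derivatives of f with respect to y1 and y2."""
    e = math.exp(y[1])
    return [e, y[0] * e]


def hess_f(y):
    """Matrix of second derivatives of f."""
    e = math.exp(y[1])
    return [[0.0, e], [e, y[0] * e]]


def taylor2(y, dy):
    """Second-order Taylor approximation of f(y + dy) around y."""
    g = grad_f(y)
    h = hess_f(y)
    q = len(y)
    first = sum(g[i] * dy[i] for i in range(q))
    second = 0.5 * sum(h[i][j] * dy[i] * dy[j] for i in range(q) for j in range(q))
    return f(y) + first + second


def taylor_error(scale, y=(2.0, 0.5)):
    """Absolute gap between f(y_hat) and its second-order approximation."""
    dy = [scale, -scale]  # deviation y_hat - y
    y_hat = [y[i] + dy[i] for i in range(len(y))]
    return abs(f(y_hat) - taylor2(list(y), dy))


if __name__ == "__main__":
    for scale in (0.1, 0.01):
        print(f"|dy| scale = {scale}: approximation error = {taylor_error(scale):.3e}")
```

Because the neglected terms are of third order in the deviation, shrinking the deviation by a factor of 10 should reduce the approximation error by roughly a factor of 1000, which is why the expansion is useful when the estimates are close to the true values.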
