Page 159 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
probabilities, we need to specify one extra parameter, $N_S$, which is the number of samples.
Intuitively, the following estimator is appropriate:

$$\hat{P}(\omega_k) = \frac{N_k}{N_S} \qquad (5.18)$$
The expectation of $N_k$ equals $N_S P(\omega_k)$. Therefore, $\hat{P}(\omega_k)$ is an unbiased estimate of $P(\omega_k)$. The variance of a multinomially distributed variable is $N_S P(\omega_k)(1 - P(\omega_k))$. Consequently, the variance of the estimate is:

$$\mathrm{Var}[\hat{P}(\omega_k)] = \frac{P(\omega_k)(1 - P(\omega_k))}{N_S} \qquad (5.19)$$
This shows that the estimator is consistent. That is, if $N_S \to \infty$, then $\mathrm{Var}[\hat{P}(\omega_k)] \to 0$. The required number of samples follows from the constraint that $\sqrt{\mathrm{Var}[\hat{P}(\omega_k)]} \ll P(\omega_k)$. For instance, if for some class we anticipate that $P(\omega_k) = 0.01$, and the permitted relative error is 20%, i.e. $\sqrt{\mathrm{Var}[\hat{P}(\omega_k)]} = 0.2\,P(\omega_k)$, then $N_S$ must be about 2500 in order to obtain the required precision.
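The sample-size calculation above can be sketched as follows. The snippet is written in Python rather than the book's MATLAB; the function name `required_samples` is illustrative, not from the text. Solving $\mathrm{Var}[\hat{P}(\omega_k)] = (0.2\,P(\omega_k))^2$ with (5.19) gives $N_S = (1 - P)/(0.2^2 P)$, and a simulation of estimator (5.18) confirms the resulting relative error:

```python
import numpy as np

def required_samples(P, rel_err):
    # Solve Var[P_hat] = (rel_err * P)^2 with Var = P(1-P)/N_S  (eq. 5.19)
    return (1.0 - P) / (rel_err**2 * P)

P = 0.01          # anticipated prior probability of the class
rel_err = 0.2     # permitted relative error of 20%
N_S = round(required_samples(P, rel_err))
print(N_S)        # 2475, i.e. about 2500 samples

# Empirical check: repeatedly draw N_S labels and estimate the prior (eq. 5.18)
rng = np.random.default_rng(0)
N_k = rng.binomial(N_S, P, size=100_000)   # class counts over many training sets
P_hat = N_k / N_S
print(P_hat.std() / P)                     # close to the permitted 0.2
```

The simulation draws $N_k$ from a binomial distribution (the per-class marginal of the multinomial) and shows that the standard deviation of $\hat{P}(\omega_k)$ is indeed about 20% of $P(\omega_k)$ at $N_S \approx 2500$.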
5.2.5 Binary measurements
Another example of a multinomial distribution occurs when the measurement vector z can only take a finite number of states. For instance, if the sensory system is such that each element in the measurement vector is binary, i.e. either '1' or '0', then the number of states the vector can take is at most $2^N$. Such a binary vector can be replaced with an equivalent scalar z that only takes integer values from 1 up to $2^N$. The conditional probability density $p(z\,|\,\omega_k)$ turns into a probability function $P(z\,|\,\omega_k)$. Let $N_k(z)$ be the number of samples in the training set with measurement z and class $\omega_k$. $N_k(z)$ has a multinomial distribution.
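The mapping from a binary vector to a scalar state, and the counting of $N_k(z)$, can be sketched as follows. Again this is a Python illustration rather than the book's MATLAB, and the helper names (`binary_to_scalar`, `count_states`) and the particular base-2 encoding are assumptions; any bijection onto $1, \ldots, 2^N$ would serve:

```python
import numpy as np

def binary_to_scalar(b):
    """Map an N-element binary vector to an integer in 1..2^N
    by reading it as a base-2 number (least significant bit first)."""
    b = np.asarray(b)
    return int(b @ (2 ** np.arange(b.size))) + 1   # +1 so states start at 1

def count_states(samples, N):
    """Count N_k(z): occurrences of each state z among the samples
    of one class omega_k."""
    counts = np.zeros(2 ** N, dtype=int)           # one bin per state z
    for b in samples:
        counts[binary_to_scalar(b) - 1] += 1
    return counts

samples = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]        # three N=3 binary vectors
counts = count_states(samples, N=3)
print(binary_to_scalar([0, 0, 1]))  # state 5 = 1 + (0*1 + 0*2 + 1*4)
print(counts[4])                    # N_k(z=5) = 2
```

With the state count in hand, the histogram `counts` is exactly the quantity $N_k(z)$ whose multinomial behaviour the text goes on to analyse.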
At first sight, one would think that estimating $P(z\,|\,\omega_k)$ is the same type of problem as estimating the prior probabilities, as discussed in the previous section: