Page 160 - Classification, Parameter Estimation and State Estimation: An Engineering Approach Using MATLAB


\[
\hat{P}(z \mid \omega_k) = \frac{N_k(z)}{N_k} \tag{5.20}
\]

\[
\operatorname{Var}\!\left[\hat{P}(z \mid \omega_k)\right] = \frac{P(z \mid \omega_k)\,\bigl(1 - P(z \mid \omega_k)\bigr)}{N_k} \tag{5.21}
\]
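The histogram estimator (5.20) and its variance (5.21) can be sketched in a few lines. This is a minimal Python illustration (the book itself works in MATLAB); the dimension, the class-conditional probabilities and the sample size are made-up values for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N binary measurements, so 2**N possible states z.
N = 3                                   # dimension of the binary measurement vector
true_p = rng.dirichlet(np.ones(2**N))   # assumed true P(z | omega_k), for the demo only
N_k = 1000                              # number of training samples of class omega_k

# Draw N_k samples and form the histogram estimate of eq. (5.20).
samples = rng.choice(2**N, size=N_k, p=true_p)
counts = np.bincount(samples, minlength=2**N)   # N_k(z) for each state z
p_hat = counts / N_k                            # P_hat(z | omega_k) = N_k(z) / N_k

# Variance of each estimate according to eq. (5.21).
var_hat = true_p * (1 - true_p) / N_k
```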
For small $N$ and a large training set, this estimator indeed suffices. However, if $N$ is too large, the estimator fails. A small example demonstrates this. Suppose the dimension of the vector is $N = 10$. Then the total number of states is $2^{10} \approx 10^3$. Therefore, some states will have a probability of less than $10^{-3}$. The uncertainty of the estimated probabilities must be a fraction of that, say $10^{-4}$. The number of samples, $N_k$, needed to guarantee such a precision is on the order of $10^5$ or more. Needless to say, in many applications $10^5$ samples is much too expensive. Moreover, with even a slight increase of $N$ the required number of samples becomes much larger.
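The sample-size argument above can be checked numerically; the following short Python sketch just carries out the arithmetic, with the figures taken from the text:

```python
# Worked numbers from the text: an N = 10 dimensional binary vector.
N = 10
n_states = 2**N            # 1024, i.e. about 10**3 states
p = 1e-3                   # probability of one of the rarer states
target_std = 1e-4          # required uncertainty, a fraction of p

# From eq. (5.21): std = sqrt(p * (1 - p) / N_k) <= target_std.
# Solving for N_k gives the required training-set size per class:
N_k_required = p * (1 - p) / target_std**2   # on the order of 1e5
```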
One way to avoid a large variance is to incorporate more prior knowledge. For instance, without the availability of a training set, it is known beforehand that all parameters are bounded by $0 \le P(z \mid \omega_k) \le 1$. If nothing further is known, we could first 'guess' that all states are equally likely: $P(z \mid \omega_k) = 2^{-N}$. Based on this guess, the estimator takes the form:


\[
\hat{P}(z \mid \omega_k) = \frac{N_k(z) + 1}{N_k + 2^N} \tag{5.22}
\]

            The variance of the estimate is:


\[
\operatorname{Var}\!\left[\hat{P}(z \mid \omega_k)\right] = \frac{N_k \, P(z \mid \omega_k)\,\bigl(1 - P(z \mid \omega_k)\bigr)}{\bigl(N_k + 2^N\bigr)^2} \tag{5.23}
\]
Comparing (5.22) and (5.23) with (5.20) and (5.21), the conclusion is that the variance of the estimate is reduced at the cost of a small bias. See also Exercise 4.
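To see this bias/variance trade-off concretely, the two variance formulas can be evaluated side by side. A minimal Python sketch, with assumed values for $N$, $N_k$ and $P(z \mid \omega_k)$:

```python
# Compare the variances of the plain estimator (5.21) and the
# smoothed estimator (5.23); the numbers are illustrative only.
N = 10                     # dimension of the binary vector
N_k = 1000                 # training-set size for class omega_k
p = 1e-3                   # assumed true P(z | omega_k) for some state z

var_plain = p * (1 - p) / N_k                        # eq. (5.21)
var_smooth = N_k * p * (1 - p) / (N_k + 2**N)**2     # eq. (5.23)

# The smoothed estimate carries a small bias: its expectation is
# (N_k * p + 1) / (N_k + 2**N) rather than p itself.
bias = (N_k * p + 1) / (N_k + 2**N) - p
```

With these numbers the smoothed variance is a few times smaller than the plain one, while the bias stays well below the target uncertainty of the earlier example.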




            5.3   NONPARAMETRIC LEARNING

            Nonparametric methods are learning methods for which prior knowledge
            about the functional form of the conditional probability distributions is