Page 160 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
\hat{P}(z \mid \omega_k) = \frac{N_k(z)}{N_k}   (5.20)

\mathrm{Var}[\hat{P}(z \mid \omega_k)] = \frac{P(z \mid \omega_k)\left(1 - P(z \mid \omega_k)\right)}{N_k}   (5.21)
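As a minimal sketch (in Python rather than the book's MATLAB; the state counts are invented for illustration), the histogram estimator (5.20) and its variance (5.21) follow directly from the per-state counts:

```python
import numpy as np

# Hypothetical counts N_k(z) for a few discrete states z of one class
# omega_k (the values are invented for illustration).
counts = np.array([412, 305, 198, 85])
N_k = counts.sum()                 # total number of samples for class k

# Histogram estimator (5.20): relative frequency per state.
P_hat = counts / N_k

# Variance (5.21), using P_hat as a plug-in for the true P(z|omega_k).
var_hat = P_hat * (1 - P_hat) / N_k

print(P_hat)     # estimated P(z|omega_k) per state
print(var_hat)   # variance of each estimate
```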
For small N and a large training set, this estimator indeed suffices.
However, if N is too large, the estimator fails. A small example demonstrates this. Suppose the dimension of the measurement vector is N = 10. Then the total number of states is 2^{10} \approx 10^3. Therefore, some states will have a probability of less than 10^{-3}. The uncertainty of the estimated probabilities must be a fraction of that, say 10^{-4}. The number of samples, N_k, needed to guarantee such a precision is on the order of 10^5 or more. Needless to say, in many applications 10^5 samples is much too expensive. Moreover, with even a slight increase of N, the required number of samples becomes much larger.
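The arithmetic behind this sample-size estimate can be checked in a few lines (a Python sketch using the worked numbers from the text):

```python
# Worked numbers from the text: a binary measurement vector of dimension N = 10.
N = 10
n_states = 2 ** N              # total number of states: 1024, about 10^3
p = 1.0 / n_states             # so some states have probability below ~10^-3
sigma = 1e-4                   # desired standard deviation of the estimate

# From Var = p(1-p)/N_k <= sigma^2, solve for the required sample count N_k.
N_k_required = p * (1 - p) / sigma ** 2

print(n_states)                # 1024
print(N_k_required)            # on the order of 10^5, as stated in the text
```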
One way to avoid a large variance is to incorporate more prior knowledge. For instance, without the availability of a training set, it is known beforehand that all parameters are bounded by 0 \le P(z \mid \omega_k) \le 1. If nothing further is known, we could first 'guess' that all states are equally likely: P(z \mid \omega_k) = 2^{-N}. Based on this guess, the estimator takes the form:

\hat{P}(z \mid \omega_k) = \frac{N_k(z) + 1}{N_k + 2^N}   (5.22)
The variance of the estimate is:
\mathrm{Var}[\hat{P}(z \mid \omega_k)] = \frac{N_k \, P(z \mid \omega_k)\left(1 - P(z \mid \omega_k)\right)}{\left(N_k + 2^N\right)^2}   (5.23)
Comparing (5.22) and (5.23) with (5.20) and (5.21), we conclude that the variance of the estimate is reduced at the cost of a small bias. See also Exercise 4.
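A short simulation (a Python sketch; the true distribution, dimension, and sample size are invented for illustration) makes this bias–variance tradeoff concrete, estimating the variance of both estimators empirically over repeated training sets:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 4                       # small dimension so 2^N = 16 states (illustrative)
n_states = 2 ** N
N_k = 50                    # a deliberately small training set per class

# An invented "true" distribution P(z|omega_k) over the 16 states.
P_true = rng.dirichlet(np.ones(n_states))

# Repeat the experiment many times to measure the variance empirically.
runs = 2000
plain = np.empty((runs, n_states))
smooth = np.empty((runs, n_states))
for r in range(runs):
    counts = rng.multinomial(N_k, P_true)        # N_k(z) for one training set
    plain[r] = counts / N_k                      # estimator (5.20)
    smooth[r] = (counts + 1) / (N_k + n_states)  # estimator (5.22)

# The smoothed estimator trades a small bias for a lower variance.
print(plain.var(axis=0).mean(), smooth.var(axis=0).mean())
```

With these numbers the analytic ratio of (5.23) to (5.21) is N_k^2/(N_k + 2^N)^2, so the smoothed estimator's variance is noticeably smaller, which the simulation confirms.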
5.3 NONPARAMETRIC LEARNING
Nonparametric methods are learning methods for which prior knowledge
about the functional form of the conditional probability distributions is