Page 160 - Classification, Parameter Estimation and State Estimation: An Engineering Approach Using MATLAB


\[
\hat{P}(z \mid \omega_k) = \frac{N_k(z)}{N_k} \tag{5.20}
\]

\[
\operatorname{Var}\!\left[\hat{P}(z \mid \omega_k)\right] = \frac{P(z \mid \omega_k)\,\bigl(1 - P(z \mid \omega_k)\bigr)}{N_k} \tag{5.21}
\]
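The histogram estimator (5.20) and its variance (5.21) can be sketched in a few lines. This is a minimal Python illustration (the book itself works in MATLAB); the dimension, the class-conditional probabilities and the sample size are made-up values for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N binary measurements, so 2**N possible states z.
N = 3                                   # dimension of the binary measurement vector
true_p = rng.dirichlet(np.ones(2**N))   # assumed true P(z | omega_k), for the demo only
N_k = 1000                              # number of training samples of class omega_k

# Draw N_k samples and form the histogram estimate of eq. (5.20).
samples = rng.choice(2**N, size=N_k, p=true_p)
counts = np.bincount(samples, minlength=2**N)   # N_k(z) for each state z
p_hat = counts / N_k                            # P_hat(z | omega_k) = N_k(z) / N_k

# Variance of each estimate according to eq. (5.21).
var_hat = true_p * (1 - true_p) / N_k
```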
For small $N$ and a large training set, this estimator indeed suffices. However, if $N$ is too large, the estimator fails. A small example demonstrates this. Suppose the dimension of the vector is $N = 10$. Then the total number of states is $2^{10} \approx 10^3$. Therefore, some states will have a probability of less than $10^{-3}$. The uncertainty of the estimated probabilities must be a fraction of that, say $10^{-4}$. The number of samples, $N_k$, needed to guarantee such a precision is on the order of $10^5$ or more. Needless to say, in many applications $10^5$ samples is much too expensive. Moreover, with even a slight increase of $N$ the required number of samples becomes much larger.
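The sample-size argument above can be checked numerically; the following short Python sketch just carries out the arithmetic, with the figures taken from the text:

```python
# Worked numbers from the text: an N = 10 dimensional binary vector.
N = 10
n_states = 2**N            # 1024, i.e. about 10**3 states
p = 1e-3                   # probability of one of the rarer states
target_std = 1e-4          # required uncertainty, a fraction of p

# From eq. (5.21): std = sqrt(p * (1 - p) / N_k) <= target_std.
# Solving for N_k gives the required training-set size per class:
N_k_required = p * (1 - p) / target_std**2   # on the order of 1e5
```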
One way to avoid a large variance is to incorporate more prior knowledge. For instance, without the availability of a training set, it is known beforehand that all parameters are bounded by $0 \le P(z \mid \omega_k) \le 1$. If nothing further is known, we could first 'guess' that all states are equally likely: $P(z \mid \omega_k) = 2^{-N}$. Based on this guess, the estimator takes the form:


\[
\hat{P}(z \mid \omega_k) = \frac{N_k(z) + 1}{N_k + 2^N} \tag{5.22}
\]

            The variance of the estimate is:


\[
\operatorname{Var}\!\left[\hat{P}(z \mid \omega_k)\right] = \frac{N_k \, P(z \mid \omega_k)\,\bigl(1 - P(z \mid \omega_k)\bigr)}{\bigl(N_k + 2^N\bigr)^2} \tag{5.23}
\]
Comparing (5.22) and (5.23) with (5.20) and (5.21), the conclusion is that the variance of the estimate is reduced at the cost of a small bias. See also Exercise 4.
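To see this bias/variance trade-off concretely, the two variance formulas can be evaluated side by side. A minimal Python sketch, with assumed values for $N$, $N_k$ and $P(z \mid \omega_k)$:

```python
# Compare the variances of the plain estimator (5.21) and the
# smoothed estimator (5.23); the numbers are illustrative only.
N = 10                     # dimension of the binary vector
N_k = 1000                 # training-set size for class omega_k
p = 1e-3                   # assumed true P(z | omega_k) for some state z

var_plain = p * (1 - p) / N_k                        # eq. (5.21)
var_smooth = N_k * p * (1 - p) / (N_k + 2**N)**2     # eq. (5.23)

# The smoothed estimate carries a small bias: its expectation is
# (N_k * p + 1) / (N_k + 2**N) rather than p itself.
bias = (N_k * p + 1) / (N_k + 2**N) - p
```

With these numbers the smoothed variance is a few times smaller than the plain one, while the bias stays well below the target uncertainty of the earlier example.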




            5.3   NONPARAMETRIC LEARNING

            Nonparametric methods are learning methods for which prior knowledge
            about the functional form of the conditional probability distributions is