
NONPARAMETRIC LEARNING

called the $\kappa$-nearest neighbours$^{1}$ of $z$. Let $\kappa_k$ denote the number of
samples found with class $\omega_k$. An estimate of the conditional density is
(see (5.28)):



\[
\hat{p}(z \mid \omega_k) = \frac{\kappa_k}{N_k V(z)} \qquad (5.30)
\]

Combining (5.18) and (5.30) in Bayes classification with the uniform
cost function (2.12) produces the following suboptimal classification:



\[
\hat{\omega}(z) = \omega_k \quad \text{with} \quad
k = \operatorname*{argmax}_{i=1,\dots,K} \left\{ \hat{p}(z \mid \omega_i)\,\hat{P}(\omega_i) \right\}
  = \operatorname*{argmax}_{i=1,\dots,K} \left\{ \frac{\kappa_i}{N_i V(z)} \frac{N_i}{N_S} \right\}
  = \operatorname*{argmax}_{i=1,\dots,K} \{ \kappa_i \}
\qquad (5.31)
\]

The interpretation of this classification is simple. Since $\hat{P}(\omega_i) = N_i/N_S$,
the factors $N_i$ cancel and the remaining factor $1/(N_S V(z))$ is the same for
every class, so maximizing the estimated posterior amounts to maximizing $\kappa_i$.
The class assigned to a vector $z$ is thus the class with the maximum number of
votes coming from the samples nearest to $z$. In the literature, this classification
method is known as the $k$-nearest neighbour rule ($k$-NNR; in our nomenclature,
$\kappa$-NNR). The special case in which $\kappa = 1$ is simply referred to as the
nearest neighbour rule (NNR or 1-NNR).
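To make the voting rule in (5.31) concrete, the following minimal MATLAB sketch implements $\kappa$-NNR classification from scratch. The function name knnr_classify and the assumption that class labels are the integers $1,\dots,K$ are illustrative choices, not part of the text or of PRTools.

function labels = knnr_classify(Ztrain, ytrain, Ztest, kappa)
% kappa-NNR according to (5.31): assign each test vector the class with
% the largest number of votes among its kappa nearest training samples.
% Ztrain: N x D training samples (one per row)
% ytrain: N x 1 integer class labels in 1..K
% Ztest:  M x D test samples; kappa: number of neighbours
K = max(ytrain);
M = size(Ztest, 1);
labels = zeros(M, 1);
for n = 1:M
    % squared Euclidean distances to all training samples
    d = sum(bsxfun(@minus, Ztrain, Ztest(n,:)).^2, 2);
    [~, idx] = sort(d);            % order training samples by distance
    nn = ytrain(idx(1:kappa));     % labels of the kappa nearest samples
    votes = zeros(K, 1);
    for i = 1:K
        votes(i) = sum(nn == i);   % kappa_i: votes for class omega_i
    end
    [~, labels(n)] = max(votes);   % argmax over kappa_i, cf. (5.31)
end
end

The exhaustive distance computation costs $O(N)$ per test sample; for the small data sets considered here this is perfectly adequate.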

Example 5.3   Classification of mechanical parts, NNR classification
PRTools can be used to perform $\kappa$-nearest neighbour classification.
Listing 5.4 shows how a $\kappa$-nearest neighbour classifier is trained on
the mechanical parts data set of Example 2.2. If $\kappa$ is not specified, it
will be found by minimizing the leave-one-out error. For this data set,
the optimal $\kappa$ is 7. The resulting decision boundaries are shown in
Figure 5.5(a). If we set $\kappa$ to 1, the classifier will classify all samples in
the training set correctly (see Figure 5.5(b)), but its performance on a
test set will be worse.
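Listing 5.4 itself does not appear on this page. The following sketch shows the typical PRTools calls for this example, assuming the mechanical parts data of Example 2.2 are available as a labelled PRTools dataset z (the variable name is an assumption). In PRTools, knnc called without a second argument optimizes the number of neighbours with respect to the leave-one-out error, as the text describes.

% assumes PRTools is on the path and z is a labelled PRTools dataset
% containing the mechanical parts data of Example 2.2
w = knnc(z);          % kappa found by minimizing the leave-one-out error
figure; scatterd(z);  % scatter plot of the training samples
plotc(w);             % decision boundaries, cf. Figure 5.5(a)

w1 = knnc(z, 1);      % nearest neighbour rule, kappa = 1
figure; scatterd(z);
plotc(w1);            % decision boundaries, cf. Figure 5.5(b)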




$^{1}$ The literature about nearest neighbour classification often uses the symbol $k$ to denote the
number of samples in a volume. However, in order to avoid confusion with symbols like $\omega_k$, $T_k$,
etc., we prefer to use $\kappa$.