
(We recall that $\theta_n$ is the class label of sample $\mathbf{y}_n$.) This target function aims at a classification with minimum error rate.
We can now apply a least squares criterion to find the weight vectors:


$$ J_{LS} = \sum_{n=1}^{N_S} \sum_{k=1}^{K} \left( \mathbf{w}_k^T \mathbf{y}_n - t_{n,k} \right)^2 \qquad (5.48) $$
The values of $\mathbf{w}_k$ that minimize $J_{LS}$ are the weight vectors of the least squared error criterion.
The solution can be found by rephrasing the problem in a different notation. Let $\mathbf{Y} = [\,\mathbf{y}_1 \;\cdots\; \mathbf{y}_{N_S}\,]^T$ be an $N_S \times (N+1)$ matrix, $\mathbf{W} = [\,\mathbf{w}_1 \;\cdots\; \mathbf{w}_K\,]$ an $(N+1) \times K$ matrix, and $\mathbf{T} = [\,\mathbf{t}_1 \;\cdots\; \mathbf{t}_{N_S}\,]^T$ an $N_S \times K$ matrix. Then:
$$ J_{LS} = \left\| \mathbf{Y}\mathbf{W} - \mathbf{T} \right\|^2 \qquad (5.49) $$
where $\|\cdot\|^2$ is the Euclidean matrix norm, i.e. the sum of squared elements. The value of $\mathbf{W}$ that minimizes $J_{LS}$ is the LS solution to the problem:

$$ \mathbf{W}_{LS} = \left( \mathbf{Y}^T \mathbf{Y} \right)^{-1} \mathbf{Y}^T \mathbf{T} \qquad (5.50) $$

Of course, the solution is only valid if $(\mathbf{Y}^T \mathbf{Y})^{-1}$ exists. The matrix $(\mathbf{Y}^T \mathbf{Y})^{-1} \mathbf{Y}^T$ is the pseudo inverse of $\mathbf{Y}$; see (3.25).
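As an illustration, the following is a minimal MATLAB sketch of (5.48)-(5.50). It assumes the measurements are collected in an $N_S \times N$ matrix X and the class labels in a vector lab with integer values 1..K; these variable names are ours, not from the text, and the targets use the minimum error rate coding recalled above ($t_{n,k} = 1$ if $\theta_n = \omega_k$, else 0).

  % Direct implementation of (5.48)-(5.50); a minimal sketch, not PRTools code.
  % Assumed inputs (hypothetical names): X is the NS x N measurement matrix,
  % lab is an NS x 1 vector of class labels coded as integers 1..K.
  [NS, N] = size(X);
  K = max(lab);

  Y = [X ones(NS, 1)];                    % augmented samples: NS x (N+1)
  T = zeros(NS, K);                       % place-coded targets t_{n,k}
  T(sub2ind([NS K], (1:NS)', lab)) = 1;   % t_{n,k} = 1 iff theta_n = omega_k

  W_LS = pinv(Y) * T;                     % pseudo inverse, cf. (5.50) and (3.25)

  % Classify by the maximum discriminant g_k(y) = w_k' * y, cf. (5.35):
  [~, assigned] = max(Y * W_LS, [], 2);
  trainErr = mean(assigned ~= lab);       % apparent (training) error rate

Note that MATLAB's pinv computes the Moore-Penrose pseudo inverse via the SVD, so the sketch remains usable even when $\mathbf{Y}^T \mathbf{Y}$ is (nearly) singular.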
              An interesting target function is:
$$ t_{n,k} = C(\omega_k \,|\, \theta_n) \qquad (5.51) $$

Here, $\mathbf{t}_n$ embeds the cost that is involved if the assigned class is $\omega_k$ whereas the true class is $\theta_n$. This target function aims at a classification with minimal risk, and the discriminant function $g_k(\mathbf{y})$ attempts to approximate the risk $\sum_{i=1}^{K} C(\omega_k|\omega_i)\, P(\omega_i|\mathbf{y})$ by linear LS fitting. The decision function in (5.35) should now involve a minimization rather than a maximization.
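Under the same assumptions as the sketch above, the minimum-risk variant only changes the targets and the sense of the decision. Here C is an assumed $K \times K$ cost matrix with C(k,i) the cost of assigning $\omega_k$ when the true class is $\omega_i$:

  % Minimum-risk targets, cf. (5.51); C is an assumed K x K cost matrix.
  Trisk = C(:, lab)';                     % row n holds C(omega_k | theta_n), k = 1..K
  Wrisk = pinv(Y) * Trisk;                % same LS fit as in (5.50)
  [~, assigned] = min(Y * Wrisk, [], 2);  % minimize the fitted risk, cf. (5.35)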
              Example 5.5 illustrates how the least squared error classifier can be
            found in PRTools.

Example 5.5   Classification of mechanical parts, perceptron and least squared error classifier

Decision boundaries for the mechanical parts example are shown in Figure 5.9(a) (perceptron) and Figure 5.9(b) (least squared error classifier).
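For reference, a hedged PRTools sketch along the lines of this example; it assumes the mechanical parts measurements are available as a labelled PRTools dataset z (the actual dataset name in the book may differ), and uses the standard PRTools routines perlc (linear perceptron) and fisherc (Fisher's least square linear classifier):

  % PRTools sketch for Example 5.5; 'z' is an assumed labelled dataset.
  w_p  = perlc(z);                        % linear perceptron classifier
  w_ls = fisherc(z);                      % least squared error (Fisher) classifier
  figure; scatterd(z); plotc(w_p);        % decision boundary as in Figure 5.9(a)
  figure; scatterd(z); plotc(w_ls);       % decision boundary as in Figure 5.9(b)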