
where i is the iteration count. The iteration procedure stops when
w(i + 1) = w(i), i.e. when all samples in the training set are classified
correctly. If such a solution exists, that is, if the training set is linearly
separable, the perceptron learning rule will find it.
Instead of processing the full training set in one update step (so-called
batch processing), we can also cycle through the training set and update
the weight vector whenever a misclassified sample has been encountered
(single-sample processing). If y_n is a misclassified sample, then the
learning rule becomes:

    w(i + 1) = w(i) + c_n y_n                                  (5.45)

The variable c_n is +1 if y_n is a misclassified ω_1 sample. If y_n is a
misclassified ω_2 sample, then c_n = -1.
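
As an illustration, the following plain MATLAB sketch (not PRTools)
implements the single-sample rule (5.45). The function and variable names
(perceptron_train, Y, labels, maxpass) are illustrative assumptions, not
taken from the book; maxpass is an added safeguard against endless cycling
on non-separable training sets.

    % Sketch of single-sample perceptron training, rule (5.45).
    % Y:       N x (D+1) matrix of augmented samples (rows are y_n')
    % labels:  N x 1 vector, entries 1 (class omega_1) or 2 (class omega_2)
    % maxpass: safeguard against endless cycling on non-separable sets
    function w = perceptron_train(Y, labels, maxpass)
      [N, Daug] = size(Y);
      w = zeros(Daug, 1);                   % initial weight vector
      c = (labels == 1) - (labels == 2);    % c_n = +1 for omega_1, -1 for omega_2
      for pass = 1:maxpass
        changed = false;
        for n = 1:N
          if c(n) * (Y(n,:) * w) <= 0       % y_n is misclassified
            w = w + c(n) * Y(n,:)';         % update (5.45)
            changed = true;
          end
        end
        if ~changed, break; end             % all samples classified correctly
      end
    end

A sample is counted as misclassified when c_n w^T y_n <= 0, so a full pass
without updates corresponds exactly to the stopping condition w(i + 1) = w(i).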
The error correction procedure of (5.45) can be extended to the multi-
class problem as follows. Let g_k(y) = w_k^T y as before. We cycle through
the training set and update the weight vectors w_k and w_j whenever an ω_k
sample is classified as class ω_j:

    w_k → w_k + y_n
                                                               (5.46)
    w_j → w_j - y_n

            The procedure will converge in a finite number of iterations provided
            that the training set is linearly separable. Perceptron training is illus-
            trated in Example 5.5.
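
A minimal MATLAB sketch of the multi-class procedure (5.46) follows. The
names (perceptron_multiclass, Y, labels, W, maxpass) are illustrative
assumptions; as above, maxpass is an added guard for non-separable data,
beyond the finite convergence guaranteed in the separable case.

    % Sketch of the multi-class error correction procedure (5.46).
    % Y:      N x (D+1) matrix of augmented samples (rows are y_n')
    % labels: N x 1 vector of true class indices in 1..K
    % W:      (D+1) x K matrix; column k holds w_k, so g_k(y) = w_k' * y
    function W = perceptron_multiclass(Y, labels, K, maxpass)
      [N, Daug] = size(Y);
      W = zeros(Daug, K);
      for pass = 1:maxpass
        changed = false;
        for n = 1:N
          [~, j] = max(Y(n,:) * W);        % assigned class: largest g_k(y_n)
          k = labels(n);                   % true class omega_k
          if j ~= k                        % omega_k sample classified as omega_j
            W(:,k) = W(:,k) + Y(n,:)';     % w_k -> w_k + y_n
            W(:,j) = W(:,j) - Y(n,:)';     % w_j -> w_j - y_n
            changed = true;
          end
        end
        if ~changed, break; end            % no misclassifications: converged
      end
    end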

            Least squared error learning

A disadvantage of the perceptron learning rule is that it only works well
in separable cases. If the training set is not separable, the iterative
procedure tends to keep fluctuating around some value without converging.
The procedure must then be terminated at some arbitrary point, and it is
questionable whether the corresponding solution is useful.
Non-separable cases can be handled if we change the performance
measure such that its maximization boils down to solving a set of linear
equations. Such a situation is created if we introduce a set of so-called
target vectors. A target vector t_n is a K-dimensional vector associated
with the (augmented) sample y_n. Its value reflects the desired response of
the discriminant function to y_n. The simplest one is place coding:

    t_{n,k} = { 1   if θ_n = ω_k
              { 0   otherwise                                  (5.47)

where θ_n denotes the true class of the sample y_n.
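
To make place coding concrete, the following MATLAB fragment builds the
N x K target matrix whose n-th row is t_n. The names (labels, K, T) are
illustrative assumptions.

    % Sketch: place-coded targets (5.47) for N samples and K classes.
    % labels: N x 1 vector of true class indices theta_n in 1..K
    N = numel(labels);
    T = zeros(N, K);                            % T(n,k) will hold t_{n,k}
    T(sub2ind([N, K], (1:N)', labels(:))) = 1;  % t_{n,k} = 1 where theta_n = omega_k

Each row of T then contains a single 1 at the position of the true class,
exactly as prescribed by (5.47).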