
where i is the iteration count. The iteration procedure stops when
w(i + 1) = w(i), i.e. when all samples in the training set are classified
correctly. If such a solution exists, that is, if the training set is linearly
separable, the perceptron learning rule will find it.
Instead of processing the full training set in one update step (so-called
batch processing), we can also cycle through the training set and update
the weight vector whenever a misclassified sample has been encountered
(single-sample processing). If y_n is a misclassified sample, then the
learning rule becomes:

    w(i + 1) = w(i) + c_n y_n                                  (5.45)

The variable c_n is +1 if y_n is a misclassified ω_1 sample. If y_n is a
misclassified ω_2 sample, then c_n = -1.
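
As an illustration, the following plain MATLAB sketch (not PRTools)
implements the single-sample rule (5.45). The function and variable names
(perceptron_train, Y, labels, maxpass) are illustrative assumptions, not
taken from the book; maxpass is an added safeguard against endless cycling
on non-separable training sets.

    % Sketch of single-sample perceptron training, rule (5.45).
    % Y:       N x (D+1) matrix of augmented samples (rows are y_n')
    % labels:  N x 1 vector, entries 1 (class omega_1) or 2 (class omega_2)
    % maxpass: safeguard against endless cycling on non-separable sets
    function w = perceptron_train(Y, labels, maxpass)
      [N, Daug] = size(Y);
      w = zeros(Daug, 1);                   % initial weight vector
      c = (labels == 1) - (labels == 2);    % c_n = +1 for omega_1, -1 for omega_2
      for pass = 1:maxpass
        changed = false;
        for n = 1:N
          if c(n) * (Y(n,:) * w) <= 0       % y_n is misclassified
            w = w + c(n) * Y(n,:)';         % update (5.45)
            changed = true;
          end
        end
        if ~changed, break; end             % all samples classified correctly
      end
    end

A sample is counted as misclassified when c_n w^T y_n <= 0, so a full pass
without updates corresponds exactly to the stopping condition w(i + 1) = w(i).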
The error correction procedure of (5.45) can be extended to the multi-
class problem as follows. Let g_k(y) = w_k^T y as before. We cycle through
the training set and update the weight vectors w_k and w_j whenever an ω_k
sample is classified as class ω_j:

    w_k → w_k + y_n
                                                               (5.46)
    w_j → w_j - y_n

            The procedure will converge in a finite number of iterations provided
            that the training set is linearly separable. Perceptron training is illus-
            trated in Example 5.5.
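
A minimal MATLAB sketch of the multi-class procedure (5.46) follows. The
names (perceptron_multiclass, Y, labels, W, maxpass) are illustrative
assumptions; as above, maxpass is an added guard for non-separable data,
beyond the finite convergence guaranteed in the separable case.

    % Sketch of the multi-class error correction procedure (5.46).
    % Y:      N x (D+1) matrix of augmented samples (rows are y_n')
    % labels: N x 1 vector of true class indices in 1..K
    % W:      (D+1) x K matrix; column k holds w_k, so g_k(y) = w_k' * y
    function W = perceptron_multiclass(Y, labels, K, maxpass)
      [N, Daug] = size(Y);
      W = zeros(Daug, K);
      for pass = 1:maxpass
        changed = false;
        for n = 1:N
          [~, j] = max(Y(n,:) * W);        % assigned class: largest g_k(y_n)
          k = labels(n);                   % true class omega_k
          if j ~= k                        % omega_k sample classified as omega_j
            W(:,k) = W(:,k) + Y(n,:)';     % w_k -> w_k + y_n
            W(:,j) = W(:,j) - Y(n,:)';     % w_j -> w_j - y_n
            changed = true;
          end
        end
        if ~changed, break; end            % no misclassifications: converged
      end
    end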

            Least squared error learning

A disadvantage of the perceptron learning rule is that it only works well
in separable cases. If the training set is not separable, the iterative
procedure tends to keep fluctuating around some value without converging.
The procedure must then be terminated at some arbitrary point, and it is
questionable whether the corresponding solution is useful.
Non-separable cases can be handled if we change the performance
measure such that its maximization boils down to solving a set of linear
equations. Such a situation is created if we introduce a set of so-called
target vectors. A target vector t_n is a K-dimensional vector associated
with the (augmented) sample y_n. Its value reflects the desired response of
the discriminant function to y_n. The simplest one is place coding:

    t_{n,k} = { 1   if θ_n = ω_k
              { 0   otherwise                                  (5.47)

where θ_n denotes the true class of the sample y_n.
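
To make place coding concrete, the following MATLAB fragment builds the
N x K target matrix whose n-th row is t_n. The names (labels, K, T) are
illustrative assumptions.

    % Sketch: place-coded targets (5.47) for N samples and K classes.
    % labels: N x 1 vector of true class indices theta_n in 1..K
    N = numel(labels);
    T = zeros(N, K);                            % T(n,k) will hold t_{n,k}
    T(sub2ind([N, K], (1:N)', labels(:))) = 1;  % t_{n,k} = 1 where theta_n = omega_k

Each row of T then contains a single 1 at the position of the true class,
exactly as prescribed by (5.47).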