Page 166 - Introduction to Statistical Pattern Recognition
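\[ \bar{\varepsilon}^2 = \frac{1}{N}\sum_{j=1}^{N}\left\{W^T Z_j - |W^T Z_j|\right\}^2 , \tag{4.74} \]
\[ \bar{\varepsilon}^2 = \frac{1}{N}\sum_{j=1}^{N}\left\{\mathrm{sign}(W^T Z_j) - 1\right\}^2 , \tag{4.75} \]
\[ \bar{\varepsilon}^2 = \frac{1}{N}\sum_{j=1}^{N}\left\{W^T Z_j - \gamma(Z_j)\right\}^2 , \tag{4.76} \]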
$\gamma(Z_j)$: a variable with constraint $\gamma(Z_j) > 0$,
where $N$ is the total number of samples, and sign($\cdot$) is either $+1$ or $-1$ depending
on the sign of its argument. In (4.74), $\gamma(Z_j)$ is selected as $|W^T Z_j|$ so that
a contribution to $\bar{\varepsilon}^2$ is made, with $(W^T Z_j)^2$ weighting, only when $W^T Z_j < 0$.
On the other hand, (4.75) counts the number of samples which give $W^T Z_j < 0$.
In the third criterion, we adjust the $\gamma(Z_j)$'s as variables along with $W$. However, the
$\gamma(Z_j)$'s are constrained to be positive.
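As a rough illustration of the first two criteria, the sketch below evaluates them for a fixed $W$ on a toy sample set. It assumes both criteria are averages over the $N$ samples of a squared difference between $W^T Z_j$ (or its sign) and the selected $\gamma(Z_j)$, which is consistent with the description above but is not quoted from it. The helper names (`dot`, `criterion_abs`, `criterion_sign`), the toy data, and the convention sign(0) = +1 are all assumptions made for the example.

```python
def dot(w, z):
    """Inner product W^T Z for two equal-length sequences."""
    return sum(wi * zi for wi, zi in zip(w, z))


def sign(x):
    """+1 or -1 by the sign of x; treating sign(0) as +1 is an assumption."""
    return 1.0 if x >= 0 else -1.0


def criterion_abs(w, samples):
    """Mean of (W^T Z_j - |W^T Z_j|)^2.

    A sample contributes only when W^T Z_j < 0, and then with a weight
    proportional to (W^T Z_j)^2.
    """
    return sum((dot(w, z) - abs(dot(w, z))) ** 2 for z in samples) / len(samples)


def criterion_sign(w, samples):
    """Mean of (sign(W^T Z_j) - 1)^2.

    Each sample with W^T Z_j < 0 adds the same constant, so the value is
    proportional to the number of such samples.
    """
    return sum((sign(dot(w, z)) - 1.0) ** 2 for z in samples) / len(samples)


# Toy data (hypothetical): W^T Z_j takes the values 2, -3, 0.5, -1.
samples = [(1.0, 2.0), (1.0, -3.0), (1.0, 0.5), (1.0, -1.0)]
w = (0.0, 1.0)

print(criterion_abs(w, samples))   # weighted by the size of the negative values
print(criterion_sign(w, samples))  # depends only on how many values are negative
```

Two of the four samples give $W^T Z_j < 0$. The first criterion penalizes the sample at $-3$ much more heavily than the one at $-1$, while the second treats them equally.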
These criteria perform well, but, because of the nonlinear functions such
as $|\cdot|$, sign($\cdot$), and $\gamma(Z_j) > 0$, explicit solutions for the $W$ which minimizes these
criteria are hard to obtain. Therefore, a search technique, such as the gradient
method, must be used to find the optimum $W$.
The gradient method for minimizing a criterion is given by
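\[ W(\ell+1) = W(\ell) - \rho \left.\frac{\partial \bar{\varepsilon}^2}{\partial W}\right|_{W=W(\ell)} , \tag{4.77} \]
where $\ell$ indicates the $\ell$th iterative step, and $\rho$ is a positive constant.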
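The iteration can be sketched in a few lines. The example below is only an illustration: it applies the update to a differentiable mean-square criterion with fixed desired outputs, $\frac{1}{N}\sum_j (W^T Z_j - \gamma_j)^2$, whose gradient is $\frac{2}{N}\sum_j (W^T Z_j - \gamma_j) Z_j$. The function names, the toy data, the step size `rho = 0.1`, and the fixed iteration count are assumptions chosen for the example, not values taken from the text.

```python
def dot(w, z):
    """Inner product W^T Z for two equal-length sequences."""
    return sum(wi * zi for wi, zi in zip(w, z))


def mse_gradient(w, samples, targets):
    """Gradient of (1/N) * sum_j (W^T Z_j - gamma_j)^2 with respect to W."""
    n = len(samples)
    grad = [0.0] * len(w)
    for z, gamma in zip(samples, targets):
        err = dot(w, z) - gamma
        for i, zi in enumerate(z):
            grad[i] += 2.0 * err * zi / n
    return grad


def gradient_step(w, grad, rho):
    """One update: W(l+1) = W(l) - rho * gradient evaluated at W(l)."""
    return [wi - rho * gi for wi, gi in zip(w, grad)]


def minimize(w0, samples, targets, rho=0.1, steps=500):
    """Repeat the update a fixed number of times (no convergence test)."""
    w = list(w0)
    for _ in range(steps):
        w = gradient_step(w, mse_gradient(w, samples, targets), rho)
    return w


# Toy data (hypothetical): the targets are matched exactly by W = [1, 1].
samples = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
targets = [1.0, 2.0, 3.0]

w = minimize([0.0, 0.0], samples, targets)
print(w)  # approaches [1.0, 1.0] for this step size
```

The positive constant `rho` has to be small enough for the iteration to be stable. For this quadratic criterion, too large a value makes the updates overshoot and diverge, and too small a value slows convergence.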
Again, we cannot calculate $\partial\bar{\varepsilon}^2/\partial W$ because of the nonlinear functions
involved in $\bar{\varepsilon}^2$. However, in the linear case of (4.64), $\partial\bar{\varepsilon}^2/\partial W$ can be obtained
as follows. Replacing the expectation of (4.64) by the sample mean,