Page 166 - Introduction to Statistical Pattern Recognition




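$$\bar{\varepsilon}^2 = \frac{1}{N}\sum_{j=1}^{N}\left(W^T Z_j - |W^T Z_j|\right)^2 , \tag{4.74}$$

$$\bar{\varepsilon}^2 = \frac{1}{N}\sum_{j=1}^{N}\left(\mathrm{sign}(W^T Z_j) - 1\right)^2 , \tag{4.75}$$

$$\bar{\varepsilon}^2 = \frac{1}{N}\sum_{j=1}^{N}\left(W^T Z_j - \gamma(Z_j)\right)^2 , \tag{4.76}$$

$\gamma(Z_j)$: a variable with constraint $\gamma(Z_j) > 0$,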


where $N$ is the total number of samples, and sign(·) is either +1 or −1 depending on the sign of its argument. In (4.74), $\gamma(Z_j)$ is selected as $|W^T Z_j|$ so that a sample contributes to $\bar{\varepsilon}^2$ only when $W^T Z_j < 0$, with $(W^T Z_j)^2$ weighting. On the other hand, (4.75) counts the number of samples which give $W^T Z_j < 0$. In the third criterion, we adjust the $\gamma(Z_j)$'s as variables along with $W$. However, the $\gamma(Z_j)$'s are constrained to be positive.
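As a rough illustration, the three criteria can be evaluated directly from a set of samples. The sketch below assumes they take the forms (1/N) Σ (W^T Z_j − |W^T Z_j|)^2, (1/N) Σ (sign(W^T Z_j) − 1)^2, and (1/N) Σ (W^T Z_j − γ(Z_j))^2 respectively. The function names, the toy data, and the convention sign(0) = +1 are assumptions made here for illustration and are not taken from the text.

```python
def dot(w, z):
    """Inner product W^T Z for two equal-length sequences."""
    return sum(wi * zi for wi, zi in zip(w, z))


def sign(x):
    """+1 or -1 depending on the sign of x (sign(0) = +1 is an assumption)."""
    return 1.0 if x >= 0 else -1.0


def criterion_abs(w, samples):
    """Assumed form of (4.74): only samples with W^T Z_j < 0 contribute."""
    return sum((dot(w, z) - abs(dot(w, z))) ** 2 for z in samples) / len(samples)


def criterion_sign(w, samples):
    """Assumed form of (4.75): each sample with W^T Z_j < 0 adds the same amount."""
    return sum((sign(dot(w, z)) - 1.0) ** 2 for z in samples) / len(samples)


def criterion_gamma(w, samples, gammas):
    """Assumed form of (4.76): desired outputs gamma_j must be positive."""
    if any(g <= 0 for g in gammas):
        raise ValueError("every gamma(Z_j) must be positive")
    return sum((dot(w, z) - g) ** 2 for z, g in zip(samples, gammas)) / len(samples)


# Toy data (hypothetical): three samples, one of which gives W^T Z_j < 0.
samples = [[1.0, 2.0], [1.0, -3.0], [1.0, 0.5]]
w = [0.0, 1.0]

e_abs = criterion_abs(w, samples)
e_sign = criterion_sign(w, samples)
e_gamma = criterion_gamma(w, samples, gammas=[2.0, 1.0, 0.5])
```

Only the second sample has W^T Z_j < 0, so it alone contributes to the first two criteria: with a squared, magnitude-dependent term in the first case and a fixed amount in the second.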
These criteria perform well, but, because of the nonlinear functions such as $|\cdot|$, sign(·), and $\gamma(Z_j) > 0$, explicit solutions for the $W$ that minimizes these criteria are hard to obtain. Therefore, a search technique, such as the gradient method, must be used to find the optimum $W$.
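The gradient method for minimizing a criterion is given by

$$W(\ell+1) = W(\ell) - \rho \left.\frac{\partial \bar{\varepsilon}^2}{\partial W}\right|_{W = W(\ell)} , \tag{4.77}$$

where $\ell$ indicates the $\ell$th iterative step, and $\rho$ is a positive constant.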
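The iteration can be sketched in a few lines. Purely for illustration, the example below applies it to a mean-square error with fixed desired outputs, (1/N) Σ (W^T Z_j − γ_j)^2, whose gradient (2/N) Σ (W^T Z_j − γ_j) Z_j is available in closed form. The function names, the step size, the iteration count, and the toy data are all assumptions chosen here, not values from the text.

```python
def mse_gradient(w, samples, gammas):
    """Gradient of (1/N) * sum_j (W^T Z_j - gamma_j)^2 with respect to W."""
    n = len(samples)
    grad = [0.0] * len(w)
    for z, g in zip(samples, gammas):
        residual = sum(wi * zi for wi, zi in zip(w, z)) - g
        for i, zi in enumerate(z):
            grad[i] += 2.0 * residual * zi / n
    return grad


def gradient_descent(w0, grad_fn, rho=0.1, steps=500):
    """Repeat W <- W - rho * grad_fn(W) for a fixed number of steps."""
    w = list(w0)
    for _ in range(steps):
        g = grad_fn(w)
        w = [wi - rho * gi for wi, gi in zip(w, g)]
    return w


# Toy data (hypothetical): the desired outputs are fitted exactly by W = [1, 2].
samples = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
gammas = [1.0, 3.0, 5.0]

w_opt = gradient_descent(
    [0.0, 0.0],
    lambda w: mse_gradient(w, samples, gammas),
)
```

The step size matters: if ρ is too large for the data at hand, the iteration diverges, so in practice ρ is usually tuned or reduced as the search proceeds.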
Again, we cannot calculate $\partial\bar{\varepsilon}^2/\partial W$ because of the nonlinear functions involved in $\bar{\varepsilon}^2$. However, in the linear case of (4.64), $\partial\bar{\varepsilon}^2/\partial W$ can be obtained as follows. Replacing the expectation of (4.64) by the sample mean,