


solution of (5-93) depends only on the dot products $\mathbf{x}_i'\mathbf{x}_j$. Let us denote by $\alpha_i^*$ the optimal Lagrange multipliers. The optimal weight vector is, therefore, from (5-91):

$$\mathbf{w}^* = \sum_{i=1}^{n} \alpha_i^* y_i \mathbf{x}_i .$$
   The optimal bias can be derived using the optimal weight vector and condition (5-86), which holds with equality at the support vectors, as follows:

$$b^* = -\tfrac{1}{2}\,\mathbf{w}^{*\prime}\left(\mathbf{x}_r + \mathbf{x}_s\right),$$

where $\mathbf{x}_r$ and $\mathbf{x}_s$ are any support vectors from the +1 and -1 classes, respectively.
Further details concerning this optimisation problem can be found in (Fletcher, 1999), namely a discussion of the so-called Kuhn-Tucker conditions, which constrain the Lagrange multipliers of all vectors except the support vectors to be zero.
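   To make these two formulas concrete, the following sketch computes $\mathbf{w}^*$ and $b^*$ from a given set of multipliers. The data and multiplier values are hypothetical, chosen so the answer is easy to verify by inspection, and are not taken from the text:

    import numpy as np

    # Hypothetical two-point problem (one support vector per class);
    # the multiplier values are assumed, not computed here.
    X = np.array([[1.0, 1.0],    # class +1
                  [3.0, 1.0]])   # class -1
    y = np.array([1.0, -1.0])
    alpha = np.array([0.5, 0.5])

    # (5-91): w* = sum_i alpha_i* y_i x_i
    w = (alpha * y) @ X                # -> [-1.  0.]

    # Bias from one support vector of each class (alpha_i* > 0):
    # b* = -1/2 w*'(x_r + x_s)
    sv = alpha > 1e-9
    r = np.where(sv & (y > 0))[0][0]   # a +1 support vector
    s = np.where(sv & (y < 0))[0][0]   # a -1 support vector
    b = -0.5 * w @ (X[r] + X[s])       # -> 2.0

    print(w, b)  # decision boundary: -x1 + 2 = 0, i.e. x1 = 2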
   Let us see how the Lagrange multipliers method works in a simple example, depicted in Figure 5.45, with two points, $\mathbf{x}_1 = [0\;\;0]'$ and $\mathbf{x}_2 = [1\;\;0]'$, for class +1, and two points, $\mathbf{x}_3 = [2\;\;0]'$ and $\mathbf{x}_4 = [0\;\;2]'$, for class -1.
                          Figure 5.45.  An optimal linear discriminant derived by quadratic programming.
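   The discriminant of Figure 5.45 can be reproduced with an off-the-shelf quadratic-programming solver. A minimal sketch, assuming scikit-learn is available (the text derives the same solution by hand below); a very large C approximates the hard-margin problem:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[0.0, 0.0], [1.0, 0.0],    # class +1
                  [2.0, 0.0], [0.0, 2.0]])   # class -1
    y = np.array([1, 1, -1, -1])

    clf = SVC(kernel='linear', C=1e6)
    clf.fit(X, y)

    print(clf.support_)               # indices of the support vectors
    print(clf.dual_coef_)             # y_i * alpha_i* for each support vector
    print(clf.coef_, clf.intercept_)  # w* and b*

Up to numerical tolerance this yields $\mathbf{w}^* = [-2\;\;-2]'$ and $b^* = 3$, i.e. the boundary $x_1 + x_2 = 1.5$, with $\mathbf{x}_2$, $\mathbf{x}_3$ and $\mathbf{x}_4$ as the support vectors.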



   We start by solving the dual problem, where the function $Q(\boldsymbol{\alpha})$ is readily determined from the dot products of the sample vectors:

$$Q(\boldsymbol{\alpha}) = \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - \tfrac{1}{2}\left(\alpha_2^2 - 4\alpha_2\alpha_3 + 4\alpha_3^2 + 4\alpha_4^2\right).$$
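   Before differentiating analytically, the maximiser of this expression can be checked numerically. A minimal sketch, assuming scipy is available and that condition (5-92) is the usual equality constraint $\sum_i y_i \alpha_i = 0$:

    import numpy as np
    from scipy.optimize import minimize

    X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])

    # Matrix of the dual quadratic form: H_ij = y_i y_j x_i'x_j,
    # so that Q(alpha) = sum(alpha) - 1/2 alpha'H alpha.
    H = (y[:, None] * y[None, :]) * (X @ X.T)
    neg_Q = lambda a: -(a.sum() - 0.5 * a @ H @ a)

    # Maximise Q subject to alpha_i >= 0 and sum_i y_i alpha_i = 0.
    res = minimize(neg_Q, x0=np.ones(4), method='SLSQP',
                   bounds=[(0, None)] * 4,
                   constraints=[{'type': 'eq', 'fun': lambda a: a @ y}])
    print(res.x)  # approximately [0, 4, 3, 1]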
   Differentiating with respect to the $\alpha_i$ and using condition (5-92), the following system of equations is obtained: