


solution of (5-93) depends only on the dot products $\mathbf{x}_i'\mathbf{x}_j$. Let us denote by $\alpha_i^*$ the optimal Lagrange multipliers. The optimal weight vector is, therefore, from (5-91):

$$\mathbf{w}^* = \sum_{i=1}^{n} \alpha_i^* y_i \mathbf{x}_i .$$
   The optimal bias can be derived using the optimal weight vector and condition (5-86), which holds with equality at the support vectors, as follows:

$$b^* = -\tfrac{1}{2}\,\mathbf{w}^{*\prime}\left(\mathbf{x}_r + \mathbf{x}_s\right),$$

where $\mathbf{x}_r$ and $\mathbf{x}_s$ are any support vectors from the +1 and -1 classes, respectively.
Further details concerning this optimisation problem can be found in (Fletcher, 1999), namely a discussion of the so-called Kuhn-Tucker conditions, which constrain the Lagrange multipliers of all vectors except the support vectors to be zero.
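   To make these two formulas concrete, the following sketch computes $\mathbf{w}^*$ and $b^*$ from a given set of multipliers. The data and multiplier values are hypothetical, chosen so the answer is easy to verify by inspection, and are not taken from the text:

    import numpy as np

    # Hypothetical two-point problem (one support vector per class);
    # the multiplier values are assumed, not computed here.
    X = np.array([[1.0, 1.0],    # class +1
                  [3.0, 1.0]])   # class -1
    y = np.array([1.0, -1.0])
    alpha = np.array([0.5, 0.5])

    # (5-91): w* = sum_i alpha_i* y_i x_i
    w = (alpha * y) @ X                # -> [-1.  0.]

    # Bias from one support vector of each class (alpha_i* > 0):
    # b* = -1/2 w*'(x_r + x_s)
    sv = alpha > 1e-9
    r = np.where(sv & (y > 0))[0][0]   # a +1 support vector
    s = np.where(sv & (y < 0))[0][0]   # a -1 support vector
    b = -0.5 * w @ (X[r] + X[s])       # -> 2.0

    print(w, b)  # decision boundary: -x1 + 2 = 0, i.e. x1 = 2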
   Let us see how the Lagrange multipliers method works in a simple example, depicted in Figure 5.45, with two points, $\mathbf{x}_1 = [0\;\;0]'$ and $\mathbf{x}_2 = [1\;\;0]'$, for class +1, and two points, $\mathbf{x}_3 = [2\;\;0]'$ and $\mathbf{x}_4 = [0\;\;2]'$, for class -1.
                          Figure 5.45.  An optimal linear discriminant derived by quadratic programming.
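   The discriminant of Figure 5.45 can be reproduced with an off-the-shelf quadratic-programming solver. A minimal sketch, assuming scikit-learn is available (the text derives the same solution by hand below); a very large C approximates the hard-margin problem:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[0.0, 0.0], [1.0, 0.0],    # class +1
                  [2.0, 0.0], [0.0, 2.0]])   # class -1
    y = np.array([1, 1, -1, -1])

    clf = SVC(kernel='linear', C=1e6)
    clf.fit(X, y)

    print(clf.support_)               # indices of the support vectors
    print(clf.dual_coef_)             # y_i * alpha_i* for each support vector
    print(clf.coef_, clf.intercept_)  # w* and b*

Up to numerical tolerance this yields $\mathbf{w}^* = [-2\;\;-2]'$ and $b^* = 3$, i.e. the boundary $x_1 + x_2 = 1.5$, with $\mathbf{x}_2$, $\mathbf{x}_3$ and $\mathbf{x}_4$ as the support vectors.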



   We start by solving the dual problem, where the function $Q(\boldsymbol{\alpha})$ is readily determined from the dot products of the sample vectors:

$$Q(\boldsymbol{\alpha}) = \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - \tfrac{1}{2}\left(\alpha_2^2 - 4\alpha_2\alpha_3 + 4\alpha_3^2 + 4\alpha_4^2\right).$$
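   Before differentiating analytically, the maximiser of this expression can be checked numerically. A minimal sketch, assuming scipy is available and that condition (5-92) is the usual equality constraint $\sum_i y_i \alpha_i = 0$:

    import numpy as np
    from scipy.optimize import minimize

    X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])

    # Matrix of the dual quadratic form: H_ij = y_i y_j x_i'x_j,
    # so that Q(alpha) = sum(alpha) - 1/2 alpha'H alpha.
    H = (y[:, None] * y[None, :]) * (X @ X.T)
    neg_Q = lambda a: -(a.sum() - 0.5 * a @ H @ a)

    # Maximise Q subject to alpha_i >= 0 and sum_i y_i alpha_i = 0.
    res = minimize(neg_Q, x0=np.ones(4), method='SLSQP',
                   bounds=[(0, None)] * 4,
                   constraints=[{'type': 'eq', 'fun': lambda a: a @ y}])
    print(res.x)  # approximately [0, 4, 3, 1]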
   Differentiating with respect to the $\alpha_i$ and using condition (5-92), the following system of equations is obtained: