152 5 Neural Networks

The gradient is now expressed compactly, using (5-2), as:

Therefore, each weight is updated by summing the following correction:

For the particular case of two classes we need to consider only one linear
decision function. The increment of the weight vector can then be written
compactly as:
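
As a hedged sketch of the three expressions introduced above, the following block writes them out in an assumed notation: w_k is the weight vector of discriminant k, x_i the feature vector of pattern i, z_k(x_i) the discriminant unit output, t_k(x_i) the target value, E_i the error energy of pattern i, and eta the learning rate. The symbols and the factor 1/2 in the energy are assumptions made for illustration; only the structure (deviation of output from target, multiplied by the feature vector) is taken from the text.

```latex
% Assumed per-pattern error energy and linear discriminant output
\[
E_i = \tfrac{1}{2} \sum_{k} \bigl( z_k(\mathbf{x}_i) - t_k(\mathbf{x}_i) \bigr)^2 ,
\qquad
z_k(\mathbf{x}_i) = \mathbf{w}_k' \mathbf{x}_i
\]

% Gradient component for weight j of discriminant k
\[
\frac{\partial E_i}{\partial w_{kj}}
  = \bigl( z_k(\mathbf{x}_i) - t_k(\mathbf{x}_i) \bigr)\, x_{ij}
\]

% Correction added to each weight (gradient descent step)
\[
\Delta w_{kj}
  = -\eta \, \bigl( z_k(\mathbf{x}_i) - t_k(\mathbf{x}_i) \bigr)\, x_{ij}
\]

% Two-class case: a single discriminant, written in vector form
\[
\Delta \mathbf{w}
  = -\eta \, \bigl( z(\mathbf{x}_i) - t(\mathbf{x}_i) \bigr)\, \mathbf{x}_i
\]
```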

This equation shows that the weight vector correction for each pattern depends
on the deviation of the discriminant unit output from the target value,
multiplied by the corresponding feature vector. If we update the weights by using
the total energy function, we just have to sum the derivatives of equation (5-7c) for
all patterns, or equivalently add up the increments expressed by equation (5-7d).
This mode of operation is called the batch mode of gradient descent. An iteration
involving the sum of the increments for all patterns is called an epoch.
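
The batch mode described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the function names `lms_increment`, `lms_epoch` and `energy`, the toy data, the learning rate and the factor 1/2 in the energy are all assumptions made for the example.

```python
import numpy as np

def lms_increment(w, x, t, eta):
    """Per-pattern correction: -eta * (output - target) * feature vector."""
    z = w @ x                       # linear discriminant output for this pattern
    return -eta * (z - t) * x

def lms_epoch(w, X, T, eta):
    """One batch-mode epoch: add up the increments of all patterns, then update."""
    total = np.zeros_like(w)
    for x, t in zip(X, T):
        total += lms_increment(w, x, t, eta)
    return w + total

def energy(w, X, T):
    """Total error energy, assumed here to be half the sum of squared deviations."""
    return 0.5 * np.sum((X @ w - T) ** 2)

# Hypothetical two-class toy data: one feature plus a constant bias input of 1.
X = np.array([[-2.0, 1.0], [-1.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
T = np.array([-1.0, -1.0, 1.0, 1.0])    # target values acting as class labels

w = np.zeros(2)
energies = [energy(w, X, T)]
for _ in range(50):                     # 50 epochs
    w = lms_epoch(w, X, T, eta=0.05)
    energies.append(energy(w, X, T))
```

Updating the weights after every single pattern instead of once per epoch would only require applying `lms_increment` directly inside the loop over patterns.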
Note that LMS adjustment of discriminants produces approximations to target
values whatever they are, be they class labels or not. Therefore, we may as well use
this approach in regression problems.
In fact, even a simple device such as an LMS adjusted discriminant can perform
very useful tasks, notably in regression problems. We now present one such
example: a regression application to adaptive signal filtering.
The theory of adaptive filtering owes much to the work of Bernard Widrow
on adaptive LMS filtering for noise cancelling (see Widrow et al., 1975).
Let us consider an electrocardiographic signal (ECG) with added 50 Hz noise,
induced by the main power supply (a common situation in electrocardiography),
shown in Figure 5.3. The reader can follow this example using the ECG 50Hz.xls
file, where this figure, representing 3.4 seconds of a signal sampled at 500 Hz with
amplitude in microvolts, is included.
In order to remove the noise from the signal we will design an LMS adjusted
discriminant, which will attempt to regress the noise. As the noise has zero mean
we will not need any bias weight. The discriminant just has to use adequate inputs
in order to approximate the amplitude and the phase angle of the sinusoidal noise.
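
A minimal sketch of such a noise regressor follows, under several assumptions that are not taken from the text: the ECG is replaced by a crude synthetic pulse train (not real data), the two inputs are a 50 Hz sine and cosine, the noise amplitude (200) and phase (0.7 rad) are arbitrary, and the weights are adjusted in batch mode with a learning rate picked for this toy problem. The discriminant output estimates the noise, and subtracting it from the noisy signal leaves the ECG estimate.

```python
import numpy as np

fs = 500.0                      # sampling frequency (Hz), as stated for the signal
n = 1700                        # 3.4 s of signal at 500 Hz
t = np.arange(n) / fs

# Hypothetical stand-in for the ECG: narrow pulses every 0.8 s, in microvolts.
clean = 1000.0 * np.exp(-((t % 0.8) - 0.4) ** 2 / (2 * 0.01 ** 2))
# 50 Hz mains noise; its amplitude and phase are unknown to the discriminant.
noise = 200.0 * np.sin(2 * np.pi * 50.0 * t + 0.7)
noisy = clean + noise

# Assumed inputs: a 50 Hz sine and cosine, so two weights encode amplitude and phase.
X = np.column_stack([np.sin(2 * np.pi * 50.0 * t),
                     np.cos(2 * np.pi * 50.0 * t)])

w = np.zeros(2)                 # no bias weight, since the noise has zero mean
eta = 0.001                     # learning rate chosen for this toy problem
for _ in range(30):             # batch-mode epochs
    z = X @ w                   # discriminant output = current noise estimate
    w -= eta * (X.T @ (z - noisy))   # sum of the per-sample LMS increments

noise_estimate = X @ w
filtered = noisy - noise_estimate    # what remains is the ECG estimate
amplitude = np.hypot(w[0], w[1])     # recovered noise amplitude
phase = np.arctan2(w[1], w[0])       # recovered noise phase angle (rad)
```

Batch epochs are used here only to keep the sketch deterministic; an adaptive filter would more typically apply the same increment sample by sample as the signal arrives.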
Since there are two parameters to adjust (amplitude and phase), we will then use
