5.5 Multi-Layer Perceptrons 181
Let us now consider a regression application of multi-layer perceptrons. For this
purpose we use the Foetal Weight dataset, with the aim of predicting the foetal
weight of 414 newborns based on measurements obtained from echographic
examination of the foetus. Using Statistica's intelligent problem solver, an
MLP3:6:1 neural net was found with good performance, using features BPD, AP,
FL as inputs. This neural net was trained with the back-propagation algorithm
using 100 epochs with learning rate 0.02 and momentum 0.3. In one run the RMS
errors (see definition in 5.6.1) obtained were 284.4 g for the training
set (207 cases), 275.4 g for the verification set (103 cases) and 289.8 g for the test
set (104 cases). High correlations (about 86%) between true and predicted values
were found for the three sets.
The proximity of the error figures and correlation values across several runs is an
indication of convergence without over- or under-fitting, and of the adequacy of the
set sizes. The regression solution does indeed perform quite well, as shown in Figure
5.27, where the 414 cases are sorted by increasing value.
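The setup described above can be sketched in code. The following is a minimal illustration only: the Foetal Weight data are not reproduced here, so synthetic stand-ins for the BPD, AP and FL features are generated, and scikit-learn's MLPRegressor is used in place of Statistica's tool; the 3:6:1 architecture, learning rate, momentum, epoch count and 207/103/104 split follow the text.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 414
X = rng.normal(size=(n, 3))                  # synthetic stand-ins for BPD, AP, FL
w_true = np.array([300.0, 150.0, 200.0])     # arbitrary synthetic relation
y = 3000.0 + X @ w_true + rng.normal(scale=280.0, size=n)  # "foetal weight" (g)

# 207 / 103 / 104 split, mirroring the training / verification / test sets
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, train_size=207, random_state=0)
X_ver, X_te, y_ver, y_te = train_test_split(X_rest, y_rest, train_size=103,
                                            random_state=0)

# One hidden layer of 6 units (MLP3:6:1), trained by back-propagation (SGD)
# with the quoted learning rate and momentum, for 100 epochs.
net = MLPRegressor(hidden_layer_sizes=(6,), solver="sgd",
                   learning_rate_init=0.02, momentum=0.3,
                   max_iter=100, random_state=0)
net.fit(X_tr, y_tr)

def rmse(y_true, y_pred):
    """Root-mean-square error, as defined in section 5.6.1."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

for name, Xs, ys in [("train", X_tr, y_tr), ("verify", X_ver, y_ver),
                     ("test", X_te, y_te)]:
    print(name, round(rmse(ys, net.predict(Xs)), 1))
```

Comparable RMS errors across the three sets, as in the text, is the sign to look for; the actual numbers here depend entirely on the synthetic data.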
5.5.3 Time Series
One particularly interesting type of regression problem with multiple practical
applications is time series forecasting. In this type of problem one has a sequence
of data and wishes to forecast the value of one or more of its variables, some steps
ahead. The sequence of data is typically a time series, such as daily weather
parameters, daily river flow or share values in the stock exchange market, but can
also be any other type of sequence, temporal or not.
Recurrent networks are especially appropriate for this kind of application. A
typical problem is that of estimating the value of one variable z one step ahead,
i.e., at time t+1, z(t+1), based on its present value and on the values of other
influencing variables x(t), called external variables:
z(t + 1) = f(z(t), x(t)), or equivalently z(t) = f(z(t − 1), x(t − 1)). (5.27)
This autoregressive estimation can be performed using a recurrent multi-layer
perceptron with a loop feeding back the output z to the input vector, with a one-unit
delay. Of course one can also use a k-unit delay, then having a forecast of z(t + k),
k steps ahead. It is also possible to have several network outputs fed back to the
inputs in multi-variable forecasting.
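A minimal numerical sketch of this one-step-ahead autoregressive scheme follows. To keep it short, a plain linear model stands in for f (a recurrent MLP would replace the linear fit in practice), and the series z, driven by a single external variable x, is synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
T = 200
x = rng.normal(size=T)                    # external variable x(t)
z = np.zeros(T)
for t in range(1, T):                     # synthetic series driven by x
    z[t] = 0.8 * z[t - 1] + 0.5 * x[t - 1] + rng.normal(scale=0.1)

# Training pairs for equation (5.27): inputs (z(t), x(t)), target z(t+1)
inputs = np.column_stack([z[:-1], x[:-1]])
targets = z[1:]
f = LinearRegression().fit(inputs, targets)

# One-step-ahead forecast from the last observed values
z_next = f.predict([[z[-1], x[-1]]])[0]
print("forecast z(T+1):", round(z_next, 3))
```

For a k-step forecast one would either retrain with target z[k:], or apply the one-step model recursively, feeding each prediction back as the next input, which is exactly the feedback loop of the recurrent network described above.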
Recurrent networks are trained in the same way as other networks. However, the
feedback loops, as is well known from control theory, may sometimes cause an
oscillation effect with rapid divergence of the neural net weights. This divergent
behaviour imposes low learning rates and also a limitation on the number of
feedback loops. In order to choose the most appropriate neural net for time series
forecasting, namely to decide which external variables can be of use, it is
recommended to first study the normal regression solution, i.e., to regress z on x.
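The recommended preliminary step can be sketched as a simple screening of candidate external variables by their regression fit. The data and variable names below are purely illustrative: x1 is constructed to be informative about z and x2 is pure noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
T = 300
x1 = rng.normal(size=T)                   # candidate external variable (informative)
x2 = rng.normal(size=T)                   # candidate external variable (noise)
z = 2.0 * x1 + rng.normal(scale=0.5, size=T)

# Regress z on each candidate; a high R^2 suggests a useful external
# variable for the subsequent recurrent-network forecaster.
scores = {}
for name, x in [("x1", x1), ("x2", x2)]:
    X = x.reshape(-1, 1)
    scores[name] = LinearRegression().fit(X, z).score(X, z)
    print(name, round(scores[name], 2))
```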
As a first example of time series forecast we will consider the Stock Exchange
data (see Appendix A), and set as our task the prediction of firm SONAE share