
5.5 Multi-Layer Perceptrons   181


  Let us now consider a regression application of multi-layer perceptrons. For this purpose we use the Foetal Weight dataset, with the aim of predicting the foetal weight of 414 newborns based on measurements obtained from echographic examination of the foetus. Using Statistica's Intelligent Problem Solver, an MLP3:6:1 neural net with good performance was found, using features BPD, AP and FL as inputs. This neural net was trained with the back-propagation algorithm for 100 epochs with learning rate 0.02 and momentum 0.3. In one run the RMS errors (see definition in section 5.6.1) obtained were 284.4 g for the training set (207 cases), 275.4 g for the verification set (103 cases) and 289.8 g for the test set (104 cases), respectively. High correlations (about 86%) between true and predicted values were found for all three sets.
  The proximity of the error figures and correlation values across several runs is an indication of convergence without over- or under-fitting, and of the adequacy of the set sizes. The regression solution does indeed perform quite well, as shown in Figure 5.27, where the 414 cases are sorted by increasing value.
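  The training setup just described can be sketched in code. The sketch below is illustrative only: the Foetal Weight dataset is not reproduced here, so a synthetic surrogate stands in for the three inputs (BPD, AP, FL) and the target weight, while the architecture (3:6:1), learning rate (0.02), momentum (0.3), epoch count (100) and set sizes (207/103/104) follow the text.

```python
# Sketch of an MLP 3:6:1 trained with back-propagation, as in the
# Foetal Weight experiment. Data below are a synthetic surrogate.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical surrogate data: 414 cases, 3 features, one target.
X = rng.normal(size=(414, 3))
y = (X @ np.array([0.5, 0.3, 0.2]) + 0.1 * rng.normal(size=414)).reshape(-1, 1)

# Split into training (207), verification (103) and test (104) sets.
X_tr, X_vr, X_te = X[:207], X[207:310], X[310:]
y_tr, y_vr, y_te = y[:207], y[207:310], y[310:]

# 3:6:1 architecture: 6 sigmoid hidden units, one linear output.
W1 = rng.normal(scale=0.5, size=(3, 6)); b1 = np.zeros(6)
W2 = rng.normal(scale=0.5, size=(6, 1)); b2 = np.zeros(1)
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
eta, mu = 0.02, 0.3                      # learning rate and momentum

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

for epoch in range(100):
    h = sigmoid(X_tr @ W1 + b1)          # hidden activations
    out = h @ W2 + b2                    # linear output
    err = out - y_tr
    # Back-propagate the squared-error gradient.
    g2 = h.T @ err / len(X_tr)
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * h * (1 - h)      # sigmoid derivative h(1-h)
    g1 = X_tr.T @ dh / len(X_tr)
    gb1 = dh.mean(axis=0)
    # Gradient descent with momentum.
    vW2 = mu * vW2 - eta * g2;  W2 += vW2
    vb2 = mu * vb2 - eta * gb2; b2 += vb2
    vW1 = mu * vW1 - eta * g1;  W1 += vW1
    vb1 = mu * vb1 - eta * gb1; b1 += vb1

def rms(Xs, ys):
    """RMS error of the trained net on a given set."""
    pred = sigmoid(Xs @ W1 + b1) @ W2 + b2
    return np.sqrt(np.mean((pred - ys) ** 2))

print("RMS train/verification/test:",
      rms(X_tr, y_tr), rms(X_vr, y_vr), rms(X_te, y_te))
```

As in the text, comparing the three RMS figures is the quick check for over- or under-fitting: they should stay close to one another across runs.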


                              5.5.3 Time Series

                              One  particularly  interesting  type  of  regression  problem  with  multiple  practical
                              applications is time series forecasting. In this type of problem one has a sequence
                              of data and wishes to forecast the value of one or more of its variables, some steps
                              ahead.  The  sequence  of  data  is  typically  a  time  series,  such  as daily  weather
                              parameters, daily river flow or share values in the stock exchange market, but can
                              also be any other type of sequence, temporal or not.
  Recurrent networks are especially appropriate for this kind of application. A typical problem is that of estimating the value of one variable z one step ahead, i.e., at time t+1, z(t+1), based on its present value and the values of other influencing variables x(t), called external variables:

   z(t + 1) = f(z(t), x(t)),  or equivalently  z(t) = f(z(t - 1), x(t - 1)).   (5-27)

  This autoregressive estimation can be performed using a recurrent multi-layer perceptron with a loop feeding the output z back to the input vector with a one-unit delay. Of course, one can also use a k-unit delay, then obtaining a forecast of z(t + k), k steps ahead. It is also possible to have several network outputs fed back to the inputs for multi-variable forecasting.
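  The output-feedback scheme of formula (5-27) can be sketched as follows. For brevity a linear predictor, fitted by least squares, stands in for the trained recurrent MLP f; the series z and external variable x below are synthetic illustrations.

```python
# Sketch of autoregressive one-step-ahead forecasting with output
# feedback, as in z(t+1) = f(z(t), x(t)); a linear f stands in for
# the recurrent MLP, and the data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
T = 200
x = np.sin(np.arange(T) / 10.0)          # external variable x(t)
z = np.zeros(T)
for t in range(T - 1):                   # z(t+1) = f(z(t), x(t)) + noise
    z[t + 1] = 0.8 * z[t] + 0.5 * x[t] + 0.05 * rng.normal()

# Fit f by least squares on the pairs (z(t), x(t)) -> z(t+1).
A = np.column_stack([z[:-1], x[:-1], np.ones(T - 1)])
coef, *_ = np.linalg.lstsq(A, z[1:], rcond=None)
f = lambda zt, xt: coef @ np.array([zt, xt, 1.0])

def forecast(z0, xs):
    """k-step-ahead forecast: each prediction is fed back to the
    input vector with a one-unit delay."""
    preds, zt = [], z0
    for xt in xs:
        zt = f(zt, xt)                   # output looped back as next input
        preds.append(zt)
    return preds

preds = forecast(z[150], x[150:155])     # 5 steps ahead from t = 150
```

Note how the same feedback loop yields a k-step-ahead forecast simply by iterating: each predicted z replaces the measured value in the next input vector, which is also why forecast errors accumulate with k.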
  Recurrent networks are trained in the same way as other networks. However, the feedback loops, as is well known from control theory, may sometimes cause an oscillation effect with rapid divergence of the neural net weights. This divergent behaviour imposes low learning rates and also a limitation on the number of feedback loops. In order to choose the most appropriate neural net for time series forecasting, namely which external variables can be of use, it is recommended to first study the normal regression solution, i.e., to regress z based on x.
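  The recommended preliminary step can be sketched as below: regress z on the candidate external variables and inspect how much each contributes before committing to a recurrent architecture. The data and variable names here are hypothetical illustrations.

```python
# Sketch of the preliminary variable-selection step: ordinary
# regression of z on candidate external variables. Synthetic data.
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)                  # informative candidate
x2 = rng.normal(size=n)                  # uninformative candidate
z = 2.0 * x1 + 0.1 * rng.normal(size=n)

# Ordinary least-squares regression of z on [x1, x2].
X = np.column_stack([x1, x2, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)

# Correlation of each candidate with z guides the choice of inputs.
r1 = np.corrcoef(x1, z)[0, 1]
r2 = np.corrcoef(x2, z)[0, 1]
print(f"corr(x1, z) = {r1:.2f}, corr(x2, z) = {r2:.2f}")
```

A candidate with a negligible regression coefficient and correlation, like x2 here, can usually be dropped from the recurrent net's input vector, which also helps keep the number of feedback paths small.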
  As a first example of time series forecasting we will consider the Stock Exchange data (see Appendix A), and set as our task the prediction of firm SONAE share