Page 31 - Biosystems Engineering

gradient of the error function so as to reduce it. The forward and backward passes are repeated until a prespecified stopping criterion is met or the error function has been reduced sufficiently.
A number of methods have been proposed to improve on the steepest-gradient (steepest-descent) method described earlier, which uses only the first derivatives of the error function with respect to the synaptic weights. Newton's method speeds up training by also employing the second derivatives of the error function with respect to the synaptic weights. The Gauss–Newton method is designed to approach second-order training speed without explicitly calculating the second derivatives. The Levenberg–Marquardt learning algorithm combines the standard gradient technique with the Gauss–Newton method, speeding up the learning process and producing enhanced learning performance.
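The steepest-descent update described above can be sketched for a single linear neuron; the network, data, learning rate, and stopping tolerance below are illustrative assumptions, not values from the text:

```python
import numpy as np

def train_steepest_descent(X, y, lr=0.1, epochs=200, tol=1e-6):
    """Illustrative steepest-descent training of one linear neuron."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=X.shape[1])       # synaptic weights
    E = np.inf
    for _ in range(epochs):
        y_hat = X @ w                     # forward pass
        err = y_hat - y
        E = 0.5 * np.mean(err ** 2)       # error function (mean squared error)
        if E < tol:                       # prespecified stopping criterion
            break
        grad = X.T @ err / len(y)         # dE/dw, from the backward pass
        w -= lr * grad                    # step against the gradient
    return w, E

# Hypothetical data whose exact solution is w = [2, -1]
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, -1.0])
w, E = train_steepest_descent(X, y)
```

Newton, Gauss–Newton, and Levenberg–Marquardt differ from this sketch only in how the step is computed from the gradient (and curvature), not in the overall forward/backward loop.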
An example of an unsupervised learning paradigm is the Kohonen learning rule (Kohonen 2001), which is used for training a self-organizing map (SOM). The design of an SOM starts with defining a geometric configuration for the partitions on a one- or two-dimensional grid. Random weight vectors are then assigned to each partition. During training, an input pattern (input vector) is picked at random, and the weight vector closest to it is identified. The identified weight vector and its neighbors are adjusted to become more similar to the input vector. This process is repeated until the weight vectors converge. During operation, the SOM maps input patterns to the partitions whose reference vectors they most resemble.
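The SOM training loop above can be sketched for a one-dimensional grid; the grid size, learning-rate schedule, and Gaussian neighborhood function are illustrative assumptions:

```python
import numpy as np

def train_som(data, n_units=10, epochs=50, lr0=0.5, radius0=3.0, seed=0):
    """Illustrative Kohonen training of a 1-D SOM."""
    rng = np.random.default_rng(seed)
    # Random weight vectors assigned to each grid partition
    W = rng.uniform(data.min(), data.max(), size=(n_units, data.shape[1]))
    grid = np.arange(n_units)
    n_steps, t = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.permutation(data):           # pick input patterns randomly
            lr = lr0 * (1 - t / n_steps)          # decaying learning rate
            radius = max(radius0 * (1 - t / n_steps), 0.5)
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))  # closest weight vector
            # Gaussian neighborhood around the best-matching unit
            h = np.exp(-((grid - bmu) ** 2) / (2 * radius ** 2))
            W += lr * h[:, None] * (x - W)        # pull BMU and neighbors toward x
            t += 1
    return W

def map_pattern(W, x):
    """Operation phase: map an input to its most similar reference vector."""
    return int(np.argmin(np.linalg.norm(W - x, axis=1)))
```

After training on data containing two well-separated clusters, inputs from different clusters map to different grid partitions.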
Model evaluation is performed after learning is completed to establish the adequacy, or to detect the inadequacy, of the machine learning model. Inadequacy can arise from an inappropriate selection of network topology, too few or too many neurons, insufficient training, or overtraining. Incorrect input node assignments, noisy data, errors in the program code, or several other effects may also cause a poor fit. The aim of model evaluation is to ensure that the model fit is correct, that the model satisfies the desired requirements, and that it serves as a general model. A general model is one whose input–output relationships, derived from the training dataset, apply equally well to new data from the same problem (previously unseen test data not included in the training set). The main goal of machine learning–based modeling is thus the generalization to new data of the relationships learned on the training set.
Various methods have been used to test the generalization capability of a model. These include the k-fold cross-validation, bootstrapping, and holdout methods. In k-fold cross-validation, we divide the training dataset into k subsets of (approximately) equal size. We train the model k times, each time leaving out one of the subsets from training and using only that omitted subset to compute the prediction accuracy. If k equals the sample size, then the method is called "leave-one-out" cross-validation. In the leave-one-out method, one sample is selected as a validation sample and the model is trained using the remaining dataset.
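The k-fold procedure can be sketched as follows; the linear least-squares "model" is an illustrative stand-in for any fit/predict learner, and setting k equal to the sample size yields leave-one-out cross-validation:

```python
import numpy as np

def k_fold_cv(X, y, k=5, seed=0):
    """Illustrative k-fold cross-validation returning mean squared prediction error."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)       # k subsets of roughly equal size
    scores = []
    for i in range(k):
        test_idx = folds[i]              # the omitted subset
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Train on the remaining k-1 subsets (least-squares stand-in model)
        w, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
        pred = X[test_idx] @ w
        scores.append(np.mean((pred - y[test_idx]) ** 2))
    return float(np.mean(scores))

# Hypothetical noiseless linear data: y = 3 + 2x
X = np.column_stack([np.ones(20), np.arange(20.0)])
y = 3.0 + 2.0 * np.arange(20.0)
cv_error = k_fold_cv(X, y, k=5)
loo_error = k_fold_cv(X, y, k=20)        # k = sample size: leave-one-out
```

Because the stand-in data are noiseless and linear, both cross-validation errors are essentially zero here; with real data they estimate the model's error on unseen samples.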