Page 31 - Biosystems Engineering
gradient of the error function so as to reduce it. The forward
and backward passes are repeated until a prespecified stopping crite-
rion is met or the error function has been reduced sufficiently.
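As a minimal sketch of this iterative scheme, consider a NumPy-based loop over a hypothetical quadratic error function (the function and parameter names here are illustrative, not from the text): each iteration evaluates the gradient, checks the stopping criterion, and steps opposite the gradient.

```python
import numpy as np

def train_steepest_descent(grad, w0, lr=0.1, tol=1e-6, max_iter=1000):
    """Repeat gradient steps until a prespecified stopping criterion is met."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad(w)                  # gradient of the error w.r.t. the weights
        if np.linalg.norm(g) < tol:  # stopping criterion
            break
        w = w - lr * g               # step along the negative gradient
    return w

# Hypothetical error E(w) = ||w - t||^2 with target t; its gradient is 2(w - t).
target = np.array([1.0, -2.0])
w_fit = train_steepest_descent(lambda w: 2.0 * (w - target), np.zeros(2))
```

The learning rate `lr` and tolerance `tol` are assumed values; in practice both are tuned to the problem.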
A number of methods have been proposed to improve the performance
of the steepest-descent method described earlier, which is based on
the first derivatives of the error function with respect to the synaptic
weights. Newton’s method is used to speed up training by employing
second derivatives of the error function with respect to the synaptic
weights. The Gauss–Newton method is designed to approach second-
order training speed without calculating the second derivatives. The
Levenberg–Marquardt learning algorithm speeds up the learning pro-
cess and produces enhanced learning performance by combining the
standard gradient technique with the Gauss–Newton method.
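A single Levenberg–Marquardt update can be sketched as follows (a hypothetical linear least-squares problem; the function names and the damping value `lam` are illustrative). The damping term blends the Gauss–Newton step (small `lam`) with a gradient-descent-like step (large `lam`), which is the combination described above.

```python
import numpy as np

def levenberg_marquardt_step(w, residual, model_jacobian, lam):
    """One LM update: solve (J^T J + lam*I) dw = J^T r and step by dw."""
    r = residual(w)                          # residuals at current weights
    J = model_jacobian(w)                    # Jacobian of the model outputs
    A = J.T @ J + lam * np.eye(len(w))       # damped Gauss-Newton matrix
    return w + np.linalg.solve(A, J.T @ r)

# Hypothetical data lying exactly on y = 2x + 1; fit w[0]*x + w[1].
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
residual = lambda w: y - (w[0] * x + w[1])
model_jacobian = lambda w: np.column_stack([x, np.ones_like(x)])

w = np.zeros(2)
for _ in range(3):
    w = levenberg_marquardt_step(w, residual, model_jacobian, lam=1e-6)
```

Because the model is linear, a near-zero `lam` recovers the Gauss–Newton (here, exact least-squares) solution in essentially one step; nonlinear problems additionally adapt `lam` between iterations.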
An example of an unsupervised learning paradigm is the Kohonen
learning rule (Kohonen 2001), which is used for training a self-organizing
map (SOM). The design of an SOM starts with defining a geometric
configuration for the partitions in a one- or two-dimensional grid. Then,
random weight vectors are assigned to each partition. During training,
an input pattern (input vector) is picked at random. The weight vector
closest to this input vector is identified, and the identified weight
vector and its neighbors are adjusted to become more similar to the
input vector. This process
is repeated until the weight vectors converge. During operation, SOM
maps input patterns to the relevant partitions based on the reference
vectors to which they are most similar.
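The training loop just described can be sketched for a one-dimensional grid as follows (a NumPy sketch under assumed settings: the grid size, learning-rate and neighborhood-radius schedules, and the toy two-cluster dataset are all illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, n_units=4, epochs=50, lr0=0.5, radius0=1.0):
    """Kohonen rule on a 1-D grid: pull the best-matching unit and its
    grid neighbors toward each randomly chosen input vector."""
    weights = rng.random((n_units, data.shape[1]))  # random reference vectors
    grid = np.arange(n_units)                       # 1-D grid positions
    for t in range(epochs):
        lr = lr0 * (1.0 - t / epochs)               # decaying learning rate
        radius = max(radius0 * (1.0 - t / epochs), 0.2)
        for x in rng.permutation(data):             # inputs in random order
            bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
            h = np.exp(-((grid - bmu) ** 2) / (2.0 * radius ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

# Toy data in two hypothetical clusters near (0, 0) and (1, 1).
data = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
som_weights = train_som(data)
```

During operation, a new pattern is mapped to the partition whose reference vector (row of `som_weights`) it is closest to, exactly as in the text.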
Model evaluation is performed after learning is completed to prove
the adequacy or to detect the inadequacy of the machine learning
model. The latter could arise from an inappropriate selection of net-
work topology, from too few or too many neurons, or from insufficient
training or overtraining. Incorrect input node assignments, noisy
data, error in the program code, or several other effects may also
cause a poor fit. The aim of model evaluation is to ensure that the
model fit is correct, that the model satisfies the desired requirements,
and that it serves as a general model. A general model is one whose
input–output relationships, derived from the training dataset, apply
equally well to new data from the same problem (previously unseen
test data) that were not included in training. The main goal of
machine learning–based modeling is thus the generalization to new
data of the relationships learned on the training set.
Various methods have been used to test the generalization capabil-
ity of a model. These include the k-fold cross-validation, bootstrapping,
and holdout methods. In k-fold cross-validation, we divide the training
dataset into k subsets of (approximately) equal size. We train the model
k times, each time leaving out one of the subsets from training, but using
only the omitted subset to compute the prediction accuracy. If k equals
the sample size, then the method is called "leave-one-out" cross-
validation. In the leave-one-out method, one sample is selected as a
validation sample and the model is trained using the remaining dataset.
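The partitioning step of k-fold cross-validation can be sketched as follows (a minimal helper under assumed names; the model-training call itself is omitted): the indices are split into k roughly equal folds, and each fold is held out once for validation while the rest are used for training.

```python
import numpy as np

def k_fold_splits(n_samples, k):
    """Yield (train, validation) index arrays for k-fold cross-validation;
    each of the k (approximately) equal folds is held out exactly once."""
    folds = np.array_split(np.arange(n_samples), k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

splits = list(k_fold_splits(10, 5))  # 5 folds of 2 samples each
```

Setting `k` equal to the number of samples reproduces the leave-one-out scheme described above: each validation set then contains exactly one sample.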