adaptive algorithm based on the social metaphor of flocking birds (or schooling fish or swarming insects). In PSO, a population of individuals adapts through a stochastic search of successful regions of the search space, influenced by each individual's own success as well as that of its neighbors. Individual particles move stochastically in the direction of their own previous best position, as well as the best position discovered by the entire swarm. Alternatively, a neighborhood approach can be used: instead of moving toward the best position discovered by the entire swarm, each particle moves toward the best position discovered among a localized group of particles, termed the "neighborhood." Because the change in a particle's trajectory is based on the particle's own best position, as well as the global (or neighborhood) best position, the essence of the PSO algorithm is that each particle continuously focuses and refocuses its search within these two regions. Each particle in the swarm represents a candidate solution to the optimization problem and is evaluated at each update by a performance function. PSO is a simple algorithm that has been shown to perform well for optimizing a wide range of functions, often locating optima in difficult multimodal search spaces faster than traditional optimization techniques. Detailed information on swarm intelligence and the PSO algorithm can be found in Clerc and Kennedy (2002) and Engelbrecht (2003).
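As a concrete illustration, the following is a minimal sketch of the global-best PSO update just described. The objective function, swarm size, and coefficient values are illustrative assumptions rather than prescriptions from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy performance function to minimize (illustrative choice)."""
    return np.sum(x ** 2)

# Swarm settings -- typical illustrative values, not taken from the text.
n_particles, dim, iters = 30, 5, 200
w, c1, c2 = 0.729, 1.494, 1.494  # inertia and acceleration coefficients

pos = rng.uniform(-5, 5, (n_particles, dim))    # particle positions
vel = np.zeros((n_particles, dim))              # particle velocities
pbest = pos.copy()                              # each particle's own best position
pbest_val = np.array([sphere(p) for p in pos])  # fitness at personal bests
gbest = pbest[np.argmin(pbest_val)].copy()      # best position found by the swarm

for _ in range(iters):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    # Stochastic pull toward the personal best and the global best.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([sphere(p) for p in pos])
    improved = vals < pbest_val                 # update personal bests
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    gbest = pbest[np.argmin(pbest_val)].copy()  # update global best

print("best value found:", sphere(gbest))
```

In the neighborhood (local-best) variant described above, the single `gbest` would be replaced by a per-particle best taken over that particle's neighborhood, for example a ring topology of adjacent particles.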
1.2.6 Steps in Developing Machine Learning Models
Assuming that data preprocessing is performed, the steps for building a machine learning model (e.g., a classifier) involve model structure selection, learning, and model evaluation.
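A minimal sketch of these three steps is shown below; the use of scikit-learn and of a small multilayer perceptron classifier is an assumed illustration, as the text does not prescribe a particular toolkit.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)  # assumes preprocessing is already done
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: model structure selection -- architecture and activation chosen here.
model = MLPClassifier(hidden_layer_sizes=(10,), activation="tanh",
                      max_iter=2000, random_state=0)

# Step 2: learning -- fit the connection weights to the training data.
model.fit(X_tr, y_tr)

# Step 3: model evaluation -- score the classifier on held-out data.
print("test accuracy:", accuracy_score(y_te, model.predict(X_te)))
```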
Model structure selection refers to the choice of a machine learning paradigm. It includes the choice of a neural network architecture, fuzzy rules, membership functions, fuzzy operators, genetic operators, coding scheme, and kernels. The selection of a neural network architecture includes choosing the activation functions, the number of layers, the number of neurons in each layer, and the interconnection of neurons and layers. Too many neurons degrade the effectiveness of the model, because the large number of connection weights may cause overfitting and a loss of generalization; too few hidden neurons may not capture the full complexity of the data. Many theories have been suggested for finding the optimal number of neurons in the hidden layer (Moody 1992; Amari 1995; Maass 1995). In practice, however, most users employ a trial-and-error method in which neural network training starts with a small number of hidden neurons, and additional neurons are gradually added until some performance goal is satisfied.
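The trial-and-error procedure can be sketched as follows. The growth step, the stopping goal of 0.95 validation accuracy, and the scikit-learn classifier are illustrative assumptions, not values given in the text.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

goal = 0.95                       # performance goal (assumed value)
for n_hidden in range(2, 31, 2):  # start small, add neurons gradually
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,),
                        max_iter=2000, random_state=0)
    net.fit(X_tr, y_tr)
    score = net.score(X_va, y_va)
    print(f"{n_hidden:2d} hidden neurons -> validation accuracy {score:.3f}")
    if score >= goal:             # stop once the goal is satisfied
        break
```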
In fuzzy logic–based modeling, model structure selection includes the choice of (1) the shapes of the membership functions (trapezoidal, Gaussian, etc.) of the fuzzy sets; (2) a qualitative set of rules that can model the underlying process; and (3) fuzzy operators to handle rule conjunctions