Let us clarify this with an example. Assume we start with data z_n
uniformly distributed in a square in a two-dimensional measurement
space, and we want to map this data into a one-dimensional space.
Therefore, K = 15 neurons are defined. These neurons are ordered, such
that neuron j - 1 is the left neighbour of neuron j and neuron j + 1 is the
right neighbour. In the weighting function σ = 1 is used, and in the update
rule η = 0.01; these are not changed during the iterations. The neurons
have to be placed as objects in the feature space such that they represent
the data as well as possible. Listing 7.8 shows an implementation for
training a one-dimensional map in PRTools.
Listing 7.8
PRTools code for training and plotting a self-organizing map.
z = rand(100,2);                       % Generate the data set z
w = som(z,15);                         % Train a 1D SOM and show it
figure; clf; scatterd(z); plotsom(w);
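The som routine hides the actual training steps. The following is a minimal MATLAB sketch of the update loop described above (it assumes a recent MATLAB with implicit array expansion and is not the PRTools implementation): for every sample the winning neuron is found, and all neurons are pulled towards the sample with a weight that decreases with their distance to the winner on the grid. Apart from K = 15, σ = 1, η = 0.01 and the 100 iterations mentioned in the text, the details are illustrative.

z = rand(100,2);                       % data set: 100 samples in 2D
K = 15;                                % neurons on the one-dimensional grid
idx = randperm(size(z,1));
w = z(idx(1:K),:);                     % initialize by picking K objects from z
eta = 0.01;                            % learning rate (kept fixed)
sigma = 1;                             % width of the weighting function h
for it = 1:100                         % iterations over the data set
  for n = randperm(size(z,1))          % present the samples in random order
    d = sum((w - z(n,:)).^2, 2);       % squared distances to all neurons
    [~, c] = min(d);                   % index of the winning neuron
    h = exp(-((1:K)' - c).^2/(2*sigma^2));  % weighting on the grid
    w = w + eta*h.*(z(n,:) - w);       % pull all neurons towards z(n,:)
  end
end

The rows of w can then be plotted on top of a scatter plot of z to inspect the resulting map.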
In Figure 7.10 four scatter plots of this data set with the SOM (K = 15)
are shown. In the left subplot, the SOM is randomly initialized by
picking K objects from the data set. The lines between the neurons
indicate the neighbouring relationships between the neurons. Clearly,
neighbouring neurons in feature space are not neighbouring in the grid.
In the fourth subplot it can be seen that after 100 iterations over the data
set, the one-dimensional grid has organized itself over the square. This
solution does not change in the next 500 iterations.
With one exception, the neighbouring neurons in the measurement
space are also neighbouring neurons in the grid. Only where the one-
dimensional string crosses itself do neurons that are far apart in the grid
become close neighbours in feature space. This local optimum, in which the
map has not unfolded completely in the measurement space, is often encountered in
SOMs. It is very hard to get out of this local optimum. The solution
would be to restart the training with another random initialization.
Many of the unfolding problems can be avoided, and the convergence sped up,
by adjusting the learning parameter η(i) and the characteristic
width σ(i) of the weighting function h(|k(z_n) - j|) during the iterations.
Often, the following functional forms are used:
η(i) = η(0) exp(-i/τ1)
σ(i) = σ(0) exp(-i/τ2)          (7.35)
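The effect of these schedules is easily visualized. The sketch below, with illustrative starting values η(0), σ(0) and time constants τ1, τ2 that are not taken from the text, plots both decaying parameters against the iteration number i:

eta0 = 0.1; sigma0 = 3;                % example starting values eta(0), sigma(0)
tau1 = 200; tau2 = 200;                % example time constants
i = 0:1000;                            % iteration index
eta = eta0*exp(-i/tau1);               % learning rate decreases over time
sigma = sigma0*exp(-i/tau2);           % neighbourhood width decreases over time
figure; clf; plot(i, eta, i, sigma);
legend('\eta(i)','\sigma(i)'); xlabel('iteration i');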