Page 235 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
P. 235
224 UNSUPERVISED LEARNING
measure is used with q ¼ 1, the resulting mapping is also called a Sammon
mapping.
The effects of using different values of q are demonstrated in Table 7.2.
Here, for each city in the list, the nearest and the furthest cities are given.
For each pair, the true geodesic distance from Table 7.1 is shown
together with the corresponding distances in the MDS mappings.
Clearly, for short distances, q ¼ 2 is more accurate. For long distances,
q ¼þ2 is favourable. In Figure 7.3(a) to (c) the nearest cities (according
to the geodesic distances) are indicated by solid thin lines. In addition,
the nearest cities according to the MDS maps are indicated by dotted
thick lines, Clearly, if q ¼ 2, the nearest neighbours are best preserved.
This is expected because the relations between nearest neighbours are
best preserved if the local structure is best preserved.
Figure 7.3 also shows a three-dimensional MDS mapping of the world
cities. For clarity, a sphere is also included. MDS is able to find a 2D to
3D mapping such that the resulting map resembles the true geographic
position on the earth surface. The positions are not exactly on the sur-
face of a sphere because the input of the mapping is a set of geodesic
distances, while the output map measures distances according to a three-
dimensional Euclidean metric.
The MDS algorithm has the same problems any gradient-based
method has: it is slow and it cannot be guaranteed to converge to a
global optimum. Another problem the MDS algorithm in particular
suffers from is the fact that the number of distances in a data set grows
quadratically with N S . If the data sets are very large, then we need to
select a subset of the points to map in order to make the application
tractable. Finally, a major problem specific to MDS is that there is no
clear way of finding the projection y new of a new, previously unseen
point z new . The only way of making sure that the new configuration
fy , i ¼ 1, .. . N S ; y new g minimizes (7.5) is to recalculate the entire MDS
i
mapping. In practice, this is infeasible. A crude approximation is to use
triangulation: to map down to D dimensions, search for the D þ 1
nearest neighbours of z new among the training points z i . A mapping
y new for z new can then be found by preserving the distances to the
projections of these neighbours exactly. This method is fast, but inaccur-
ate: because it uses nearest neighbours, it will not give a smooth inter-
polation of originally mapped points. A different, but more complicated
solution is to use an MDS mapping to train a neural network to predict
the y for given z i . This will give smoother mappings, but brings with it
i
all the problems inherent in neural network training. An example of how
to use MDS in PRTools is given in Listing 7.2.