

measure is used with q = -1, the resulting mapping is also called a Sammon
mapping.
  The effects of using different values of q are demonstrated in Table 7.2.
Here, for each city in the list, the nearest and the furthest cities are given.
For each pair, the true geodesic distance from Table 7.1 is shown together
with the corresponding distances in the MDS mappings. Clearly, for short
distances, q = -2 is more accurate. For long distances, q = +2 is favourable.
In Figure 7.3(a) to (c) the nearest cities (according to the geodesic
distances) are indicated by solid thin lines. In addition, the nearest cities
according to the MDS maps are indicated by dotted thick lines. Clearly, if
q = -2, the nearest neighbours are best preserved. This is to be expected,
because the relations between nearest neighbours are preserved best if the
local structure is preserved best.
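  To make the role of q concrete, the MATLAB fragment below evaluates a
q-parameterized Sammon-type stress for a given low-dimensional configuration.
It is a minimal sketch: the assumed form is the common one in which the pair
errors are weighted by delta_ij^q and normalized by the sum of delta_ij^(q+2);
the exact normalization should be taken from (7.5). The function name stress_q
and its interface are illustrative and not part of PRTools.

  function E = stress_q(Delta, Y, q)
    % Delta: N x N matrix of given distances delta_ij (e.g. geodesic)
    % Y:     N x D matrix of mapped coordinates
    % q:     weighting exponent; q = -1 gives the classical Sammon stress
    N = size(Delta, 1);
    G = Y * Y';                                    % Gram matrix of the map
    D = sqrt(max(diag(G) + diag(G)' - 2 * G, 0));  % Euclidean distances d_ij
    mask = triu(ones(N), 1) > 0;                   % each pair i < j once
    dlt = Delta(mask);
    d   = D(mask);
    % assumed form of the stress: negative q emphasizes short distances,
    % positive q long distances
    E = sum(dlt.^q .* (dlt - d).^2) / sum(dlt.^(q + 2));
  end

An MDS routine minimizes this stress with respect to the coordinates Y, for
instance by gradient descent.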
              Figure 7.3 also shows a three-dimensional MDS mapping of the world
cities. For clarity, a sphere is also included. MDS is able to find a
2D-to-3D mapping such that the resulting map resembles the true geographic
positions on the earth's surface. The positions are not exactly on the
surface of a sphere because the input of the mapping is a set of geodesic
distances, while the output map measures distances according to a
three-dimensional Euclidean metric.
  The MDS algorithm has the same problems as any gradient-based method: it is
slow, and it cannot be guaranteed to converge to a global optimum. Another
problem from which the MDS algorithm in particular suffers is that the number
of distances in a data set grows quadratically with N_S. If the data set is
very large, we need to select a subset of the points to map in order to make
the application tractable. Finally, a major problem specific to MDS is that
there is no clear way of finding the projection y_new of a new, previously
unseen point z_new. The only way of making sure that the new configuration
{y_i, i = 1, ..., N_S; y_new} minimizes (7.5) is to recalculate the entire
MDS mapping. In practice, this is infeasible.
A crude approximation is to use triangulation: to map down to D dimensions,
search for the D + 1 nearest neighbours of z_new among the training points
z_i. A mapping y_new for z_new can then be found by preserving the distances
to the projections of these neighbours exactly. This method is fast but
inaccurate: because it uses only nearest neighbours, it will not give a
smooth interpolation between originally mapped points.
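  As an illustration, the triangulation step can be sketched as follows. The
names mds_triangulate and deltafun are hypothetical: Z holds the training
points, Y their MDS projections, and deltafun the dissimilarity measure used
to build the map. With D + 1 distance constraints in D dimensions, subtracting
one squared constraint from the others gives a linear system; its solution
reproduces the neighbour distances exactly when they are mutually consistent,
and a linearized approximation otherwise.

  function y_new = mds_triangulate(Z, Y, z_new, deltafun)
    % Z: N x p training points, Y: N x D projections, z_new: 1 x p new point
    [N, Dim] = size(Y);
    r = zeros(N, 1);                  % distances from z_new to all z_i
    for n = 1:N
      r(n) = deltafun(z_new, Z(n, :));
    end
    [~, idx] = sort(r);               % the D + 1 nearest neighbours
    nb  = idx(1:Dim + 1);
    Ynb = Y(nb, :);
    rnb = r(nb);
    % subtract the first squared distance constraint from the others,
    % giving Dim linear equations in the Dim unknowns of y_new
    A = 2 * (Ynb(2:end, :) - Ynb(1, :));
    b = sum(Ynb(2:end, :).^2, 2) - sum(Ynb(1, :).^2) ...
        - (rnb(2:end).^2 - rnb(1)^2);
    y_new = (A \ b)';
  end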
A different, but more complicated, solution is to use an MDS mapping to train
a neural network to predict y_i for a given z_i. This will give smoother
mappings, but brings with it all the problems inherent in neural network
training. An example of how to use MDS in PRTools is given in Listing 7.2.
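  For comparison, the neural-network variant can be sketched with standard
MATLAB tools (feedforwardnet and train from the Deep Learning Toolbox) rather
than PRTools; Z, Y and z_new are assumed to be the training vectors, their MDS
coordinates and a new point, as above. This is only an illustration of the
idea, not the book's Listing 7.2.

  % Z: N x p training vectors, Y: N x D MDS coordinates, z_new: 1 x p point
  net = feedforwardnet(10);     % small feed-forward network, 10 hidden units
  net = train(net, Z', Y');     % MATLAB networks expect samples as columns
  y_new = net(z_new')';         % smooth projection of a previously unseen point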