Page 80 -
P. 80

3.4 Dimensional Reduction   67


                          Multidimensional  scaling  is  another method  of  representing  data  in  a smaller
                        number of dimensions preserving as much as possible the similarity structure of the
                        data. Unlike factor analysis it is an iterative method. It has a broader  application
                        than the principal components analysis that we have just described, since it makes
                        no  assumptions as to the metric used for assessing similarity. It can even be used
                        with similarity measures that do not satisfy all properties listed in (2-8).
                          The basic idea is to try, in an iterative way, to position the feature vectors in the
                        reduced dimension space so that the distance or dissimilarity values among them
                        are preserved as much as possible. For this purpose, the following quadratic error
                        measure known as stress, is iteratively minimized:






                        where:

                        -  xi, x  are any arbitrary pair of distinct patterns ( i # j  );
                        -  d(xi, x ,)  is the original dissimilarity between xi and xj;
                        -  d*(x,, x ,)  is the transformed dissimilarity in the lower dimensional space;
                        -  f is a monotonic transformation function.

                          The  details  of  this  iterative  process  are  beyond  the  scope  of  this  book.  The
                        interested  reader  can  refer,  for  instance,  to  (Sammon,  1969) and  (Leeuw  and
                        Heisen, 1982).



                         NUMERIC                                                        -
                                                                                         A
                         VALUES
                                     1      2      3      4      5      6      7      8
                                     A      B      C      D     E       F      G      H   I
                                       .O   33.8   10.3   12.3   98.1   25.3   8.6   25.
                                B  33.8       .O   36.4   31.7   76.7   35.4   34.7   50.
                                C   10.3   36.4      .O  15.5   95.9   27.6   11.6   24.
                                D  12.3    31.7   15.5      .O   96.2   20.9   9.6   21.
                                E   98.1   76.7   95.9   96.2      -0  101.4  101.4  115.
                                F   25.3   35.4   27.6   20.9  101.4      .O   20.9   26.










                         Figure  3.14.  Dissimilarity  matrix  for  the  Food  dataset.  Dissimilarity  values
                         correspond to Euclidian distances.
   75   76   77   78   79   80   81   82   83   84   85