Page 179 - Statistics and Data Analysis in Geology
             Statistics and Data Analysis in Geology - Chapter 6

             [Figure 6-5: dendrogram panels a, b, and c for objects A through F; vertical scale from 1.0 to -1.0.]

             Figure 6-5. (a) Dendrogram with initial clusters, CD and AB. (b) Connection of object
                   E to initial cluster CD. (c) Final connection of two clusters AB and CDE, and
                   connection of isolated object F to CDE, completing dendrogram.

             recalculations. Again, the highest correlations in each column are shown in
             boldface type.
                 The clustering procedure is now repeated; mutually high pairs are sought out
             and clustered. In this cycle, object E joins cluster CD (Fig. 6-5b) to form cluster
             CDE. The correlations between cluster CDE and other clusters or individual objects
             such as F are again found by adding together the common elements and dividing
             by 2. This process is repeated again and again until all objects and clusters are
             joined together. The final matrix of similarities will be a 2 × 2 matrix between the
             last remaining object and everything else collected into a single cluster, as shown
             in Table 6-8. This indicates that cluster ABCDE has a resemblance of r = -0.27
             with object F. Our dendrogram can then be completed (Fig. 6-5c).
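
                 The cycle just described can be sketched in a few lines of Python with
             NumPy. This is only an illustration under stated assumptions, not the program
             used to produce the tables: the function names, the objects P through T, and
             the similarity values are all invented for the example and are not the data of
             Tables 6-7 and 6-8. Each cycle finds the mutually high pairs, merges them, and
             averages the two merged rows ("adding together the common elements and
             dividing by 2") until a 2 × 2 matrix remains.

```python
import numpy as np

def mutually_high_pairs(sim):
    """Index pairs (i, j), i < j, in which each item is the other's best match."""
    s = np.array(sim, dtype=float)    # copy, so the caller's matrix is untouched
    np.fill_diagonal(s, -np.inf)      # ignore each item's similarity with itself
    best = s.argmax(axis=1)           # best partner of every row
    return [(i, int(j)) for i, j in enumerate(best) if i < j and best[j] == i]

def merge_pair(sim, labels, i, j):
    """Replace items i and j by one cluster whose similarities are their mean."""
    keep = [k for k in range(len(labels)) if k not in (i, j)]
    averaged = (sim[i, keep] + sim[j, keep]) / 2.0   # add common elements, divide by 2
    n = len(keep)
    new = np.ones((n + 1, n + 1))                    # diagonal stays at 1.0
    new[:n, :n] = sim[np.ix_(keep, keep)]            # untouched items keep their values
    new[n, :n] = averaged                            # new cluster goes in the last row
    new[:n, n] = averaged                            # ... and the last column
    merged = "".join(sorted(labels[i] + labels[j]))
    return new, [labels[k] for k in keep] + [merged]

def cluster(sim, labels):
    """Repeat the cycle until a 2 x 2 matrix remains; also return the merge log."""
    sim = np.array(sim, dtype=float)
    labels = list(labels)
    log = []                                         # (cluster name, joining level)
    while len(labels) > 2:
        pairs = mutually_high_pairs(sim)
        # Record pairs by name first, because indices shift after every merge.
        named = [(labels[i], labels[j], float(sim[i, j])) for i, j in pairs]
        for a, b, level in named:
            sim, labels = merge_pair(sim, labels, labels.index(a), labels.index(b))
            log.append((labels[-1], level))
    return sim, labels, log

# Hypothetical similarities among five invented objects (not the book's data).
demo_labels = ["P", "Q", "R", "S", "T"]
demo_sim = [
    [ 1.0,  0.8,  0.2,  0.0, -0.4],
    [ 0.8,  1.0,  0.4, -0.2, -0.6],
    [ 0.2,  0.4,  1.0,  0.6, -0.2],
    [ 0.0, -0.2,  0.6,  1.0, -0.4],
    [-0.4, -0.6, -0.2, -0.4,  1.0],
]

final, final_labels, log = cluster(demo_sim, demo_labels)
print(final_labels)        # ['T', 'PQRS']
print(np.round(final, 2))  # 2 x 2 matrix; off-diagonal entry is -0.4
for name, level in log:
    print(name, round(level, 2))
```

             In this invented example, PQ and RS form in the first cycle, the two clusters
             join in the second, and the final 2 × 2 matrix relates the isolated object T to
             cluster PQRS. The joining levels stored in the log are the heights at which the
             corresponding stems of a dendrogram would be connected.
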
                 Clustering is an efficient way of displaying complex relationships among many
             objects. However, the process of averaging together members of a cluster and
             treating them as a single new object introduces distortions into the dendrogram.
             This distortion becomes increasingly apparent as successive levels of clusters are
             averaged together. We can evaluate the severity of this distortion by examining
             what numerical taxonomists call the matrix of cophenetic values. This is nothing
             more than a matrix of apparent correlations contained within the dendrogram. For
             example, the dendrogram in Figure 6-5 implies that the correlations between C, D,
             and E, on one hand, and A and B, on the other, are all r = 0.45. Similarly, the
             correlation between F and E is the same as the correlation between F and D, or between
             F and any of the other objects. Only the correlations between A and B and between

