
3.3 Tree Clustering   63


                      of  formula (3-1). The solution of  the cross data (Figure 3.2b) was  obtained with
                      this  rule  (available  in  the  SPSS software).  Figure  3.9 shows  the  corresponding
                      dendrogram.

                      Ward's method
                      In Ward's method the sum of the squared within-cluster distances, for the resulting
                      merged cluster, is computed:

                          $$ d(\omega_i, \omega_j) = \sum_{x \in \omega_i \cup \omega_j} \| x - m \|^2 , $$

                       where m is the centroid of the merged clusters.
                         At each step the two clusters that merge are the ones that contribute the
                       smallest increase to the overall sum of the squared within-cluster distances. This
                       method is reminiscent of the ANOVA statistical test, in the sense that it tries to
                       minimize the intra-cluster variance and therefore to maximize the cluster
                       separability. This method produces, in general, very good solutions, although it
                       tends to create clusters of smaller size.
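
                       The merging rule described above can be illustrated with a minimal sketch in
                       plain Python. The function names (centroid, within_ss, ward_increase,
                       ward_step) and the four-point data set are hypothetical, chosen only for
                       illustration; the brute-force search over all pairs is not how statistical
                       packages implement the method.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def within_ss(points):
    """Sum of squared Euclidean distances of the points to their centroid."""
    m = centroid(points)
    return sum(sum((x - c) ** 2 for x, c in zip(p, m)) for p in points)


def ward_increase(a, b):
    """Increase of the overall within-cluster sum of squares if a and b merge."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)


def ward_step(clusters):
    """Merge the pair of clusters giving the smallest increase; return the new list."""
    i, j = min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: ward_increase(clusters[ij[0]], clusters[ij[1]]),
    )
    merged = clusters[i] + clusters[j]
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]


# Hypothetical toy data: four points in the plane, each starting as its own cluster.
clusters = [[(0.0, 0.0)], [(0.0, 1.0)], [(5.0, 5.0)], [(5.0, 7.0)]]

# Repeat the merging step until two clusters remain.
while len(clusters) > 2:
    clusters = ward_step(clusters)
```

                       Under these assumptions the two nearby pairs of points are merged first, since
                       each of those merges adds far less to the overall sum of squares than any
                       merge across the two groups.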




                         [Dendrogram; axis title: Rescaled Distance Cluster Combine]

                         Figure 3.9. Dendrogram of the +Cross data clustering using the UWGMA rule.




                        3.3.2 Tree Clustering Experiments

                        As with any clustering method, when performing tree-clustering experiments it is
                        important to choose appropriate metrics and linkage rules, guided by inspection
                        of the scatter diagram of the data. Let us consider, as an illustration, the crimes
                        data, which are shown in the scatter diagram of Figure 3.6a. Euclidean or squared