Page 242 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB


            [Figure 7.6: two dendrograms, panels (a) and (b). Horizontal axis: object
            (leaf order 1, 5, 2, 4, 3, 6). Vertical axis: distance, from 0 to 2 in (a)
            and from 0 to 6 in (b).]
            Figure 7.6 Hierarchical clustering with two different clustering types.
            (a) Single-link clustering. (b) Complete-link clustering
            for single- and complete-link clustering are shown in Figure 7.6. At
            smaller distances, pairs of single objects are combined; at larger
            distances, complete clusters are merged. A large gap in the merge
            distances, as can be seen in the single-link dendrogram, indicates that
            the two clusters are far apart. Cutting the dendrogram at height 1.0 will
            then result in a ‘natural’ clustering, consisting of two clusters. In many
            practical cases, the position of the cut is not obvious, and the user has
            to guess an appropriate number of clusters.
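
The gap criterion described above can be sketched in plain MATLAB, without PRTools. The block below is only an illustration under stated assumptions: the six data points are hypothetical (they are not the data of Figure 7.6), the variable names are invented for this sketch, and the naive merging loop stands in for whatever algorithm a library routine actually uses. It records the merge heights for single- and complete-link clustering and then looks for the largest jump between consecutive single-link heights.

```matlab
% Toy data (hypothetical): two well-separated groups of three 2-D points.
z = [0.0 0.0; 0.3 0.1; 0.1 0.4; 3.0 3.0; 3.2 2.9; 2.9 3.3];
n = size(z, 1);

% Pairwise Euclidean distances, computed without any toolbox.
D = zeros(n);
for i = 1:n
    for j = 1:n
        D(i, j) = norm(z(i, :) - z(j, :));
    end
end

linkTypes = {'single', 'complete'};
heights = zeros(n - 1, numel(linkTypes));   % merge heights, one column per link type

for t = 1:numel(linkTypes)
    clusters = num2cell(1:n);               % start with one cluster per object
    for step = 1:n - 1
        best = inf; bi = 0; bj = 0;
        for a = 1:numel(clusters) - 1
            for b = a + 1:numel(clusters)
                block = D(clusters{a}, clusters{b});
                if strcmp(linkTypes{t}, 'single')
                    d = min(block(:));      % distance between the nearest members
                else
                    d = max(block(:));      % distance between the farthest members
                end
                if d < best
                    best = d; bi = a; bj = b;
                end
            end
        end
        heights(step, t) = best;            % height of this merge in the dendrogram
        clusters{bi} = [clusters{bi}, clusters{bj}];
        clusters(bj) = [];                  % bj > bi, so index bi stays valid
    end
end

% The largest jump between consecutive merge heights suggests where to cut.
[gap, k] = max(diff(heights(:, 1)));        % column 1 holds the single-link heights
cutHeight = (heights(k, 1) + heights(k + 1, 1)) / 2;
numClusters = n - k;                        % clusters remaining after k merges
fprintf('single link: gap %.2f, cut at %.2f, %d clusters\n', ...
        gap, cutHeight, numClusters);
```

For well-separated groups such as these, the largest gap falls just before the final merge, so the cut leaves two clusters. When the merge heights grow gradually instead, no single gap stands out, which is the situation in which the number of clusters has to be guessed.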
              Note that this clustering is obtained using a fixed data set. When new
            objects become available, there is no straightforward way to include them
            in an existing clustering. In these cases, the clustering has to be
            reconstructed from scratch using the complete data set.
              In PRTools, it is simple to construct a hierarchical clustering; Listing
            7.4 shows an example. Note that the clustering operates on a distance
            matrix rather than the data set. A distance matrix can be obtained with
            the function distm.

            Listing 7.4
            PRTools code for obtaining a hierarchical clustering.

            z = gendats(5);                 % Generate some data
            figure; clf; scatterd(z);       % and plot it
            dendr = hclust(distm(z),'s');   % Single link clustering
            figure; clf; plotdg(dendr);     % Plot the dendrogram
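
To make the role of the distance matrix concrete, the following sketch builds one in plain MATLAB. It assumes squared Euclidean distances, which is what the PRTools help text for distm is understood to describe; this should be checked against the installed version. The data matrix x and the variable names are hypothetical, and the code operates on an ordinary numeric array rather than a PRTools dataset.

```matlab
% Hypothetical data: five objects with two features each (one object per row).
x = [0 0; 1 0; 0 2; 3 3; 4 3];
n = size(x, 1);

% Squared Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a*b'.
sq = sum(x.^2, 2);                          % n-by-1 vector of squared norms
D = bsxfun(@plus, sq, sq') - 2 * (x * x');  % n-by-n matrix of squared distances

% Guard against round-off: no negative entries, exact zeros on the diagonal.
D = max(D, 0);
D(1:n + 1:end) = 0;

disp(D);                                    % symmetric, with a zero diagonal
```

Entry D(i,j) holds the dissimilarity between objects i and j, so the matrix is symmetric with a zero diagonal. Squaring is a monotone transformation, so for single- and complete-link clustering it leaves the order of the merges unchanged; only the heights in the dendrogram differ from those obtained with unsquared distances.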