Page 240 - Classification Parameter Estimation & State Estimation An Engg Approach Using MATLAB
P. 240
CLUSTERING 229
14000
12000
10000
8000
6000
4000
2000
0 Rio
Bangkok Beijing Tokyo Cairo London Moscow Honolulu Los Angeles New York Capetown Santiago Melbourne
Figure 7.4 Hierarchical clustering of the data in Table 7.1
Given a set of N S objects z i to be clustered, and an N S N S distance
matrix between these objects, the hierarchical clustering involves the
following steps:
Algorithm 7.2: Hierarchical clustering
1. Assign each object to its own cluster, resulting in N S clusters, each
containing just one object. The initial distances between all clusters
are therefore just the distances between all objects.
2. Find the closest pair of clusters and merge them into a single cluster,
so that the number of clusters reduces by one.
3. Compute the distance between the new cluster and each of the old
clusters, where the distance between two clusters can be defined in a
number of ways (see below).
4. Repeat steps 2 and 3 until all items are clustered into a single cluster
of size N S , or until a predefined number of clusters K is achieved.
In step 3 it is possible to compute the distances between clusters in
several different ways. We can distinguish single-link clustering, aver-
age-link clustering and complete-link clustering. In single-link clus-
tering, the distance between two clusters C i and C j is defined as the