Page 301 -
P. 301

Section 9.3  Image Segmentation by Clustering Pixels  269


                            performance depends on what one needs. It is possible to make some general state-
                            ments, though. The general recipe doesn’t guarantee that segments are connected,
                            which may or may not matter. If one is segmenting images to compress them,
                            then encoding the US flag as three segments (red, white and blue) might be a good
                            choice; if one is segmenting to represent objects, this is probably a poor represen-
                            tation, because it regards all white stars as a single segment. If the feature vector
                            contains a representation of the position of the pixel, the segments that result tend
                            to be “blobby,” because pixels that lie very far from the center of a segment will
                            tend to belong to other clusters. This is one way to ensure that segments are con-
                            nected. Representing color information tends to make segmenters better, because
                            in this case it’s hard to get easy images wrong (color doesn’t seem to make hard
                            images easier, though). For some applications, doing well at easy images is enough.

                     9.3.1 Basic Clustering Methods
                            There are two natural algorithms for clustering. In divisive clustering,the entire
                            data set is regarded as a cluster, and then clusters are recursively split to yield a
                            good clustering (Algorithm 9.4). In agglomerative clustering, each data item is
                            regarded as a cluster, and clusters are recursively merged to yield a good clustering
                            (Algorithm 9.3).



                            Make each point a separate cluster
                            Until the clustering is satisfactory
                                Merge the two clusters with the
                                  smallest inter-cluster distance
                            end

                                  Algorithm 9.3: Agglomerative Clustering or Clustering by Merging.




                            Construct a single cluster containing all points
                            Until the clustering is satisfactory
                                Split the cluster that yields the two
                                  components with the largest inter-cluster distance
                            end

                                     Algorithm 9.4: Divisive Clustering, or Clustering by Splitting.

                                 There are two major issues in thinking about clustering:
                               • What is a good inter-cluster distance? Agglomerative clustering uses an inter-
                                 cluster distance to fuse nearby clusters; divisive clustering uses it to split in-
                                 sufficiently coherent clusters. Even if a natural distance between data points
                                 is available (which might not be the case for vision problems), there is no
                                 canonical inter-cluster distance. Generally, one chooses a distance that seems
                                 appropriate for the data set. For example, one might choose the distance be-
   296   297   298   299   300   301   302   303   304   305   306