Page 281 -
P. 281
260 5 Segmentation
Figure 5.18 Mean-shift color image segmentation with parameters (h s ,h r ,M) = (16, 19, 40) (Comaniciu and
Meer 2002) c 2002 IEEE.
the bias–variance tradeoff, looking for parameter ranges where the number of clusters varies
slowly, optimizing some external clustering criterion, or using top-down (application domain)
knowledge (Comaniciu and Meer 2003). It is also possible to change the orientation of the
kernel in joint parameter space for applications such as spatio-temporal (video) segmentations
(Wang, Thiesson, Xu et al. 2004).
Mean shift has been applied to a number of different problems in computer vision, includ-
ing face tracking, 2D shape extraction, and texture segmentation (Comaniciu and Meer 2002),
and more recently in stereo matching (Chapter 11)(Wei and Quan 2004), non-photorealistic
rendering (Section 10.5.2)(DeCarlo and Santella 2002), and video editing (Section 10.4.5)
(Wang, Bhat, Colburn et al. 2005). Paris and Durand (2007) provide a nice review of such
applications, as well as techniques for more efficiently solving the mean-shift equations and
producing hierarchical segmentations.
5.4 Normalized cuts
While bottom-up merging techniques aggregate regions into coherent wholes and mean-shift
techniques try to find clusters of similar pixels using mode finding, the normalized cuts
technique introduced by Shi and Malik (2000) examines the affinities (similarities) between
nearby pixels and tries to separate groups that are connected by weak affinities.
Consider the simple graph shown in Figure 5.19a. The pixels in group A are all strongly
connected with high affinities, shown as thick red lines, as are the pixels in group B. The
connections between these two groups, shown as thinner blue lines, are much weaker. A
normalized cut between the two groups, shown as a dashed line, separates them into two
clusters.
The cut between two groups A and B is defined as the sum of all the weights being cut,
cut(A, B)= w ij , (5.43)
i∈A,j∈B
where the weights between two pixels (or regions) i and j measure their similarity. Using
a minimum cut as a segmentation criterion, however, does not result in reasonable clusters,
since the smallest cuts usually involve isolating a single pixel.