Page 275 -

P. 275

254 5 Segmentation

(a) (b) (c) (d)

Figure 5.15 Coarse to ﬁne node aggregation in segmentation by weighted aggregation (SWA) (Sharon, Galun,
Sharon et al. 2006) c 2006 Macmillan Publishers Ltd [Nature]: (a) original gray-level pixel grid; (b) inter-pixel
couplings, where thicker lines indicate stronger couplings; (c) after one level of coarsening, where each original
pixel is strongly coupled to one of the coarse-level nodes; (d) after two levels of coarsening.

−
where Δ = k (τ ik Δ ik )/ k (τ ik ) and τ ik is the boundary length between regions R i and
i
R k . The texture similarity is deﬁned using relative differences between histogram bins of
simple oriented Sobel ﬁlter responses. The pairwise statistics σ + and σ − are used to
local local
compute the likelihoods p ij that two regions should be merged. (See the paper by Alpert,
Galun, Basri et al. (2007) for more details.)
Merging proceeds in a hierarchical fashion inspired by algebraic multigrid techniques
(Brandt 1986; Briggs, Henson, and McCormick 2000) and previously used by Alpert, Galun,
Basri et al. (2007) in their segmentation by weighted aggregation (SWA) algorithm (Sharon,
Galun, Sharon et al. 2006), which we discuss in Section 5.4. A subset of the nodes C ⊂ V
that are (collectively) strongly coupled to all of the original nodes (regions) are used to deﬁne
the problem at a coarser scale (Figure 5.15), where strong coupling is deﬁned as

p ij
j∈C
>φ, (5.25)

p ij
j∈V
with φ usually set to 0.2. The intensity and texture similarity statistics for the coarser nodes
are recursively computed using weighted averaging, where the relative strengths (couplings)
between coarse- and ﬁne-level nodes are based on their merge probabilities p ij . This allows
the algorithm to run in essentially O(N) time, using the same kind of hierarchical aggrega-
tion operations that are used in pyramid-based ﬁltering or preconditioning algorithms. After
a segmentation has been identiﬁed at a coarser level, the exact memberships of each pixel are
computed by propagating coarse-level assignments to their ﬁner-level “children” (Sharon,
Galun, Sharon et al. 2006; Alpert, Galun, Basri et al. 2007). Figure 5.22 shows the segmen-
tations produced by this algorithm compared to other popular segmentation algorithms.

5.3 Mean shift and mode ﬁnding

Mean-shift and mode ﬁnding techniques, such as k-means and mixtures of Gaussians, model
the feature vectors associated with each pixel (e.g., color and position) as samples from an
unknown probability density function and then try to ﬁnd clusters (modes) in this distribution.
Consider the color image shown in Figure 5.16a. How would you segment this image
based on color alone? Figure 5.16b shows the distribution of pixels in L*u*v* space, which
is equivalent to what a vision algorithm that ignores spatial location would see. To make the

270 271 272 273 274 275 276 277 278 279 280