Page 282 -
P. 282

5.4 Normalized cuts                                                                    261


                                               B
                              A
                         A                B
                                   A             B                 A            B       sum
                           A                                A  assoc(A, A)  cut(A, B)   assoc(A, V )
                                                           B    cut(B, A)  assoc(B, B) assoc(B, V )
                                                         sum   assoc(A, V )  assoc(B, v)
                                     (a)                                     (b)

               Figure 5.19 Sample weighted graph and its normalized cut: (a) a small sample graph and its smallest normalized
               cut; (b) tabular form of the associations and cuts for this graph. The assoc and cut entries are computed as area
               sums of the associated weight matrix W (Figure 5.20). Normalizing the table entries by the row or column sums
               produces normalized associations and cuts Nassoc and Ncut.


                  A better measure of segmentation is the normalized cut, which is defined as
                                               cut(A, B)    cut(A, B)
                                  Ncut(A, B)=            +            ,             (5.44)
                                              assoc(A, V )  assoc(B, V )
               where assoc(A, A)=          w ij is the association (sum of all the weights) within a
                                     i∈A,j∈A
               cluster and assoc(A, V )= assoc(A, A)+ cut(A, B) is the sum of all the weights associated
               with nodes in A. Figure 5.19b shows how the cuts and associations can be thought of as area
               sums in the weight matrix W =[w ij ], where the entries of the matrix have been arranged so
               that the nodes in A come first and the nodes in B come second. Figure 5.20 shows an actual
               weight matrix for which these area sums can be computed. Dividing each of these areas by
               the corresponding row sum (the rightmost column of Figure 5.19b) results in the normalized
               cut and association values. These normalized values better reflect the fitness of a particular
               segmentation, since they look for collections of edges that are weak relative to all of the edges
               both inside and emanating from a particular region.
                  Unfortunately, computing the optimal normalized cut is NP-complete. Instead, Shi and
               Malik (2000) suggest computing a real-valued assignment of nodes to groups. Let x be the
               indicator vector where x i =+1 iff i ∈ A and x i = −1 iff i ∈ B. Let d = W 1 be the row
               sums of the symmetric matrix W and D = diag(d) be the corresponding diagonal matrix.
               Shi and Malik (2000) show that minimizing the normalized cut over all possible indicator
               vectors x is equivalent to minimizing
                                                 T
                                                y (D − W )y
                                            min             ,                       (5.45)
                                                    T
                                             y     y Dy
               where y =((1+x)−b(1−x))/2 is a vector consisting of all 1s and −bs such that y·d =0.
               Minimizing this Rayleigh quotient is equivalent to solving the generalized eigenvalue system
                                            (D − W )y = λDy,                        (5.46)

               which can be turned into a regular eigenvalue problem
                                             (I − N)z = λz,                         (5.47)

               where N = D   −1/2 WD −1/2  is the normalized affinity matrix (Weiss 1999) and z =
               D 1/2 y. Because these eigenvectors can be interpreted as the large modes of vibration in
   277   278   279   280   281   282   283   284   285   286   287