
308   Chapter 8 ■ Classification


                           done by finding curved or piecewise linear paths between the feature vectors
                           of each class, but in fact is accomplished by transforming those feature vectors
                           so that a linear boundary can be found. A simple and clear example of this
                           situation can be seen in Figure 8.12a, where the vectors of one class completely
                           surround those of the other. It is obvious that there is no line or plane that can
                           divide these vectors into the two classes.
                             A transformation of these vectors can yield a separable set. The vectors
                           shown are in two dimensions; they lie in a plane. If we add a dimension
                           and transform the points appropriately into a third dimension, a plane can
                           be found that divides the classes (Figure 8.12b). The data has been projected
                           into a different, higher dimensional feature space. In SVM parlance, this
                           transformation uses a kernel, which is the function that projects the data. There
                           are many possible kernels; Figure 8.12 shows the result of using a Gaussian
                           (a radial basis function), but polynomials and other functions can be used,
                           depending on the data.
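The effect of the kernel choice can be seen in a short experiment. The following is a minimal sketch, assuming scikit-learn is available (the text does not prescribe a library): it builds data like that of Figure 8.12a, where one class surrounds the other, and compares a linear kernel against a Gaussian (radial basis function) kernel.

```python
# Sketch: one class surrounding another, as in Figure 8.12a.
# scikit-learn and the make_circles dataset are illustrative assumptions.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Class 1 is a small cluster, class 0 a ring around it
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear kernel cannot separate these classes; a Gaussian (RBF) kernel can
linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

print("linear kernel accuracy:", linear.score(X, y))
print("rbf kernel accuracy:   ", rbf.score(X, y))
```

The linear kernel performs near chance on this data, while the RBF kernel separates the classes almost perfectly, which is exactly the situation the figure illustrates.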
                           Figure 8.12: (a) Feature vectors for two classes that cannot be separated linearly. (b) The
                           same vectors after being projected into a third dimension using a radial basis function.
                           They can now be separated using a plane.

                             So, the points near the origin are given a larger value in the third dimension
                           than those farther away, pushing the feature vectors near (0, 0) to a greater
                           height. The maximal margin plane then cuts the lifted points in two along this
                           third dimension, giving a perfect linear classifier.
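This lift can be written out directly. The sketch below, with an assumed data layout (an inner cluster and a surrounding ring) and an illustrative threshold of 0.5, gives each point a third coordinate via a Gaussian of its distance from the origin, so a horizontal plane separates the classes:

```python
# Sketch of the Gaussian lift described above. The cluster sizes, radii,
# and the separating height z = 0.5 are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
inner = rng.normal(0.0, 0.2, size=(50, 2))            # class 0: near (0, 0)
angles = rng.uniform(0, 2 * np.pi, 50)
outer = np.column_stack([2.5 * np.cos(angles),        # class 1: a ring
                         2.5 * np.sin(angles)]) + rng.normal(0, 0.1, (50, 2))

def lift(X):
    # Third coordinate: large near the origin, small far away
    return np.exp(-np.sum(X**2, axis=1))

# The plane z = 0.5 now divides the two classes
print(np.all(lift(inner) > 0.5), np.all(lift(outer) < 0.5))
```

All the inner points end up above the plane and all the ring points below it, so a linear boundary in the lifted space classifies the original data perfectly.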
                             Incidentally, an SVM can distinguish between only two classes. If there are
                           more classes, an SVM classifier must approach them pairwise, training one
                           binary classifier for each pair of classes. This is true of any classifier that
                           uses linear discriminants.
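The pairwise scheme can be seen in practice. As a hedged sketch, again assuming scikit-learn (whose SVC handles multi-class problems this way internally) and using the iris dataset as an illustrative three-class example:

```python
# Sketch: multi-class classification via pairwise (one-vs-one) binary SVMs.
# scikit-learn and the iris dataset are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                     # 3 classes
clf = SVC(kernel="rbf", decision_function_shape="ovo").fit(X, y)

# With k = 3 classes, one-vs-one trains k*(k-1)/2 = 3 binary classifiers,
# so the decision function has one column per pair of classes
print(clf.decision_function(X).shape)
```

Each of the three underlying classifiers votes on a pair of classes, and the class collecting the most votes wins.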
                             This has been a fairly high-level description of support vector machines.
                           It is a complex subject about which volumes have been written; see Burges,