done by finding curved or piecewise linear paths between the feature vectors
of each class, but in fact is accomplished by transforming those feature vectors
so that a linear boundary can be found. A simple and clear example of this
situation can be seen in Figure 8.12a, where the vectors of one class completely
surround those of the other. It is obvious that there is no line or plane that can
divide these vectors into the two classes.
A transformation of these vectors can yield a separable set. The vectors
shown are in two dimensions; they lie in a plane. If we add a dimension
and transform the points appropriately into a third dimension, a plane can
be found that divides the classes (Figure 8.12b). The data has been projected
into a different, higher-dimensional feature space. In SVM parlance, this
transformation uses a kernel, the function that projects the data. There
are many possible kernels; Figure 8.12 shows the result of using a Gaussian
(a radial basis function), but polynomials and other functions can be used,
depending on the data.
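A minimal sketch of the two kernels just mentioned, written as ordinary Python functions, may help make the idea concrete; the names rbf_kernel and polynomial_kernel and the parameters gamma, degree, and c are illustrative choices, not anything fixed by the SVM literature.

    import numpy as np

    def rbf_kernel(x, y, gamma=1.0):
        # Gaussian (radial basis function) kernel: exp(-gamma * ||x - y||^2).
        diff = np.asarray(x) - np.asarray(y)
        return np.exp(-gamma * np.dot(diff, diff))

    def polynomial_kernel(x, y, degree=3, c=1.0):
        # Polynomial kernel: (x . y + c) ^ degree.
        return (np.dot(x, y) + c) ** degree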
Figure 8.12: (a) Feature vectors for two classes that cannot be separated linearly. (b) The
same vectors after being projected into a third dimension using a radial basis function.
They can now be separated using a plane.
So, the points near the origin are given a larger value in the third dimension
than those farther away, pushing the feature vectors near (0, 0) to a greater
height. The maximal-margin plane then divides the points in this third
dimension, giving a perfect linear classifier.
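The following sketch works this example through numerically: it generates one class near the origin and a second class surrounding it, lifts both into a third dimension using a Gaussian of the squared distance from the origin, and checks that a horizontal plane separates the lifted classes. The sample sizes, the gamma value, and the function name lift are assumptions made for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)

    # Class A clusters near the origin; class B forms a surrounding ring.
    inner = rng.normal(scale=0.5, size=(50, 2))
    angles = rng.uniform(0.0, 2.0 * np.pi, 50)
    outer = np.column_stack([3.0 * np.cos(angles), 3.0 * np.sin(angles)])
    outer += rng.normal(scale=0.2, size=(50, 2))

    def lift(points, gamma=0.5):
        # Add a third coordinate: a Gaussian of the squared distance from
        # the origin, so points near (0, 0) are pushed to a greater height.
        z = np.exp(-gamma * np.sum(points ** 2, axis=1))
        return np.column_stack([points, z])

    inner3, outer3 = lift(inner), lift(outer)

    # Any horizontal plane between the two height ranges separates the classes.
    threshold = 0.5 * (inner3[:, 2].min() + outer3[:, 2].max())
    print("inner class above the plane:", bool(np.all(inner3[:, 2] > threshold)))
    print("outer class below the plane:", bool(np.all(outer3[:, 2] < threshold)))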
Incidentally, SVMs can distinguish between only two classes. If there are
more classes, an SVM classifier must approach them pair-wise. This is true of
any classifier that uses linear discriminants.
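As an illustration of the pairwise strategy, here is a small one-vs-one voting sketch; one_vs_one_predict and binary_predict are hypothetical names, with binary_predict standing in for any trained two-class classifier such as an SVM. With k classes this scheme runs k(k - 1)/2 binary contests and predicts the class that wins the most of them.

    from itertools import combinations

    def one_vs_one_predict(x, classes, binary_predict):
        # binary_predict(x, a, b) is any two-class classifier (for example,
        # an SVM trained on the data of classes a and b only); it returns
        # the winning label, either a or b.
        votes = {c: 0 for c in classes}
        for a, b in combinations(classes, 2):
            votes[binary_predict(x, a, b)] += 1
        # The class that wins the most pairwise contests is the prediction.
        return max(votes, key=votes.get)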
This has been a fairly high-level description of support vector machines.
It is a complex subject about which volumes have been written; see Burges,

