2 Pattern Discrimination
using coefficients or weights w1 and w2 and a bias term w0, as shown in equation
(2-1). The weights determine the slope of the straight line; the bias determines
where the straight line intersects the coordinate axes, i.e., its deviation from the origin.
Equation (2-1) also allows the interpretation of the straight line as the root set of a
linear function d(x). We say that d(x) is a linear decision function that divides
(categorizes) ℜ2 into two decision regions: the upper half plane corresponding to
d(x) > 0, where each feature vector is assigned to ω1; and the lower half plane
corresponding to d(x) < 0, where each feature vector is assigned to ω2. The
classification is arbitrary for d(x) = 0. Note that class limits do not have to coincide
with decision region boundaries.
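This two-class decision rule can be sketched in code. The following is a minimal illustration, not taken from the text: the weights w1 = 1, w2 = -1 and bias w0 = 0.5 are arbitrary assumed values, and the class labels "omega_1"/"omega_2" merely stand in for the classes ω1 and ω2.

```python
def d(x, w=(1.0, -1.0), w0=0.5):
    """Linear decision function d(x) = w1*x1 + w2*x2 + w0 (assumed example weights)."""
    return w[0] * x[0] + w[1] * x[1] + w0

def classify(x):
    """Assign x to omega_1 if d(x) > 0, to omega_2 if d(x) < 0."""
    value = d(x)
    if value > 0:
        return "omega_1"   # upper half plane
    if value < 0:
        return "omega_2"   # lower half plane
    return "arbitrary"     # x lies exactly on the decision line d(x) = 0

print(classify((2.0, 0.0)))   # d = 2.5 > 0, so omega_1
print(classify((0.0, 3.0)))   # d = -2.5 < 0, so omega_2
```

Note that points with d(x) = 0 sit on the decision boundary itself, where (as the text says) the classification is arbitrary.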
The generalization of the linear decision function to a d-dimensional feature
space ℜd is straightforward:
d(x) = w0 + w1x1 + ... + wdxd = w'x + w0 = w*'x*,   (2-2)

where

w = [w1 ... wd]' is the weight vector; (2-2a)
w* = [w0 w1 ... wd]' is the augmented weight vector with the bias term; (2-2b)
x* = [1 x1 ... xd]' is the augmented feature vector. (2-2c)
Figure 2.2. Two-dimensional linear decision function with normal vector n at
a distance D0 from the origin.
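The geometry in Figure 2.2 can be checked numerically. The sketch below assumes the standard relations n = w/||w|| and D0 = |w0|/||w|| for the hyperplane d(x) = w'x + w0 = 0 (stated here as an assumption; the text derives its own expressions), with illustrative weights w = [3, 4] and w0 = -10:

```python
import math

def hyperplane_geometry(w, w0):
    """Unit normal n and distance D0 from the origin for w'x + w0 = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))   # ||w||
    n = [wi / norm for wi in w]                  # unit normal, toward d(x) > 0
    D0 = abs(w0) / norm                          # distance of hyperplane from origin
    return n, D0

n, D0 = hyperplane_geometry([3.0, 4.0], -10.0)
print(n, D0)   # [0.6, 0.8] 2.0
```

Here ||w|| = 5, so the unit normal is [0.6, 0.8] and the line 3x1 + 4x2 - 10 = 0 lies at distance 2 from the origin.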
The root set of d(x), the decision surface or discriminant, is now a linear
(d-1)-dimensional surface called a hyperplane, which can be characterized (see Figure 2.2)
by its distance D0 from the coordinate origin and its unit normal vector n pointing in the
positive direction (d(x) > 0), as follows (see e.g. Friedman and Kandel, 1999):