                             Example 9.9
                             We show a simple classification tree in Figure 9.10, where we are concerned
                             with only two features. Note that all internal nodes have two children and a
                             splitting rule. The split can occur on either variable, with observations that
                             are less than that value being assigned to the left child and the rest going to
                             the right child. Thus, at node 1, any observation where the first feature is less
                             than 5 would go to the left child. When an observation stops at one of the ter-
                             minal nodes, it is assigned to the corresponding class for that node. We illus-
                             trate these concepts with several cases. Say that we have a feature vector
                             given by x = (4, 6); then passing this down the tree, we get

                                                    node 1 → node 2 ⇒ ω1 .

                             If our feature vector is x = (6, 6), then we travel the tree as follows:

                                            node 1 → node 3 → node 4 → node 6 ⇒ ω2 .

                             For a feature vector given by x = (10, 12), we have

                                                node 1 → node 3 → node 5 ⇒ ω2 .
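
                             To make the traversal concrete, the following MATLAB segment passes a
                             feature vector down a small tree stored as a table of nodes. Only the node 1
                             split (first feature less than 5) is stated above; the node 3 and node 4 split
                             values and the class at node 7 are hypothetical placeholders, chosen only so
                             that the three paths in this example are reproduced, and should not be read
                             as the actual values in Figure 9.10.

                                % Each row is one node:
                                % [splitVar splitVal leftChild rightChild class]
                                % class = 0 marks an internal node; terminal nodes have no children.
                                nodes = [ 1   5   2   3   0 ;  % node 1: split on x(1) < 5
                                          0   0   0   0   1 ;  % node 2: terminal, class 1
                                          1   8   4   5   0 ;  % node 3: hypothetical split x(1) < 8
                                          2  10   6   7   0 ;  % node 4: hypothetical split x(2) < 10
                                          0   0   0   0   2 ;  % node 5: terminal, class 2
                                          0   0   0   0   2 ;  % node 6: terminal, class 2
                                          0   0   0   0   1 ]; % node 7: hypothetical terminal, class 1
                                x = [6 6];       % feature vector to classify
                                t = 1;           % start at the root node
                                while nodes(t,5) == 0             % while the current node is internal
                                   if x(nodes(t,1)) < nodes(t,2)  % splitting rule: x_j < split value
                                      t = nodes(t,3);             % go to the left child
                                   else
                                      t = nodes(t,4);             % go to the right child
                                   end
                                end
                                fprintf('Assigned to class %d at node %d\n',nodes(t,5),t)

                             Running this with x = [6 6] visits nodes 1, 3, 4, and 6 and assigns class 2,
                             matching the second path above; x = [4 6] and x = [10 12] reproduce the
                             other two paths.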




                              We give a brief overview of the steps needed to create a tree classifier and
                             then explain each one in detail. To start the process, we must grow an overly
                             large tree using a criterion that will give us optimal splits for the tree. It turns
                             out that these large trees fit the training data set very well. However, they do
                             not generalize, so the rate at which we correctly classify new patterns is low.
                             The proposed solution [Breiman, et al., 1984] to this problem is to continually
                             prune the large tree using a minimal cost complexity criterion to get a
                             sequence of sub-trees. The final step is to choose a tree that is the ‘right size’
                             using cross-validation or an independent test sample. These three main pro-
                             cedures are described in the remainder of this section. However, to make
                             things easier for the reader, we first provide the notation that will be used to
                             describe classification trees.
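
                             As a rough illustration of these three steps, the sketch below uses the
                             MathWorks Statistics and Machine Learning Toolbox (fitctree, cvloss, and
                             prune) rather than the functions developed in this text; the Fisher iris data
                             serve only as a stand-in learning set.

                                load fisheriris               % stand-in learning set L
                                X = meas;  y = species;
                                % Step 1: grow an overly large tree (one case per leaf allowed);
                                % 'Prune','on' also stores the cost-complexity pruning sequence.
                                bigTree = fitctree(X,y,'MinLeafSize',1,'Prune','on');
                                % Steps 2 and 3: use cross-validation over the pruning sequence of
                                % subtrees to pick the 'right size' tree, then prune to that level.
                                [~,~,~,bestLevel] = cvloss(bigTree,'Subtrees','all','TreeSize','min');
                                finalTree = prune(bigTree,'Level',bestLevel);
                                view(finalTree,'Mode','graph')  % inspect the selected subtree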


                             CLASSIFICATION TREES - NOTATION
                                L   denotes a learning set made up of observed feature vectors and
                                    their class labels.
                                J denotes the number of classes.

                                T is a classification tree.
                                t represents a node in the tree.
