

r(t) is found by subtracting the maximum conditional probability p(ω_j | t) for the node from 1:

$$ r(t) = 1 - \max_j \{\, p(\omega_j \mid t) \,\} \qquad (9.15) $$
R(t) is the resubstitution estimate of risk for node t. This is

$$ R(t) = r(t)\, p(t) \qquad (9.16) $$

R(T) denotes a resubstitution estimate of the overall misclassification rate for a tree T. This can be calculated using every terminal node in the tree as follows:

$$ R(T) = \sum_{t \in \tilde{T}} r(t)\, p(t) = \sum_{t \in \tilde{T}} R(t) \qquad (9.17) $$

where $\tilde{T}$ denotes the set of terminal nodes of the tree.
                                α   is the complexity parameter.

i(t) denotes a measure of impurity at node t.
Δi(s, t) represents the decrease in impurity and indicates the goodness of the split s at node t. This is given by

$$ \Delta i(s, t) = i(t) - p_R\, i(t_R) - p_L\, i(t_L) \qquad (9.18) $$
p_L and p_R are the proportions of data that are sent to the left and right child nodes by the split s.
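To make these definitions concrete, the following is a minimal MATLAB sketch (not code from the text) that evaluates equations 9.15 through 9.18 at a single node. The class counts and the candidate split are hypothetical, and the Gini index is assumed as the impurity measure i(t):

   % Hypothetical class counts at node t (assumed for illustration).
   nclass = [30 10];            % cases of each class landing in node t
   nt     = sum(nclass);        % total number of cases in node t
   ntot   = 100;                % total number of cases in the data set
   pt     = nt/ntot;            % p(t), proportion of all data in node t
   pcond  = nclass/nt;          % conditional probabilities p(omega_j | t)

   rt = 1 - max(pcond);         % equation 9.15: r(t)
   Rt = rt*pt;                  % equation 9.16: R(t) = r(t) p(t)

   % Goodness of a candidate split s, with the Gini index as i(t).
   gini = @(p) 1 - sum(p.^2);   % impurity of a class-probability vector
   nL = [25 2];                 % assumed class counts sent to the left
   nR = nclass - nL;            % remaining cases go to the right
   pL = sum(nL)/nt;             % proportion of node t sent left
   pR = sum(nR)/nt;             % proportion of node t sent right
   % Equation 9.18: decrease in impurity for split s at node t.
   dis = gini(pcond) - pR*gini(nR/sum(nR)) - pL*gini(nL/sum(nL));

A large value of dis indicates a good split, since the child nodes are purer than the parent node.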


Growing the Tree
The idea behind binary classification trees is to split the d-dimensional space into smaller and smaller partitions, such that the partitions become purer in terms of class membership. In other words, we seek partitions where the majority of the members belong to one class. To illustrate these ideas, we use a simple example where we have patterns from two classes, each containing two features, x_1 and x_2. How we obtain these data is discussed in the following example.
                             Example 9.10
                             We use synthetic data to illustrate the concepts of classification trees. There
                             are two classes, and we generate 50 points from each class. From Figure 9.11,
we see that each class is a two-term mixture of bivariate uniform random
                             variables.
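A sketch of how such data might be generated is given below. The mixture regions (the unit squares in the comments) are assumptions for illustration and are not necessarily the exact regions behind Figure 9.11:

   % Generate 50 points per class, each class a two-term mixture of
   % bivariate uniforms; component regions are assumed for illustration.
   n = 25;                            % points per mixture component
   % Class 1: components over [0,1]x[0,1] and [2,3]x[2,3].
   X1 = [rand(n,2); rand(n,2) + 2];
   % Class 2: components over [2,3]x[0,1] and [0,1]x[2,3].
   X2 = [rand(n,2) + repmat([2 0],n,1); rand(n,2) + repmat([0 2],n,1)];
   % Plot the two classes with different markers.
   plot(X1(:,1),X1(:,2),'o',X2(:,1),X2(:,2),'x')
   xlabel('x_1'), ylabel('x_2')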
