3.6 Quality of Resulting Models
Fig. 3.13 Confusion matrix for the decision tree shown in Fig. 3.2. Of the 200 students who failed, 178 are classified as failed and 22 as passed; none of the failing students is classified as cum laude. Of the 198 students who passed, 175 are classified correctly, 21 as failed, and 2 as cum laude
we concentrate on k-fold cross-validation. Finally, we conclude with a more general
discussion on Occam’s razor.
3.6.1 Measuring the Performance of a Classifier
In Sect. 3.2, we showed how to construct a decision tree. As discussed, there are many design decisions when developing a decision tree learner (e.g., selecting the attributes to split on, deciding when to stop splitting, and determining cut values). The question is how to evaluate the performance of a decision tree learner. This is relevant for judging the trustworthiness of the resulting decision tree and for comparing different approaches. A complication is that one can only judge the performance based on seen instances, although the goal is also to predict good classifications for unseen instances. However, for simplicity, let us first assume that we want to judge the result of a classifier (like a decision tree) on a given data set.
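A standard way to deal with this complication, which we will return to when discussing cross-validation, is to withhold part of the data set as unseen test instances. The sketch below is a minimal Python illustration of such a holdout split; it is not taken from the book, and the attribute names, labels, and the 80/20 ratio are assumptions.

```python
import random

# Hypothetical labeled data set: (attributes, actual class) pairs.
data = [({"smoker": True, "age": 20}, "young"),
        ({"smoker": False, "age": 70}, "old")] * 100

random.seed(1)      # fixed seed so the split is reproducible
random.shuffle(data)

split = int(0.8 * len(data))  # assumed 80/20 holdout split
training_set = data[:split]   # seen instances: used to learn the classifier
test_set = data[split:]       # unseen instances: used only for evaluation

print(len(training_set), len(test_set))  # 160 40
```

The classifier is then learned on the training set only, so that its performance on the test set approximates its performance on genuinely unseen instances.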
Given a data set consisting of N instances, we know for each instance both the actual class and the predicted class. For example, for a particular person that smokes, we may predict that the person will die young (predicted class is “young”), even though the person dies at age 104 (actual class is “old”). This can be visualized using a so-called confusion matrix. Figure 3.13 shows the confusion matrix for the data set shown in Table 3.2 and the decision tree shown in Fig. 3.2. For each of the 420 students, the matrix relates the actual class to the class predicted by the decision tree. All elements on the diagonal are predicted correctly, i.e., 178 + 175 + 18 = 371 of the 420 students are classified correctly (approximately 88%).
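To make the bookkeeping behind such a matrix concrete, the following minimal Python sketch tallies a confusion matrix from pairs of actual and predicted classes and computes the fraction of instances on the diagonal. It is our own illustration, not code from the book; the helper names and the toy data are assumptions.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Count how often each actual class (row) maps to each predicted class (column)."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def accuracy(matrix):
    """Fraction of instances on the diagonal, i.e., classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy data with the three classes from the student example.
actual    = ["failed", "failed", "passed", "passed", "cum laude"]
predicted = ["failed", "passed", "passed", "failed", "cum laude"]
m = confusion_matrix(actual, predicted, ["failed", "passed", "cum laude"])
print(m)            # rows = actual class, columns = predicted class
print(accuracy(m))  # 3 of the 5 instances are on the diagonal: 0.6
```

Applied to the matrix of Fig. 3.13, the same computation yields 371/420 ≈ 0.88.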
There are several performance measures based on the confusion matrix. To define these, let us consider a data set with only two classes: “positive” (+) and “negative” (−). Figure 3.14(a) shows the corresponding 2 × 2 confusion matrix. The following entries are shown:
• tp is the number of true positives, i.e., instances that are correctly classified as
positive.
• fn is the number of false negatives, i.e., instances that are predicted to be negative
but should have been classified as positive.