Page 288 -

P. 288

3:20
2011/6/1
#9
Page 251
HAN 13-ch06-243-278-9780123814791
6.2 Frequent Itemset Mining Methods 251

Figure 6.2 Generation of the candidate itemsets and frequent itemsets, where the minimum support
count is 2.

4. Next, the transactions in D are scanned and the support count of each candidate
itemset in C 2 is accumulated, as shown in the middle table of the second row in
Figure 6.2.
5. The set of frequent 2-itemsets, L 2 , is then determined, consisting of those candidate
2-itemsets in C 2 having minimum support.
6. The generation of the set of the candidate 3-itemsets, C 3 , is detailed in Figure 6.3.
From the join step, we ﬁrst get C 3 = L 2 1 L 2 = {{I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5},
{I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}. Based on the Apriori property that all subsets
of a frequent itemset must also be frequent, we can determine that the four latter
candidates cannot possibly be frequent. We therefore remove them from C 3 , thereby
saving the effort of unnecessarily obtaining their counts during the subsequent scan
of D to determine L 3 . Note that when given a candidate k-itemset, we only need to
check if its (k − 1)-subsets are frequent since the Apriori algorithm uses a level-wise

283 284 285 286 287 288 289 290 291 292 293