Fig. 3.11 Occurrences of episodes E1 and E2
we find the sequence ⟨a,c,b,e,d⟩. Clearly all the requirements are met in the sequence: a is followed by b and this b is followed by d, the same a is also followed by c and this c is followed by the same d.
Now consider episode E2 and again slide a window of length 5 from left to right.
This pattern is much more frequent. Figure 3.11 shows all time windows in which
the pattern occurs. In total there are 16 windows where E2 is embedded. Note that
the only requirement is that both b and c occur: no ordering relation is defined.
Episode E3 does not occur in the time sequence if we use a window length of 5: there is no window of length 5 in which the sequence ⟨a,b,c,d⟩ is embedded. If the window length is extended to 6, E3 occurs once. The corresponding window starts at time 26. Here we find the sequence ⟨a,e,b,e,c,d⟩.
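A small Python sketch may help to make this occurrence check concrete. It assumes a simple encoding of an episode as a pair (set of event types, set of ordering constraints), with every event type appearing at most once per episode, as in E1, E2 and E3; the time sequence below is purely illustrative and is not the sequence of Fig. 3.10.

# Hypothetical time sequence of (time, event type) pairs; illustrative only.
sequence = [(1, "a"), (2, "c"), (3, "b"), (4, "e"), (5, "d"),
            (6, "b"), (7, "c"), (8, "e"), (9, "a"), (10, "d")]

# An episode: a set of event types and a set of ordering constraints (arcs).
E1 = ({"a", "b", "c", "d"}, {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")})
E2 = ({"b", "c"}, set())                                   # b and c, no ordering
E3 = ({"a", "b", "c", "d"}, {("a", "b"), ("b", "c"), ("c", "d")})  # serial episode

def occurs(episode, window):
    """Does the episode occur in a window given as a list of (time, type) pairs?"""
    nodes, arcs = episode
    preds = {n: {u for (u, v) in arcs if v == n} for n in nodes}
    assigned = {}                       # event type -> chosen occurrence time
    remaining = set(nodes)
    while remaining:
        # take a node whose predecessors all have a time already (topological order)
        ready = [n for n in remaining if preds[n] <= set(assigned)]
        if not ready:
            return False                # cyclic constraints: not a valid episode
        n = ready[0]
        lower = max((assigned[p] for p in preds[n]), default=float("-inf"))
        candidates = [t for (t, e) in window if e == n and t > lower]
        if not candidates:
            return False                # no occurrence of n after all its predecessors
        assigned[n] = min(candidates)   # greedily pick the earliest feasible time
        remaining.remove(n)
    return True

Picking the earliest feasible occurrence for every node, in topological order, suffices here because making a node occur earlier can only help its successors.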
The support of an episode is the fraction of windows in which the episode occurs.
For a window size of 5 time units, the support of E1 is 1/32, the support of E2 is 16/32 = 0.5, and the support of E3 is 0/32 = 0. As for sequence mining and association rule learning, we define a threshold for the support. All episodes having
a support of at least this threshold are frequent. For example, if the threshold is 0.2
then E2 is frequent whereas E1 and E3 are not.
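Building on the occurs function sketched above, support can be computed by sliding the window and counting hits. The enumeration of windows below (one window per possible start time) is only one possible convention; the exact border handling, which determines the total of 32 windows in the example, should follow the definition used in the text.

def support(episode, sequence, window_length):
    """Fraction of sliding windows of the given length in which the episode occurs."""
    times = [t for (t, _) in sequence]
    first, last = min(times), max(times)
    starts = range(first - window_length + 1, last + 1)   # one possible convention
    windows = [[(t, e) for (t, e) in sequence if s <= t < s + window_length]
               for s in starts]
    return sum(occurs(episode, w) for w in windows) / len(windows)

# All episodes whose support reaches the threshold are called frequent.
threshold = 0.2
frequent = [E for E in (E1, E2, E3) if support(E, sequence, 5) >= threshold]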
The goal is to generate all frequent episodes. Note that there are typically many
potential candidates (all partial orders over the set of event types). Fortunately, like
in the Apriori algorithm, we can exploit a monotonicity property to quickly rule
out bad candidates. To explain this property, we need to define the notion of a
subepisode. E1 is a subepisode of E3 because E1 is a subgraph of E3, i.e., the nodes of E1 are contained in E3 and every ordering constraint (arc) of E1 is implied, directly or transitively, by the ordering of E3. E2 is a subepisode of both E1 and E3. It is easy to see that, if an episode E is frequent, then also all of its subepisodes are frequent. This monotonicity property can be used to speed up the search process.
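A subepisode check of this kind might be sketched as follows: the nodes must be contained in the larger episode, and every ordering constraint must be implied by it, directly or transitively. A levelwise, Apriori-style search can then discard any candidate that has an infrequent subepisode.

def is_subepisode(sub, sup):
    """Is `sub` a subepisode of `sup`? Its nodes must be contained in `sup` and every
    arc must be implied (possibly transitively) by the arcs of `sup`."""
    sub_nodes, sub_arcs = sub
    sup_nodes, sup_arcs = sup
    if not sub_nodes <= sup_nodes:
        return False
    # transitive closure of the ordering relation of `sup`
    reach = {n: {v for (u, v) in sup_arcs if u == n} for n in sup_nodes}
    changed = True
    while changed:
        changed = False
        for n in sup_nodes:
            for m in list(reach[n]):
                new = reach[m] - reach[n]
                if new:
                    reach[n] |= new
                    changed = True
    return all(v in reach[u] for (u, v) in sub_arcs)

# Monotonicity: a candidate can only be frequent if all its subepisodes are,
# so the search only extends episodes found frequent at the previous level.
assert is_subepisode(E1, E3) and is_subepisode(E2, E1) and is_subepisode(E2, E3)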
Frequent episodes can also be used to create rules of the form X ⇒ Y where X
is a subepisode of Y. As before, the confidence of such a rule can be computed. In
our example, rule E1 ⇒ E3 has a confidence of 0/1 = 0, i.e., a very poor rule. Rule
E2 ⇒ E1 has a confidence of 1/16.
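In terms of the functions sketched above, the confidence of a rule X ⇒ Y (with X a subepisode of Y) can be computed as the support of Y divided by the support of X: since every window containing Y necessarily also contains X, this ratio is the fraction of X-windows that also contain Y, which yields the 0/1 and 1/16 of the example.

def confidence(X, Y, sequence, window_length):
    """Confidence of the rule X => Y, where X is a subepisode of Y."""
    sx = support(X, sequence, window_length)
    sy = support(Y, sequence, window_length)
    return sy / sx if sx > 0 else 0.0   # every window containing Y also contains X

# In the example of the text this corresponds to 0/1 for E1 => E3 and 1/16 for
# E2 => E1 (window counts 1, 16 and 0 for E1, E2 and E3, respectively).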
Episode mining and sequence mining can be seen as variants of association rule
learning. Because they take into account the ordering of events, they are related to
process discovery. However, there are many differences with process mining algorithms. First of all, only local patterns are considered, i.e., no overall process model is created. Second, the focus is on frequent behavior without trying to generate models that also exclude behavior. Consider, for example, episode E1 in Fig. 3.10. Also