Page 97 -
P. 97
3.5 Sequence and Episode Mining 79
Fig. 3.9 A timed sequence of events and the corresponding time windows
Fig. 3.10 Three episodes
an episode is appearing. An episode defines a partial order. The goal is to discover
frequent episodes.
Input for episode mining is a time sequence as shown in Fig. 3.9. The timed
sequence starts at time 10 and ends at time 37. The sequence consists of discrete
time points, and, as shown in Fig. 3.9, at some points in time an event occurs. An
event has a type (e.g., the activity that happened) and a timestamp. For example,
an event of type a occurs at time 12, an event of type c occurs at time 13,etc.Fig-
ure 3.9 also shows 32 time windows of length 5. These are all the windows (partially)
overlapping with the timed sequence. The length 5 is a predefined parameter of the
algorithm used to discover frequent patterns. An episode occurs in a time window
if the partial order is “embedded” in it.
Figure 3.10 shows three episodes. An episode is described by a directed acyclic
graph. The nodes refer to event types and the arcs define a partial order. For example,
episode E1 defines that a should be followed by b and c, b should be followed by d,
and c should be followed by d. Episode E2 merely states that b and c should both
happen at least once. Episode E3 states that a should be followed by b and c, b
should be followed by c, b should be followed by d, and c should be followed by d.
This episode contains two redundant arcs: the arc from a to c and the arc from b
to d can be removed without changing the requirements. An episode occurs in a
time window if it is possible to assign events to nodes in the episode such that the
ordering relations are satisfied. Note that the episode only defines the minimal set
of events, i.e., there may be all kinds of additional events. The key requirement is
that the episode is embedded.
To illustrate the notion of “occurring in a time window”, we use Fig. 3.11. Con-
sider episode E1 and slide a window of length 5 from left to right. There are 32
possible positions. However, just one of the 32 windows embeds episode E1. This
is the window starting at time 12 shown below the timed sequence in Fig. 3.11.Here