Page 97 -
P. 97

3.5 Sequence and Episode Mining                                 79











            Fig. 3.9 A timed sequence of events and the corresponding time windows












            Fig. 3.10 Three episodes

            an episode is appearing. An episode defines a partial order. The goal is to discover
            frequent episodes.
              Input for episode mining is a time sequence as shown in Fig. 3.9. The timed
            sequence starts at time 10 and ends at time 37. The sequence consists of discrete
            time points, and, as shown in Fig. 3.9, at some points in time an event occurs. An
            event has a type (e.g., the activity that happened) and a timestamp. For example,
            an event of type a occurs at time 12, an event of type c occurs at time 13,etc.Fig-
            ure 3.9 also shows 32 time windows of length 5. These are all the windows (partially)
            overlapping with the timed sequence. The length 5 is a predefined parameter of the
            algorithm used to discover frequent patterns. An episode occurs in a time window
            if the partial order is “embedded” in it.
              Figure 3.10 shows three episodes. An episode is described by a directed acyclic
            graph. The nodes refer to event types and the arcs define a partial order. For example,
            episode E1 defines that a should be followed by b and c, b should be followed by d,
            and c should be followed by d. Episode E2 merely states that b and c should both
            happen at least once. Episode E3 states that a should be followed by b and c, b
            should be followed by c, b should be followed by d, and c should be followed by d.
            This episode contains two redundant arcs: the arc from a to c and the arc from b
            to d can be removed without changing the requirements. An episode occurs in a
            time window if it is possible to assign events to nodes in the episode such that the
            ordering relations are satisfied. Note that the episode only defines the minimal set
            of events, i.e., there may be all kinds of additional events. The key requirement is
            that the episode is embedded.
              To illustrate the notion of “occurring in a time window”, we use Fig. 3.11. Con-
            sider episode E1 and slide a window of length 5 from left to right. There are 32
            possible positions. However, just one of the 32 windows embeds episode E1. This
            is the window starting at time 12 shown below the timed sequence in Fig. 3.11.Here
   92   93   94   95   96   97   98   99   100   101   102