4.2 Event Logs 103

example, start events that are not followed by a corresponding complete event within
45 minutes are removed from the log.
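
The filtering rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: events are dictionaries with the hypothetical keys `case`, `activity`, `trans`, and `time`; the list is sorted by time; and a "corresponding" complete event is taken to mean one with the same case and activity. The function name `drop_unfinished_starts` is also made up for this sketch.

```python
from datetime import datetime, timedelta

def drop_unfinished_starts(events, limit=timedelta(minutes=45)):
    """Remove start events with no matching complete event within `limit`.

    Assumes `events` is sorted by time and that a matching complete event
    shares the case and the activity of the start event (an assumption).
    """
    kept = []
    for i, e in enumerate(events):
        if e["trans"] == "start":
            has_complete = any(
                later["trans"] == "complete"
                and later["case"] == e["case"]
                and later["activity"] == e["activity"]
                and later["time"] - e["time"] <= limit
                for later in events[i + 1:]
            )
            if not has_complete:
                continue  # drop this start event
        kept.append(e)
    return kept

# Hypothetical log fragment: activity a completes after 30 minutes,
# activity b only after 60 minutes.
log = [
    {"case": 1, "activity": "a", "trans": "start", "time": datetime(2024, 1, 1, 9, 0)},
    {"case": 1, "activity": "a", "trans": "complete", "time": datetime(2024, 1, 1, 9, 30)},
    {"case": 1, "activity": "b", "trans": "start", "time": datetime(2024, 1, 1, 10, 0)},
    {"case": 1, "activity": "b", "trans": "complete", "time": datetime(2024, 1, 1, 11, 0)},
]
filtered = drop_unfinished_starts(log)
```

In this fragment only the start event of b is removed; its late complete event stays in the log, since the rule as stated only concerns start events.
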
Process mining techniques can be used to automatically discover process models.
In these process models, activities play a central role. These correspond to transitions
in Petri nets, tasks in YAWL, functions in EPCs, state transitions in transition
systems, and tasks in BPMN. However, the transactional life-cycle model in Fig. 4.3
shows that there may be multiple events referring to the same activity. Some process
mining techniques take into account the transactional model whereas others
just consider atomic events. Moreover, sometimes we just want to focus on complete
events whereas at other times the focus may be on withdrawals. This can be
supported by filtering (e.g., removing events of a particular type) and by the concept
of a classifier. A classifier is a function that maps the attributes of an event onto
a label used in the resulting process model. This can be seen as the “name” of the
event. In principle, there can be many classifiers. However, only one is used at a
time. Therefore, we can use the notation $\underline{e}$ to refer to the name used in the process
model.

Definition 4.2 (Classifier) For any event $e \in \mathcal{E}$, $\underline{e}$ is the name of the event.

If events are simply identified by their activity name, then $\underline{e} = \#_{\mathit{activity}}(e)$. This
means that activity instance a in Fig. 4.4 would be mapped onto $\langle a, a, a, a \rangle$. In
this case the basic α-algorithm (not using transactional information) would create
just one a transition. If events are identified by their activity name and transaction
type, then $\underline{e} = (\#_{\mathit{activity}}(e), \#_{\mathit{trans}}(e))$. Now activity instance a would be mapped
onto $\langle (a, \mathit{schedule}), (a, \mathit{assign}), (a, \mathit{start}), (a, \mathit{complete}) \rangle$ and the basic α-algorithm
would create four transitions referring to a’s life-cycle. As shown in Sect. 5.2.4,
transaction type attributes such as start, complete, etc. can be exploited to create a
two-level process model that hides the transactional life-cycles of individual activities
in subprocesses. It is also possible to use a completely different classifier, e.g.,
$\underline{e} = \#_{\mathit{resource}}(e)$. In this case events are named after the resources executing them.
In this book, we assume the classifier $\underline{e} = \#_{\mathit{activity}}(e)$ as the default classifier. This
is why we considered the activity attribute to be mandatory in our initial examples.
From now on, we only require a classifier.
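
To make the three classifiers discussed above concrete, here is a minimal Python sketch. It assumes events are plain dictionaries with the keys `activity`, `trans`, and `resource`; the function names and the resource values are hypothetical and serve only as illustration.

```python
from typing import Callable, Dict, Hashable, List

Event = Dict[str, str]
Classifier = Callable[[Event], Hashable]

def by_activity(e: Event) -> Hashable:
    """Default classifier: name an event after its activity."""
    return e["activity"]

def by_activity_and_trans(e: Event) -> Hashable:
    """Name an event after its activity and transaction type."""
    return (e["activity"], e["trans"])

def by_resource(e: Event) -> Hashable:
    """Name an event after the resource executing it."""
    return e["resource"]

def classify(events: List[Event], classifier: Classifier) -> List[Hashable]:
    """Apply one classifier to every event of a trace."""
    return [classifier(e) for e in events]

# Activity instance a with its four life-cycle events
# (resource names are made up for this example).
instance_a = [
    {"activity": "a", "trans": "schedule", "resource": "system"},
    {"activity": "a", "trans": "assign", "resource": "system"},
    {"activity": "a", "trans": "start", "resource": "Pete"},
    {"activity": "a", "trans": "complete", "resource": "Pete"},
]

print(classify(instance_a, by_activity))            # ['a', 'a', 'a', 'a']
print(classify(instance_a, by_activity_and_trans))  # four distinct (activity, trans) pairs
```

With `by_activity` all four events share one label, so a discovery algorithm that ignores transactional information sees a single activity; with `by_activity_and_trans` it sees four distinct labels. Only one classifier is applied at a time, which is why `classify` takes it as a parameter.
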

Sequences
Sequences are the most natural way to present traces in an event log. When
describing the operational semantics of Petri nets and transition systems, we
also modeled behavior in terms of sequences. Given their importance, we
introduce some useful operators on sequences.
