example, start events that are not followed by a corresponding complete event within
45 minutes are removed from the log.
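The timeout rule above can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout (case id, activity, transaction type, timestamp) and the function name drop_unfinished_starts are hypothetical, and a start event is matched to a complete event by case and activity because no activity-instance identifier is assumed to be available.

```python
from datetime import datetime, timedelta

# Hypothetical event layout: (case id, activity, transaction type, timestamp).

def drop_unfinished_starts(events, limit=timedelta(minutes=45)):
    """Remove start events that have no matching complete event within `limit`."""
    kept = []
    for case, activity, trans, time in events:
        if trans == "start":
            # Assumption: "corresponding" means same case and same activity.
            completed = any(
                c == case and a == activity and t == "complete"
                and time <= ts <= time + limit
                for c, a, t, ts in events
            )
            if not completed:
                continue  # drop this start event
        kept.append((case, activity, trans, time))
    return kept


t0 = datetime(2024, 1, 1, 9, 0)
log = [
    ("c1", "a", "start", t0),
    ("c1", "a", "complete", t0 + timedelta(minutes=30)),
    ("c2", "a", "start", t0),
    ("c2", "a", "complete", t0 + timedelta(minutes=60)),
]
filtered = drop_unfinished_starts(log)
```

In this made-up log, the start event of case c2 is removed because its complete event arrives after 60 minutes, while all other events are kept.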
Process mining techniques can be used to automatically discover process models.
In these process models, activities play a central role. These correspond to
transitions in Petri nets, tasks in YAWL, functions in EPCs, state transitions in
transition systems, and tasks in BPMN. However, the transactional life-cycle model
in Fig. 4.3 shows that there may be multiple events referring to the same activity.
Some process mining techniques take into account the transactional model whereas
others just consider atomic events. Moreover, sometimes we just want to focus on
complete events whereas at other times the focus may be on withdrawals. This can be
supported by filtering (e.g., removing events of a particular type) and by the concept
of a classifier. A classifier is a function that maps the attributes of an event onto
a label used in the resulting process model. This can be seen as the “name” of the
event. In principle, there can be many classifiers. However, only one is used at a
time. Therefore, we can use the notation $\underline{e}$ to refer to the name of
event e used in the process model.
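Filtering by event type, as mentioned above, might look like the following minimal sketch. The dictionary keys "activity" and "trans", the function name filter_by_transaction_type, and the sample trace are assumptions made for illustration only.

```python
def filter_by_transaction_type(events, keep):
    """Return only the events whose transaction type is in `keep`."""
    return [e for e in events if e["trans"] in keep]


# Made-up trace; the transaction type names are illustrative.
trace = [
    {"activity": "a", "trans": "start"},
    {"activity": "a", "trans": "complete"},
    {"activity": "b", "trans": "start"},
    {"activity": "b", "trans": "withdraw"},
]

completes = filter_by_transaction_type(trace, {"complete"})
withdrawals = filter_by_transaction_type(trace, {"withdraw"})
```

Here completes contains only the complete event of a, and withdrawals contains only the withdraw event of b.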
Definition 4.2 (Classifier) For any event $e \in \mathcal{E}$, $\underline{e}$ is the name of the event.
If events are simply identified by their activity name, then
$\underline{e} = \#_{activity}(e)$. This means that activity instance a in Fig. 4.4
would be mapped onto $\langle a, a, a, a \rangle$. In this case the basic α-algorithm
(not using transactional information) would create just one a transition. If events
are identified by their activity name and transaction type, then
$\underline{e} = (\#_{activity}(e), \#_{trans}(e))$. Now activity instance a would be
mapped onto
$\langle (a, schedule), (a, assign), (a, start), (a, complete) \rangle$ and the basic
α-algorithm would create four transitions referring to a’s life-cycle. As shown in
Sect. 5.2.4, transaction type attributes such as start, complete, etc. can be
exploited to create a two-level process model that hides the transactional
life-cycles of individual activities in subprocesses. It is also possible to use a
completely different classifier, e.g., $\underline{e} = \#_{resource}(e)$. In this
case events are named after the resources executing them. In this book, we assume the
classifier $\underline{e} = \#_{activity}(e)$ as the default classifier. This is why
we considered the activity attribute to be mandatory in our initial examples. From
now on, we only require a classifier.
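The three classifiers discussed above can be sketched as plain functions over event attributes. This is a minimal illustration, not the book's implementation: the dictionary representation, the function names, and the resource values are assumptions. Only the activity name a and the four transaction types come from the text.

```python
# Activity instance a recorded as four life-cycle events.
# The resource values are invented for illustration.
events = [
    {"activity": "a", "trans": "schedule", "resource": "system"},
    {"activity": "a", "trans": "assign", "resource": "system"},
    {"activity": "a", "trans": "start", "resource": "Pete"},
    {"activity": "a", "trans": "complete", "resource": "Pete"},
]


def by_activity(e):
    """Default classifier: the name of an event is its activity."""
    return e["activity"]


def by_activity_and_trans(e):
    """Classifier using both the activity name and the transaction type."""
    return (e["activity"], e["trans"])


def by_resource(e):
    """Classifier naming events after the resource executing them."""
    return e["resource"]


def apply_classifier(classifier, trace):
    """Map a sequence of events onto the sequence of their names."""
    return [classifier(e) for e in trace]


print(apply_classifier(by_activity, events))  # ['a', 'a', 'a', 'a']
print(apply_classifier(by_activity_and_trans, events))
print(apply_classifier(by_resource, events))
```

With by_activity all four events share one label, whereas by_activity_and_trans yields four distinct labels. This mirrors the one transition versus four transitions that the text attributes to the basic α-algorithm.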
Sequences
Sequences are the most natural way to present traces in an event log. When
describing the operational semantics of Petri nets and transition systems, we
also modeled behavior in terms of sequences. Given their importance, we
introduce some useful operators on sequences.