236 8 Mining Additional Perspectives
“y and z” in all other cases. Based on this classification, the conditions shown in the
YAWL and BPMN models can be derived.
For the predictor variables, all case and event attributes can be used. Consider for
instance the decision point following activity e and event 35654431 in Table 8.1.
The case and event attributes of this event are shown in Tables 8.1 and 8.2. Hence,
predictor variables for event 35654431 are: case = 1, activity = decide, time =
06-01-2011:11.22, resource = Sara, trans = complete, cost = 200, custid = 9911,
name = Smith, type = gold, region = south, and amount = 989.50. As described in
[79], the attributes of earlier and later events can also be taken into account. For example, all attributes of all events in the trace up to the decision moment can be used.
For the process shown in Fig. 8.9, one could, for instance, discover the rule that all cases involving
Sean get rejected at the decision point following activity e.
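As a minimal sketch of how such a rule could be learned (the rows, attribute names, and outcome labels below are invented for illustration; a real analysis would feed the full table of predictor variables into a decision tree learner), one can select the predictor variable with the highest information gain, the same criterion used at each split of a decision tree:

```python
from collections import Counter
from math import log2

# Invented rows for the decision point following activity e: predictor
# variables (case/event attributes) plus the response variable, i.e.,
# the observed outcome of the decision.
rows = [
    {"type": "gold",   "resource": "Sara", "outcome": "accept"},
    {"type": "gold",   "resource": "Sue",  "outcome": "accept"},
    {"type": "silver", "resource": "Sean", "outcome": "reject"},
    {"type": "gold",   "resource": "Sean", "outcome": "reject"},
    {"type": "silver", "resource": "Sara", "outcome": "accept"},
]

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr):
    """Entropy reduction obtained by splitting the rows on attr."""
    base = entropy([r["outcome"] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        part = [r["outcome"] for r in rows if r[attr] == value]
        remainder += (len(part) / len(rows)) * entropy(part)
    return base - remainder

# The attribute with the highest gain becomes the split of the tree.
best = max(["type", "resource"], key=lambda a: information_gain(rows, a))
print(best)  # resource: Sean perfectly separates the rejected cases
```

In this toy table, splitting on resource yields a pure partition (all of Sean's cases are rejected), so a decision tree learner would produce exactly the kind of rule mentioned above.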
There may be loops in the model. Hence, the same decision point may be visited
multiple times for the same case. Each visit corresponds to a new row in the table
used by the decision tree algorithm. For example, in the process shown in Fig. 8.9,
there may be cases for which e is executed four times. The first three times e is
followed by f and the fourth time e is followed by g or h. Each of the four decisions
corresponds to a row in the table used for classification. Using replay, the outcome
of the decision (i.e., the response variable) can be identified for each row. Also note
that the values of the predictor variables for these four rows may be different.
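A sketch of how replay could extract one row per visit of the decision point (the trace and the derived predictor variable are invented for illustration; activity names follow Fig. 8.9, where f re-initiates the request):

```python
# Invented trace in which e (decide) is executed four times; the first
# three occurrences are followed by f (redo), the fourth by h (reject).
trace = ["a", "b", "d", "e", "f",
         "b", "d", "e", "f",
         "d", "b", "e", "f",
         "c", "d", "e", "h"]

def rows_for_decision_point(trace, dp="e"):
    """One row per visit of the decision point dp: the predictor
    variables are derived from the prefix of the trace (here just the
    number of earlier redos), the response variable is the activity
    observed directly after dp."""
    rows = []
    for i, act in enumerate(trace[:-1]):
        if act == dp:
            rows.append({
                "visit": len(rows) + 1,
                "redos_so_far": trace[:i].count("f"),
                "outcome": trace[i + 1],
            })
    return rows

for row in rows_for_decision_point(trace):
    print(row)
```

Note how the predictor variable redos_so_far differs per visit, illustrating that the four rows extracted from a single case need not be identical.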
In some cases, it may be impossible to derive a reasonable decision rule. The
reason may be that there is too little data or that decisions are seemingly random or
based on considerations not in the event log. In such cases, replay can be used to
provide a probability for each branch. Hence, such a decision point is characterized
by probabilities rather than data dependent decision rules.
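A minimal sketch of this fallback (the outcome list is invented): when no data-dependent rule emerges, the outcomes observed via replay at the decision point are simply turned into relative frequencies:

```python
from collections import Counter

# Invented outcomes observed (via replay) at a decision point for which
# no data-dependent rule could be found: estimate branch probabilities
# from the relative frequencies of the branches taken.
outcomes = ["g", "h", "g", "g", "h", "g", "f", "g", "g", "h"]

counts = Counter(outcomes)
total = sum(counts.values())
probabilities = {branch: n / total for branch, n in counts.items()}
print(probabilities)  # e.g., {'g': 0.6, 'h': 0.3, 'f': 0.1}
```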
The procedure can be repeated for all decision points in a process model. The
results can be used to extend the process model, thus incorporating the case per-
spective.
Classification in Process Mining
The application of classification techniques like decision tree learning is not
limited to decision mining as illustrated by Figs. 8.14 and 8.15: additional
predictor variables may be used and alternative response variables can be
analyzed.
In Figs. 8.14 and 8.15, only attributes of events and cases are used as predictor
variables. However, also behavioral information can be used. For instance,
in Fig. 8.9 it would be interesting to count the number of times that f has
been executed. This may influence the decision point following activity e. For
example, it could be the case that a request is never initiated more than two
times. It may also be that timing information is used as a predictor variable.
For instance, if the time taken to check the ticket is less than five minutes, then
it is more likely that the request is rejected. It is also possible to use contextual