
236                                       8  Mining Additional Perspectives

            “y and z” in all other cases. Based on this classification, the conditions shown in the
            YAWL and BPMN models can be derived.
              For the predictor variables, all case and event attributes can be used. Consider for
            instance the decision point following activity e and event 35654431 in Table 8.1.
            The case and event attributes of this event are shown in Tables 8.1 and 8.2. Hence,
            predictor variables for event 35654431 are: case = 1, activity = decide, time =
            06-01-2011:11.22, resource = Sara, trans = complete, cost = 200, custid = 9911,
            name = Smith, type = gold, region = south, and amount = 989.50. As described in
[79], the attributes of earlier and later events can also be taken into account. For
example, all attributes of all events in the trace up to the decision moment can be used.
            In the process shown in Fig. 8.9 one could find the rule that all cases that involve
            Sean get rejected in the decision point following activity e.
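The collection of predictor variables described above can be sketched as follows. This is a minimal illustration, not the book's implementation: the attribute values are taken from the running example (Tables 8.1 and 8.2), and the predictors for a decision-point event are simply the union of its case attributes and its event attributes.

```python
# Sketch: assembling the predictor variables for event 35654431 by merging
# its case attributes with its event attributes (values as in Tables 8.1/8.2).
case_attrs = {"case": 1, "custid": 9911, "name": "Smith",
              "type": "gold", "region": "south", "amount": 989.50}
event_attrs = {"activity": "decide", "time": "06-01-2011:11.22",
               "resource": "Sara", "trans": "complete", "cost": 200}

# All eleven predictor variables listed in the text for this event.
predictors = {**case_attrs, **event_attrs}
assert len(predictors) == 11 and predictors["type"] == "gold"
```

Each such dictionary becomes one row in the table handed to the decision tree algorithm; the response variable is the branch taken at the decision point.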
              There may be loops in the model. Hence, the same decision point may be visited
            multiple times for the same case. Each visit corresponds to a new row in the table
            used by the decision tree algorithm. For example, in the process shown in Fig. 8.9,
there may be cases for which e is executed four times. The first three times e is
followed by f and the fourth time e is followed by g or h. Each of the four decisions
            corresponds to a row in the table used for classification. Using replay, the outcome
            of the decision (i.e., the response variable) can be identified for each row. Also note
            that the values of the predictor variables for these four rows may be different.
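The extraction of one classification row per visit to a decision point can be sketched as follows. This is a simplified sketch, not the book's replay algorithm: a trace is assumed to be a list of (activity, attributes) pairs, the intermediate loop activities are omitted, and the `attempt` attribute is invented for illustration. The response variable is the activity observed directly after e.

```python
# Sketch: a case in which e is executed four times; the first three visits
# are followed by f, the fourth by g. Intermediate activities are omitted.
trace = [("a", {}), ("e", {"attempt": 1}), ("f", {}),
         ("e", {"attempt": 2}), ("f", {}),
         ("e", {"attempt": 3}), ("f", {}),
         ("e", {"attempt": 4}), ("g", {})]

# One row per visit to the decision point following e:
# (predictor variables at that moment, branch taken next).
rows = []
for i, (act, attrs) in enumerate(trace):
    if act == "e" and i + 1 < len(trace):
        rows.append((attrs, trace[i + 1][0]))

print(rows)  # four rows; only the last has response 'g'
```

Note that the predictor values differ per row (here only `attempt`, but in general any attribute recorded up to that moment), which is exactly why repeated visits must not be collapsed into one observation.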
              In some cases, it may be impossible to derive a reasonable decision rule. The
            reason may be that there is too little data or that decisions are seemingly random or
            based on considerations not in the event log. In such cases, replay can be used to
            provide a probability for each branch. Hence, such a decision point is characterized
            by probabilities rather than data dependent decision rules.
              The procedure can be repeated for all decision points in a process model. The
            results can be used to extend the process model, thus incorporating the case per-
            spective.
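Estimating branch probabilities from replay can be sketched as follows. This is an illustrative fragment under invented data, not the book's implementation: the list of observed branches (f, g or h taken at the decision point after e) would in practice be obtained by replaying the event log on the model.

```python
# Sketch: when no data-dependent rule can be found, characterize the
# decision point by branch probabilities derived from replay frequencies.
from collections import Counter

# Hypothetical outcomes of ten replayed visits to the decision point.
observed_branches = ["f", "f", "g", "h", "f", "g", "g", "h", "g", "g"]

counts = Counter(observed_branches)
total = sum(counts.values())
probabilities = {branch: n / total for branch, n in counts.items()}

print(probabilities)  # → {'f': 0.3, 'g': 0.5, 'h': 0.2}
```

Such relative frequencies can annotate the decision point in the extended model in place of a decision rule.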



              Classification in Process Mining
  The application of classification techniques like decision tree learning is not
  limited to decision mining as illustrated by Figs. 8.14 and 8.15: additional
  predictor variables may be used and alternative response variables can be
  analyzed.
              In Figs. 8.14 and 8.15, only attributes of events and cases are used as predictor
              variables. However, also behavioral information can be used. For instance,
              in Fig. 8.9 it would be interesting to count the number of times that f has
  been executed. This may influence the decision point following activity e. For
              example, it could be the case that a request is never initiated more than two
              times. It may also be that timing information is used as a predictor variable.
              For instance, if the time taken to check the ticket is less than five minutes, then
              it is more likely that the request is rejected. It is also possible to use contextual