
128                                    5  Process Discovery: An Introduction


































            Fig. 5.3 Two BPMN models: (a) the model corresponding to WF-net N1 discovered for L1, and
            (b) the model corresponding to WF-net N2 discovered for L2


            two trace-equivalent BPMN models shown in Fig. 5.3. Similarly, the discovered
            models could have been translated into equivalent EPCs, UML activity diagrams,
            statecharts, YAWL models, BPEL specifications, etc.
              In the general problem formulation (Definition 5.1), we stated that the discovered
            model should be “representative” of the behavior seen in the event log. In
            Definition 5.2, this was operationalized by requiring that the model be able to replay
            all behavior in the log, i.e., any trace in the event log is a possible firing sequence
            of the WF-net. This is the so-called “fitness” requirement. In general, there is a
            trade-off between the following four quality criteria:
            • Fitness: the discovered model should allow for the behavior seen in the event log.
            • Precision: the discovered model should not allow for behavior completely
              unrelated to what was seen in the event log.
            • Generalization: the discovered model should generalize the example behavior
              seen in the event log.
            • Simplicity: the discovered model should be as simple as possible.
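
The fitness requirement (every trace in the log is a possible firing sequence of the WF-net) can be illustrated with a minimal sketch. The net, the place names, and the event log below are hypothetical toy data chosen for illustration, and the trace-level ratio is a deliberately crude measure, assumed here only to make the idea concrete; it gives no credit to traces that fit partially.

```python
from collections import Counter

# Hypothetical WF-net: transition label -> (input places, output places).
# Place names and labels are illustrative assumptions, not taken from the text.
NET = {
    "a": ({"start"}, {"p1", "p2"}),
    "b": ({"p1"}, {"p3"}),
    "c": ({"p2"}, {"p4"}),
    "e": ({"p1", "p2"}, {"p3", "p4"}),
    "d": ({"p3", "p4"}, {"end"}),
}


def replays(trace, net=NET, source="start", sink="end"):
    """Return True if trace is a firing sequence from [source] to [sink]."""
    marking = Counter({source: 1})
    for activity in trace:
        if activity not in net:
            return False  # activity unknown to the model
        inputs, outputs = net[activity]
        if any(marking[p] < 1 for p in inputs):
            return False  # transition is not enabled in this marking
        for p in inputs:
            marking[p] -= 1
        for p in outputs:
            marking[p] += 1
    # Unary plus drops places with zero tokens before comparing markings.
    return +marking == Counter({sink: 1})


def trace_fitness(log, net=NET):
    """Fraction of traces (counting duplicates) that the net can replay."""
    total = sum(log.values())
    fitting = sum(n for trace, n in log.items() if replays(trace, net))
    return fitting / total


# Hypothetical event log as a multiset of traces.
LOG = Counter({
    ("a", "b", "c", "d"): 3,
    ("a", "c", "b", "d"): 2,
    ("a", "e", "d"): 1,
    ("a", "b", "d"): 1,  # deviating trace: c never fires, so d is not enabled
})

print(trace_fitness(LOG))  # 6 of the 7 traces replay
```

In this toy setting the net replays six of the seven traces. Precision, generalization, and simplicity are not covered by the sketch: a net that allowed every activity in any order would also reach perfect fitness, which is why fitness alone is not a sufficient criterion.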
            A model with good fitness is able to replay most of the traces in the log.
            Precision is related to the notion of underfitting presented in the context of data
            mining (see Sect. 3.6.3). A model with poor precision is underfitting, i.e., it allows
            for behavior that is very different from what was seen in the event log. Generaliza-