Page 198 -
P. 198

180                                 6  Advanced Process Discovery Techniques

            Fig. 6.18 Region
            R = (X,Y,c) corresponding
            to place p R : X ={a1,a2,
            c1}=•p R , Y ={b1,b2,
            c1}= p R •,and c is the initial
            marking of p R










            sentation function like l state () definitely results in an overfitting model that can only
                              1
            replay the log without any form of generalization. Many abstractions (i.e., state rep-
            resentation functions) are possible to balance between overfitting and underfitting.
            In [111], these are described systematically.



            6.4.3 Process Discovery Using Language-Based Regions

            As illustrated by Fig. 6.16, the goal of state-based regions is to determine the places
            in a Petri net. Language-based regions also aim at finding such places but do not
            use a transition system as input; instead some “language” is used as input. Several
            techniques and variants of the problem have been defined. In this section we only
            present the basic idea and refer to literature for details [7, 8, 18, 115].
              Suppose, we have an event log in which the events refer to a set of activities A.
            For this log one could construct a Petri net N ∅ with the set of transitions being A
            and no places. Since a transition without any input places is continuously enabled,
            this Petri net is able to reproduce the log. In fact, the Petri net N ∅ can reproduce
            any log over A. In Sect. 5.4.3 we referred to such a model as the “flower model”.
            There we added places and transitions to model this behavior in terms of a WF-net.
            However, the idea is the same. Adding places to the Petri net N ∅ can only limit the
            behavior.
              Consider for example place p R in Fig. 6.18. Removing place p R will not remove
            any behavior. However, adding p R may remove behavior possible in the Petri net
            without this place. The behavior gets restricted when a place is empty while one of
            its output transitions wants to consume a token from it. For example, b1 is blocked
            if p R is unmarked while all other input places of b1 are marked. Suppose now that
            we have a set of traces L. If these traces are possible in the net with place p R , then
            they are also possible in the net without p R . The reverse does not always hold. This
            triggers the question whether p R can be added without disabling any of the traces
            in L. This is what regions are all about.

                                                       ∗
            Definition 6.4 (Language-based region) Let L ∈ B(A ) be a simple event log. R =
            (X,Y,c) is a region of L if and only if:
   193   194   195   196   197   198   199   200   201   202   203