Page 198 -

P. 198

180 6 Advanced Process Discovery Techniques

Fig. 6.18 Region
R = (X,Y,c) corresponding
to place p R : X ={a1,a2,
c1}=•p R , Y ={b1,b2,
c1}= p R •,and c is the initial
marking of p R

sentation function like l state () deﬁnitely results in an overﬁtting model that can only
1
replay the log without any form of generalization. Many abstractions (i.e., state rep-
resentation functions) are possible to balance between overﬁtting and underﬁtting.
In [111], these are described systematically.

6.4.3 Process Discovery Using Language-Based Regions

As illustrated by Fig. 6.16, the goal of state-based regions is to determine the places
in a Petri net. Language-based regions also aim at ﬁnding such places but do not
use a transition system as input; instead some “language” is used as input. Several
techniques and variants of the problem have been deﬁned. In this section we only
present the basic idea and refer to literature for details [7, 8, 18, 115].
Suppose, we have an event log in which the events refer to a set of activities A.
For this log one could construct a Petri net N ∅ with the set of transitions being A
and no places. Since a transition without any input places is continuously enabled,
this Petri net is able to reproduce the log. In fact, the Petri net N ∅ can reproduce
any log over A. In Sect. 5.4.3 we referred to such a model as the “ﬂower model”.
There we added places and transitions to model this behavior in terms of a WF-net.
However, the idea is the same. Adding places to the Petri net N ∅ can only limit the
behavior.
Consider for example place p R in Fig. 6.18. Removing place p R will not remove
any behavior. However, adding p R may remove behavior possible in the Petri net
without this place. The behavior gets restricted when a place is empty while one of
its output transitions wants to consume a token from it. For example, b1 is blocked
if p R is unmarked while all other input places of b1 are marked. Suppose now that
we have a set of traces L. If these traces are possible in the net with place p R , then
they are also possible in the net without p R . The reverse does not always hold. This
triggers the question whether p R can be added without disabling any of the traces
in L. This is what regions are all about.

∗
Deﬁnition 6.4 (Language-based region) Let L ∈ B(A ) be a simple event log. R =
(X,Y,c) is a region of L if and only if:

193 194 195 196 197 198 199 200 201 202 203