For an organization without much process mining experience, it is best to start with
a question-driven project. Concrete questions help to scope the project and guide
data extraction efforts.
Like any project, a process mining project needs to be planned carefully. For
instance, activities need to be scheduled before starting the project, resources need
to be allocated, milestones need to be defined, and progress needs to be monitored
continuously.
11.3.2 Stage 1: Extract
After initiating the project, event data, models, objectives, and questions need to be
extracted from systems, domain experts, and management.
In Chap. 4, we elaborated on data extraction. For example, Fig. 4.1 describes the
process of getting from raw data to suitable event logs. Recall that event logs have
two main requirements: (a) events need to be ordered in time and (b) events need to
be correlated (i.e., each event needs to refer to a particular case).
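Both requirements can be checked mechanically before any mining is attempted. The following is a minimal sketch in Python, assuming raw events are available as dictionaries with the hypothetical keys `case_id`, `activity`, and `timestamp` (an ISO 8601 string). These names and the function `build_event_log` are illustrative only and are not taken from Chap. 4.

```python
from collections import defaultdict
from datetime import datetime


def build_event_log(events):
    """Group raw events into one trace per case, ordered by timestamp.

    Raises ValueError if an event has no timestamp (requirement a)
    or no case identifier (requirement b).
    """
    per_case = defaultdict(list)
    for index, event in enumerate(events):
        case_id = event.get("case_id")      # hypothetical field name
        timestamp = event.get("timestamp")  # hypothetical field name
        if timestamp is None:
            raise ValueError(f"event {index} cannot be ordered: no timestamp")
        if case_id is None:
            raise ValueError(f"event {index} is not correlated to a case")
        per_case[case_id].append(
            (datetime.fromisoformat(timestamp), event["activity"])
        )
    # Sort on the timestamp only; the sort is stable, so events with
    # identical timestamps keep their input order.
    return {
        case_id: [activity for _, activity in sorted(pairs, key=lambda p: p[0])]
        for case_id, pairs in per_case.items()
    }


raw_events = [
    {"case_id": "c1", "activity": "register", "timestamp": "2011-01-01T09:00:00"},
    {"case_id": "c2", "activity": "register", "timestamp": "2011-01-01T09:05:00"},
    {"case_id": "c1", "activity": "pay", "timestamp": "2011-01-02T10:00:00"},
    {"case_id": "c1", "activity": "check", "timestamp": "2011-01-01T11:00:00"},
]
event_log = build_event_log(raw_events)
```

Here `event_log` maps each case to its time-ordered sequence of activities. An event that lacks either field is rejected rather than silently dropped, so that data quality problems surface during extraction.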
As Fig. 11.6 shows, it is possible that there are already handmade (process) models.
These models may be of low quality and have little to do with reality. Nevertheless,
it is good to collect all models present and exploit existing knowledge as
much as possible. For example, existing models can help in scoping the process and
judging the completeness of event logs.
In a goal-driven process mining project, the objectives are also formulated in
Stage 1 of the L* life-cycle. These objectives are expressed in terms of KPIs. In a
question-driven process mining project, questions need to be generated in Stage 1.
Both questions and objectives are gathered through interviews with stakeholders
(e.g., domain experts, end users, customers, and management).
11.3.3 Stage 2: Create Control-Flow Model and Connect Event Log
Control-flow forms the backbone of any process model. Therefore, Stage 2 of the
L* life-cycle aims to determine the de facto control-flow model of the process that
is analyzed. The process model may be discovered using the process discovery
techniques presented in Part II of this book (activity discover in Fig. 11.6). However, if
there is a good process model present, it may be verified using conformance
checking (activity check) or judged against the discovered model (activity compare). It
is even possible to merge the handmade model and the discovered model (activity
promote). After completing Stage 2 there is a control-flow model tightly connected
to the event log, i.e., events in the event log refer to activities in the model. As
discussed in Sect. 7.4.3, this connection is crucial for subsequent steps. If the fitness
of the model and log is low (say below 0.8), then it is difficult to move to Stage 3.
However, by definition, this should not be a problem for a Lasagna process.
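The gate between Stage 2 and Stage 3 can be illustrated with a deliberately crude stand-in for fitness: the fraction of cases whose trace the model can replay from start to end. This is not the replay-based fitness used in conformance checking, only a simplification to make the threshold concrete. The sketch assumes the control-flow model is given as hypothetical sets of start activities, end activities, and allowed directly-follows pairs; all function names are illustrative, and the default of 0.8 is the indicative value mentioned above.

```python
def trace_fits(trace, start, end, follows):
    """Return True if the trace starts and ends in allowed activities and
    every consecutive pair of activities is allowed by the model."""
    if not trace:
        return False
    if trace[0] not in start or trace[-1] not in end:
        return False
    return all((a, b) in follows for a, b in zip(trace, trace[1:]))


def case_level_fitness(log, start, end, follows):
    """Fraction of cases in the log whose trace fits the model."""
    if not log:
        raise ValueError("empty event log")
    fitting = sum(trace_fits(t, start, end, follows) for t in log.values())
    return fitting / len(log)


def ready_for_stage_3(fitness, threshold=0.8):
    """Apply the indicative threshold: below it, moving on is difficult."""
    return fitness >= threshold


# Hypothetical model: a, then b or c, then d.
start = {"a"}
end = {"d"}
follows = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}

log = {
    "c1": ["a", "b", "d"],
    "c2": ["a", "c", "d"],
    "c3": ["a", "b", "d"],
    "c4": ["a", "d", "b"],  # deviates from the model
}
fitness = case_level_fitness(log, start, end, follows)
proceed = ready_for_stage_3(fitness)
```

In this example three of the four cases fit, so `fitness` is 0.75 and `proceed` is `False`. A real project would compute fitness with a conformance checking technique rather than with this all-or-nothing count per case.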