Chapter 5
Process Discovery: An Introduction
Process discovery is one of the most challenging process mining tasks. Based on an
event log, a process model is constructed that captures the behavior seen in the log.
This chapter introduces the topic using the rather naïve α-algorithm. This algorithm
nicely illustrates some of the general ideas used by many process mining algorithms
and helps in understanding the notion of process discovery. Moreover, the α-algorithm
serves as a stepping stone for discussing challenges related to process discovery.
5.1 Problem Statement
As discussed in Chap. 1, there are three types of process mining: discovery,
conformance, and enhancement. Moreover, we identified various perspectives, e.g.,
the control-flow perspective, the organizational or resource perspective, the data
perspective, and the time perspective. In this chapter, we focus on the discovery task
and the control-flow perspective. This combination is often referred to as process
discovery. The general process discovery problem can be formulated as follows.
Definition 5.1 (General process discovery problem) Let L be an event log as
defined in Definition 4.3 or as specified by the XES standard (cf. Sect. 4.3). A process
discovery algorithm is a function that maps L onto a process model such that the
model is “representative” for the behavior seen in the event log. The challenge is to
find such an algorithm.
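To make the shape of Definition 5.1 concrete, the following is a minimal sketch of a discovery algorithm viewed as a function from an event log to a model. All names here (Trace, EventLog, ProcessModel, directly_follows) are assumptions introduced for illustration, and the "model" is only a table of directly-follows counts. This is not the α-algorithm, and it does not produce a Petri net; it merely shows the input and output types involved.

```python
from collections import Counter
from typing import Callable, Dict, Tuple

# Hypothetical types, introduced only for this sketch.
Trace = Tuple[str, ...]                     # one case: a sequence of activity names
EventLog = Dict[Trace, int]                 # multiset of traces: trace -> frequency
ProcessModel = Dict[Tuple[str, str], int]   # placeholder "model": directly-follows counts

# In the sense of Definition 5.1, a discovery algorithm is any function of this shape.
DiscoveryAlgorithm = Callable[[EventLog], ProcessModel]


def directly_follows(log: EventLog) -> ProcessModel:
    """Count how often activity a is directly followed by activity b in the log."""
    counts: Counter = Counter()
    for trace, freq in log.items():
        # Pair each activity with its immediate successor in the trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += freq
    return dict(counts)


# A small made-up log: three trace variants with their frequencies.
log: EventLog = {
    ("a", "b", "c", "d"): 3,
    ("a", "c", "b", "d"): 2,
    ("a", "e", "d"): 1,
}
model = directly_follows(log)
```

Whether such a table is "representative" for the log is exactly the question the definition leaves open; the sketch only fixes the signature.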
This definition does not specify what kind of process model should be generated,
e.g., a BPMN, EPC, YAWL, or Petri net model. Moreover, event logs with
potentially many attributes may be used as input. Recall that the XES format allows
for storing information related to all perspectives whereas here the focus is on the
control-flow perspective. The only requirement is that the model is “representative”
for the behavior seen in the log, but it is unclear what this means.
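Focusing on the control-flow perspective amounts to discarding all event attributes except the case, the activity name, and the ordering. The sketch below illustrates this projection under stated assumptions: events are plain dictionaries, the keys "case", "activity", "resource", and "timestamp" are hypothetical, timestamps are ISO 8601 strings (so they sort lexicographically), and the data are made up.

```python
from collections import Counter, defaultdict

# Made-up events with extra attributes; the key names are assumptions of this sketch.
events = [
    {"case": "2", "activity": "a", "resource": "Y", "timestamp": "2011-01-01T09:00"},
    {"case": "1", "activity": "b", "resource": "X", "timestamp": "2011-01-01T10:00"},
    {"case": "1", "activity": "a", "resource": "X", "timestamp": "2011-01-01T09:00"},
    {"case": "3", "activity": "c", "resource": "Z", "timestamp": "2011-01-01T12:00"},
    {"case": "2", "activity": "b", "resource": "X", "timestamp": "2011-01-01T11:00"},
    {"case": "3", "activity": "a", "resource": "Y", "timestamp": "2011-01-01T10:00"},
]


def project_control_flow(events, case_key="case", activity_key="activity",
                         time_key="timestamp"):
    """Reduce rich events to a multiset of activity-name traces, one trace per case."""
    by_case = defaultdict(list)
    # Order events by time, then group the activity names per case.
    for event in sorted(events, key=lambda e: e[time_key]):
        by_case[event[case_key]].append(event[activity_key])
    # Count identical traces; resource and other attributes are dropped.
    return Counter(tuple(trace) for trace in by_case.values())


simple_log = project_control_flow(events)
```

After the projection, cases 1 and 2 become indistinguishable because they differ only in attributes that were dropped, which is the intended effect when only control flow is of interest.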
Definition 5.1 is rather broad and vague. The target format is not specified and
a potentially “rich” event log is used as input without specifying tangible require-
W.M.P. van der Aalst, Process Mining, 125
DOI 10.1007/978-3-642-19345-3_5, © Springer-Verlag Berlin Heidelberg 2011