14.2 Challenges
Existing process mining techniques and tools such as ProM are mature and can be
applied to both Lasagna and Spaghetti processes. We have applied ProM in more
than 100 organizations ranging from municipalities and hospitals to financial insti-
tutions and manufacturers of high-tech systems. Despite the applicability of process
mining there are many interesting challenges; these illustrate that process mining is
a young discipline.
Process discovery is probably the most important and most visible intellectual
challenge related to process mining. As shown, it is far from trivial to construct a
process model based on event logs that are incomplete and noisy. Unfortunately,
there are still researchers and tool vendors that assume logs to be complete and free
of noise. Although heuristic mining, genetic mining, and fuzzy mining (cf. Chap. 6)
provide case-hardened process discovery techniques, many improvements are pos-
sible to construct more intuitive 80/20 models, i.e., simple models that are able to
explain the most likely/common behavior.
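
One simple way to approximate the 80/20 idea is to discard rare trace variants before discovery. The following is a minimal sketch, not one of the techniques from Chap. 6: it assumes a log is a list of traces (each a list of activity names), and the function name `frequent_variants` and the `coverage` parameter are hypothetical.

```python
from collections import Counter


def frequent_variants(log, coverage=0.8):
    """Return the most frequent trace variants that together cover at
    least `coverage` of all cases in `log`.

    `log` is assumed to be a list of traces, each a list of activity names.
    """
    counts = Counter(tuple(trace) for trace in log)
    total = sum(counts.values())
    kept, covered = [], 0
    # Walk the variants from most to least frequent and stop as soon as
    # the requested share of cases is explained.
    for variant, n in counts.most_common():
        if covered >= coverage * total:
            break
        kept.append(variant)
        covered += n
    return kept


# Illustrative log: 10 cases, 3 variants.
example_log = (
    [["a", "b", "c", "d"]] * 6
    + [["a", "c", "b", "d"]] * 3
    + [["a", "e", "d"]] * 1
)
print(frequent_variants(example_log))
```

A discovery technique applied only to the retained variants sees the common behavior, at the price of treating all infrequent behavior as noise.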
New process mining approaches should reconsider the representational bias to
be used. Almost all existing approaches use a graph-based notation that can rep-
resent models that do not make much sense. WF-nets, BPMN models, EPCs, etc.
can represent processes that are not sound, e.g., a process having a deadlock or an
activity that can never be activated. The search space of a technique using such a rep-
resentational bias is too large. For instance, the α-algorithm can discover WF-nets
that are not sound and the heuristic miner and the genetic miner can discover C-nets
that deadlock. Therefore, the representational bias of discovery techniques should
be refined to only allow for sensible process models. Clearly, this is a challenging
problem requiring new approaches and representations.
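
To make the deadlock problem concrete, the sketch below explores the reachable markings of a small Petri net and reports markings in which no transition is enabled and that differ from the intended final marking. It is an illustration under stated assumptions, not a full soundness check: the net encoding (a dict from transition name to input and output places), the function name `find_deadlocks`, and the `max_markings` cap are all hypothetical, and properties such as dead transitions or proper completion are not examined.

```python
from collections import Counter, deque


def find_deadlocks(transitions, initial, final, max_markings=10_000):
    """Return reachable dead markings that are not the final marking.

    transitions: dict mapping a name to (input_places, output_places)
    initial, final: lists of places holding one token each
    """
    def key(marking):
        # Canonical, hashable form of a marking (places with tokens only).
        return tuple(sorted((p, n) for p, n in marking.items() if n > 0))

    start = Counter(initial)
    final_key = key(Counter(final))
    seen = {key(start)}
    queue = deque([start])
    deadlocks = []
    while queue:
        marking = queue.popleft()
        enabled = False
        for inputs, outputs in transitions.values():
            need = Counter(inputs)
            if all(marking[p] >= n for p, n in need.items()):
                enabled = True
                nxt = marking.copy()
                nxt.subtract(need)   # consume tokens
                nxt.update(outputs)  # produce tokens
                k = key(nxt)
                if k not in seen:
                    if len(seen) >= max_markings:
                        raise RuntimeError("state space too large for this sketch")
                    seen.add(k)
                    queue.append(nxt)
        if not enabled and key(marking) != final_key:
            deadlocks.append(dict(key(marking)))
    return deadlocks


# An XOR-split followed by an AND-join: either a or b fires, but c needs both.
unsound = {
    "a": (["start"], ["p1"]),
    "b": (["start"], ["p2"]),
    "c": (["p1", "p2"], ["end"]),
}
print(find_deadlocks(unsound, ["start"], ["end"]))
```

The example net is a perfectly valid graph, yet it can never reach its final marking, which is exactly the kind of model a refined representational bias would exclude from the search space.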
Another challenge is the notion of concept drift, i.e., processes change while
being observed. Existing process discovery approaches do not take such changes
into account. It is interesting to detect when processes change and to visualize such
changes.
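
A naive way to look for such changes is to compare the directly-follows behavior in consecutive windows of the log. The sketch below is only an illustration of that idea, assuming traces are ordered by start time; the names `dfg_distribution` and `drift_points`, the window size, and the threshold are hypothetical choices, and the distance used is simply the total variation between the two relative-frequency distributions.

```python
from collections import Counter


def dfg_distribution(traces):
    """Relative frequency of each directly-follows pair (a, b) in `traces`."""
    counts = Counter((a, b) for t in traces for a, b in zip(t, t[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def drift_points(log, window=50, threshold=0.3):
    """Return trace indices at which two adjacent windows differ strongly.

    `log` is assumed to be a list of traces ordered by start time.
    """
    points = []
    for i in range(window, len(log) - window + 1, window):
        before = dfg_distribution(log[i - window:i])
        after = dfg_distribution(log[i:i + window])
        pairs = set(before) | set(after)
        # Total variation distance: 0 for identical behavior, 1 for disjoint.
        distance = 0.5 * sum(
            abs(before.get(p, 0) - after.get(p, 0)) for p in pairs
        )
        if distance > threshold:
            points.append(i)
    return points


# Illustrative log: after 100 cases, activities b and c swap order.
example_log = [["a", "b", "c"]] * 100 + [["a", "c", "b"]] * 100
print(drift_points(example_log))
```

Such a fixed-window comparison only detects abrupt changes that align reasonably well with the window boundaries; gradual or seasonal drift would need a more careful treatment.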
Process mining heavily depends on the ability to extract suitable event logs. The
scope and granularity of an event log should match the questions one would like to
answer. Unfortunately, in some information systems event data are just a byprod-
uct for debugging or scattered over many tables. Some systems also “forget” events,
e.g., when a record is updated, the old values are simply overwritten. Earlier we used
the term business process provenance to stress the importance of recording events
in such a way that history is recorded correctly and cannot be tampered with. Event
logs should be “first-class citizens” rather than some byproduct. Data elements in
event logs should have clear semantics. Therefore, developers should not simply
insert write statements without a reference to a commonly agreed-upon ontology.
We encountered systems where parts of the logging depend on the language setting.
For example, depending on the language setting of the system, an event attribute
may have the value “Off” in English, “Uit” in Dutch, or “Aus” in German. Semantically,
these are all the same. However, such ad hoc logging makes analysis more
complex. Attributes of events and cases should refer to one or more ontologies that