            14.2 Challenges

            Existing process mining techniques and tools such as ProM are mature and can be
            applied to both Lasagna and Spaghetti processes. We have applied ProM in more
            than 100 organizations ranging from municipalities and hospitals to financial insti-
            tutions and manufacturers of high-tech systems. Despite the applicability of process
            mining, there are many interesting challenges; these illustrate that process mining is
            a young discipline.
              Process discovery is probably the most important and most visible intellectual
            challenge related to process mining. As shown, it is far from trivial to construct a
            process model based on event logs that are incomplete and noisy. Unfortunately,
            there are still researchers and tool vendors that assume logs to be complete and free
            of noise. Although heuristic mining, genetic mining, and fuzzy mining (cf. Chap. 6)
            provide case-hardened process discovery techniques, many improvements are pos-
            sible to construct more intuitive 80/20 models, i.e., simple models that are able to
            explain the most likely/common behavior.
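
              One simple way to approximate such an 80/20 view is to keep only the most
            frequent trace variants before discovery. The sketch below is an illustrative
            preprocessing step, not one of the discovery techniques named above; the helper
            name frequent_variants, the coverage parameter, and the toy log are assumptions
            made for this example.

```python
from collections import Counter


def frequent_variants(traces, coverage=0.8):
    """Return the most frequent trace variants that together cover at
    least `coverage` of all cases (hypothetical helper, not a ProM API)."""
    counts = Counter(tuple(trace) for trace in traces)
    total = sum(counts.values())
    kept, covered = [], 0
    for variant, n in counts.most_common():
        if covered >= coverage * total:
            break  # enough cases are already explained
        kept.append(variant)
        covered += n
    return kept


# Toy log (assumed data): 10 cases, 3 variants.
log = (
    [["a", "b", "c", "d"]] * 6
    + [["a", "c", "b", "d"]] * 3
    + [["a", "d"]] * 1
)
main_variants = frequent_variants(log, coverage=0.8)
```

            On this toy log the two most frequent variants cover 9 of the 10 cases, so the
            rare variant is left out. A model discovered from the filtered log would then
            describe the common behavior only.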
              New process mining approaches should reconsider the representational bias to
            be used. Almost all existing approaches use a graph-based notation that can rep-
            resent models that do not make much sense. WF-nets, BPMN models, EPCs, etc.
            can represent processes that are not sound, e.g., a process having a deadlock or an
            activity that can never be activated. The search space of a technique using such a rep-
            resentational bias is too large. For instance, the α-algorithm can discover WF-nets
            that are not sound and the heuristic miner and the genetic miner can discover C-nets
            that deadlock. Therefore, the representational bias of discovery techniques should
            be refined to only allow for sensible process models. Clearly, this is a challenging
            problem requiring new approaches and representations.
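
              To make the problem concrete, the following sketch explores the reachable
            markings of a small net and reports deadlocks and activities that can never
            fire. It assumes a safe net (at most one token per place) encoded as a plain
            dictionary; the function name explore and the example net are hypothetical.
            It checks only two symptoms of unsoundness, not soundness as a whole.

```python
from collections import deque


def explore(net, initial, final):
    """Breadth-first search over the reachable markings of a safe net.

    net maps a transition name to (input places, output places).
    Returns (deadlock markings, transitions that never fired).
    """
    seen = {initial}
    queue = deque([initial])
    fired = set()
    deadlocks = []
    while queue:
        marking = queue.popleft()
        enabled = [t for t, (pre, _) in net.items() if pre <= marking]
        if not enabled and marking != final:
            deadlocks.append(marking)  # stuck before reaching the end
        for t in enabled:
            pre, post = net[t]
            fired.add(t)
            nxt = frozenset((marking - pre) | post)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return deadlocks, set(net) - fired


# Assumed example: a choice between a and b, followed by a join c
# that needs the results of both.
net = {
    "a": ({"i"}, {"p1"}),
    "b": ({"i"}, {"p2"}),
    "c": ({"p1", "p2"}, {"o"}),
}
deadlocks, dead = explore(net, frozenset({"i"}), frozenset({"o"}))
```

            Here the net gets stuck in the markings {p1} and {p2}, and c can never fire. A
            notation that could not express this combination of choice and join would
            exclude such models from the search space altogether.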
              Another challenge is the notion of concept drift, i.e., processes change while
            being observed. Existing process discovery approaches do not take such changes
            into account. It is interesting to detect when processes change and to visualize such
            changes.
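
              A minimal sketch of how such a change might be detected: compare the
            directly-follows frequencies in two adjacent windows of cases and flag positions
            where they differ strongly. The window size, the threshold, the distance measure
            (total variation distance), and the function names are all assumptions made for
            illustration; the traces are assumed to be ordered by time.

```python
from collections import Counter


def df_profile(traces):
    """Relative frequencies of directly-follows pairs in the traces."""
    pairs = Counter()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    total = sum(pairs.values())
    return {p: n / total for p, n in pairs.items()} if total else {}


def drift_points(traces, window=4, threshold=0.5):
    """Indices i where the `window` traces before i and the `window`
    traces from i onward differ by more than `threshold`."""
    points = []
    for i in range(window, len(traces) - window + 1):
        before = df_profile(traces[i - window:i])
        after = df_profile(traces[i:i + window])
        keys = set(before) | set(after)
        distance = 0.5 * sum(
            abs(before.get(k, 0) - after.get(k, 0)) for k in keys
        )
        if distance > threshold:
            points.append(i)
    return points


# Assumed data: the order of b and c is swapped from case 6 onward.
log = [["a", "b", "c"]] * 6 + [["a", "c", "b"]] * 6
points = drift_points(log, window=4, threshold=0.5)
```

            On this toy log the flagged positions cluster around index 6, where the behavior
            actually changes. Real logs are noisy, so a fixed threshold like this would need
            tuning or a statistical test.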
              Process mining heavily depends on the ability to extract suitable event logs. The
            scope and granularity of an event log should match the questions one would like to
            answer. Unfortunately, in some information systems event data are just a byprod-
            uct for debugging or scattered over many tables. Some systems also “forget” events,
            e.g., when a record is updated, the old values are simply overwritten. Earlier we used
            the term business process provenance to stress the importance of recording events
            in such a way that history is recorded correctly and cannot be tampered with. Event
            logs should be “first-class citizens” rather than some byproduct. Data elements in
            event logs should have clear semantics. Therefore, developers should not simply
            insert write statements without a reference to a commonly agreed-upon ontology.
            We encountered systems where parts of the logging depend on the language setting.
            For example, depending on the language setting of the system, an event attribute
            may have the value “Off” in English, “Uit” in Dutch, or “Aus” in German.
            Semantically, these are all the same. However, such ad-hoc logging makes analysis more
            complex. Attributes of events and cases should refer to one or more ontologies that