
98                                                  4 Getting the Data

            may be interested in the discovery of patient flows, i.e., typical diagnosis and treat-
            ment paths. However, one may also be interested in optimizing the workflow within
            the radiology department. Both questions require different event logs, although
            some events may be shared among the two required event logs. Once an event log
            is created, it is typically filtered. Filtering is an iterative process. Coarse-grained
            scoping was done when extracting the data into an event log. Filtering corresponds
            to fine-grained scoping based on initial analysis results. For example, for process
            discovery one can decide to focus on the 10 most frequent activities to keep the
            model manageable.
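
The filtering step just mentioned can be sketched in a few lines of Python. This is only a minimal illustration, assuming the event log is held in memory as a list of dictionaries with an "activity" key; the function name `filter_top_activities` and the toy events are hypothetical and not part of any particular tool.

```python
from collections import Counter


def filter_top_activities(events, k=10):
    """Keep only events whose activity is among the k most frequent ones."""
    counts = Counter(event["activity"] for event in events)
    keep = {activity for activity, _ in counts.most_common(k)}
    return [event for event in events if event["activity"] in keep]


# Hypothetical toy log; a real log would be extracted from a data source.
events = [
    {"case": 1, "activity": "register request"},
    {"case": 1, "activity": "examine thoroughly"},
    {"case": 1, "activity": "check ticket"},
    {"case": 2, "activity": "register request"},
    {"case": 2, "activity": "check ticket"},
    {"case": 3, "activity": "register request"},
]

# Keep the 2 most frequent activities instead of 10, since the toy log is tiny.
filtered = filter_top_activities(events, k=2)
```

Because filtering is iterative, one would typically rerun such a step with a different `k` (or a different criterion) after inspecting the first discovered model.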
              Based on the filtered log, the different types of process mining described in
            Sect. 1.3 can be applied: discovery, conformance, and enhancement.
              Although Fig. 4.1 does not reflect the iterative nature of the whole process well,
            it should be noted that process mining results most likely trigger new questions and
            these questions may lead to the exploration of new data sources and more detailed
            data extractions. Typically, several iterations of the extraction, filtering, and mining
            phases are needed.



            4.2 Event Logs

            Table 4.1 shows a fragment of the event log already discussed in Chap. 1. This
            table illustrates the typical information present in an event log used for process
            mining. The table shows events related to the handling of requests for compensa-
            tion. We assume that an event log contains data related to a single process, i.e., the
            first coarse-grained scoping step in Fig. 4.1 should make sure that all events can be
            related to this process. Moreover, each event in the log needs to refer to a single pro-
            cess instance, often referred to as case. In Table 4.1, each request corresponds to a
            case, e.g., Case 1. We also assume that events can be related to some activity. In
            Table 4.1, events refer to activities like register request, check ticket, and reject. These
            assumptions are quite natural in the context of process mining. All mainstream pro-
            cess modeling notations, including the ones discussed in Chap. 2, specify a process
            as a collection of activities such that the life-cycle of a single instance is described.
            Hence, the “case id” and “activity” columns in Table 4.1 represent the bare mini-
            mum for process mining. Moreover, events within a case need to be ordered. For
            example, event 35654423 (the execution of activity register request for Case 1) oc-
            curs before event 35654424 (the execution of activity examine thoroughly for the
            same case). Without ordering information, it is of course impossible to discover
            causal dependencies in process models.
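
The bare minimum described above (a case id, an activity, and an ordering of events within each case) can be sketched as follows. This is an illustration under stated assumptions: events are dictionaries, and the event identifier is assumed to increase in the order of occurrence, as it does for the two events of Case 1 quoted above. The function name `build_traces` and the events of case 2 are hypothetical.

```python
from collections import defaultdict


def build_traces(events):
    """Group events by case id and return the ordered activities per case."""
    by_case = defaultdict(list)
    for event in events:
        by_case[event["case_id"]].append(event)
    return {
        case_id: [
            e["activity"]
            for e in sorted(case_events, key=lambda e: e["event_id"])
        ]
        for case_id, case_events in by_case.items()
    }


# Events are deliberately listed out of order to show that ordering matters.
events = [
    {"event_id": 35654424, "case_id": 1, "activity": "examine thoroughly"},
    {"event_id": 35654423, "case_id": 1, "activity": "register request"},
    # The two events below are hypothetical and only serve the example.
    {"event_id": 35654500, "case_id": 2, "activity": "register request"},
    {"event_id": 35654501, "case_id": 2, "activity": "check ticket"},
]

traces = build_traces(events)
```

Each resulting list is the sequence of activities executed for one case; without the ordering key, the two activities of Case 1 could not be placed in sequence.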
              Table 4.1 also shows additional information per event. For example, all events
            have a timestamp (i.e., date and time information such as “30-12-2010:11.02”). This
            information is useful when analyzing performance related properties, e.g., the wait-
            ing time between two activities. The events in Table 4.1 also refer to resources, i.e.,
            the persons executing the activities. Costs are also associated with events. In the context
            of process mining, these properties are referred to as attributes. These attributes
            are similar to the notion of variables in Chap. 3.
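
As a small illustration of how the timestamp attribute supports performance analysis, the sketch below computes the time between two activities of one case. It assumes timestamps in the format of the example above ("30-12-2010:11.02") and approximates the waiting time by the difference between the two event timestamps. The function name `time_between`, the second timestamp, and the resource and cost values are hypothetical.

```python
from datetime import datetime, timedelta

# Day-month-year, colon, hours.minutes, as in "30-12-2010:11.02".
TIMESTAMP_FORMAT = "%d-%m-%Y:%H.%M"


def parse_timestamp(text):
    """Parse a timestamp string such as "30-12-2010:11.02"."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def time_between(case_events, first_activity, second_activity):
    """Time from the first occurrence of first_activity to the next
    occurrence of second_activity in one case, or None if not found."""
    start = None
    for event in sorted(case_events, key=lambda e: e["time"]):
        if start is None and event["activity"] == first_activity:
            start = event["time"]
        elif start is not None and event["activity"] == second_activity:
            return event["time"] - start
    return None


# Events of a single case; attribute values are illustrative only.
case_events = [
    {"activity": "register request", "resource": "A", "cost": 50,
     "time": parse_timestamp("30-12-2010:11.02")},
    {"activity": "examine thoroughly", "resource": "B", "cost": 400,
     "time": parse_timestamp("31-12-2010:10.06")},
]

wait = time_between(case_events, "register request", "examine thoroughly")
total_cost = sum(event["cost"] for event in case_events)
```

Other attributes, such as resource and cost, can be aggregated in the same way, for instance by summing the cost per case as done for `total_cost`.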