
96                                                  4 Getting the Data

            Fig. 4.1 Overview describing the workflow of getting from heterogeneous data sources to process
            mining results


            data. Consider, for example, a full SAP implementation that typically has more than
            10,000 tables. Data may be scattered for technical or organizational reasons. For
            example, there may be legacy systems holding crucial data or information systems
            used only at the departmental level. For cross-organizational process mining, e.g.,
            to analyze supply chains, data may even be scattered over multiple organizations.
            Events can also be captured by tapping message exchanges [107] (e.g., SOAP
            messages) and by recording read and write actions [36]. Data sources may be
            structured and well described by metadata. Unfortunately, in many situations, the data is
            unstructured or important metadata is missing. Data may originate from web pages,