
8.3 Organizational Mining

              In Table 8.3, we abstracted from transaction types, i.e., we did not consider the
            start and completion of an activity instance. Most logs will contain such information.
            For example, Table 8.1 shows the start and completion of each activity instance.
            Some logs will even show when a workitem is offered to a resource or when it is
            assigned. If such events are recorded, then a diagram such as Fig. 8.10 can also show
            detailed time-related information. For example, the utilization and response times of
            resources can be shown.
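
            As a minimal sketch of how such measures could be derived, the following example
            computes a utilization and a mean response time per resource. The event layout, the
            transaction type names (assign, start, complete), the resource name r1, and the two
            definitions used (utilization as busy time divided by the observed time span, response
            time as the delay between assign and start) are assumptions made for illustration.
            The sketch also assumes that a resource works on one work item at a time.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical events: (work item, resource, transaction type, timestamp).
EVENTS = [
    ("w1", "r1", "assign",   "2010-12-30 11:00"),
    ("w1", "r1", "start",    "2010-12-30 11:02"),
    ("w1", "r1", "complete", "2010-12-30 11:12"),
    ("w2", "r1", "assign",   "2010-12-30 11:05"),
    ("w2", "r1", "start",    "2010-12-30 11:20"),
    ("w2", "r1", "complete", "2010-12-30 11:30"),
]


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def resource_stats(events):
    """Return utilization and mean response time (minutes) per resource."""
    # Collect the timestamp of each transaction type per (resource, work item).
    items = defaultdict(dict)
    for item, resource, ttype, ts in events:
        items[(resource, item)][ttype] = parse(ts)

    busy = defaultdict(float)      # minutes between start and complete
    responses = defaultdict(list)  # minutes between assign and start
    seen = defaultdict(list)       # all timestamps observed per resource
    for (resource, _item), t in items.items():
        seen[resource].extend(t.values())
        if "start" in t and "complete" in t:
            busy[resource] += (t["complete"] - t["start"]).total_seconds() / 60
        if "assign" in t and "start" in t:
            responses[resource].append(
                (t["start"] - t["assign"]).total_seconds() / 60
            )

    stats = {}
    for resource, stamps in seen.items():
        # Observed span: first to last event of this resource (an assumption).
        span = (max(stamps) - min(stamps)).total_seconds() / 60
        waits = responses[resource]
        stats[resource] = {
            "utilization": busy[resource] / span if span else 0.0,
            "mean_response_min": sum(waits) / len(waits) if waits else None,
        }
    return stats


stats = resource_stats(EVENTS)
```

            Other definitions are equally defensible, for example measuring utilization against
            scheduled working hours rather than the observed span.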
              Assuming that the event log contains high-quality information, including precise
            timestamps and transaction types, the behavior of resources can be analyzed in
            detail [95]. Of course, privacy issues play an important role here. However, the event
            log can be anonymized prior to analysis. Moreover, in most organizations one would
            like to do such analysis at an aggregate level rather than at the level of individuals.
            For instance, in Sect. 2.1, we mentioned the Yerkes–Dodson law of arousal, which
            describes the relation between workload and performance of people. This law
            hypothesizes that people work faster when the workload increases, up to a point
            beyond which performance degrades again. If the event log
            contains precise timestamps and transaction types, then it is easy to empirically in-
            vestigate this phenomenon. For any activity instance, one knows its duration and
            by scanning the log it is also easy to see what the workload was when the activity
            instance was being performed by some resource. Using supervised learning (e.g.,
            regression analysis or decision tree analysis), the effects of different workloads on
            service and response times can be measured. See [95] for more examples.
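
            The investigation described above can be sketched as follows. This is a simplified
            illustration, not the method of [95]: the activity instances are hypothetical, the
            workload of an instance is assumed to be the number of other instances in progress
            at the moment it starts, and an ordinary least-squares line stands in for the
            regression analysis. A straight line cannot capture a relation that first rises and
            then falls, so a richer model would be needed to test the full hypothesis.

```python
# Hypothetical activity instances as (start, complete), in minutes.
INSTANCES = [(0, 10), (2, 10), (4, 10), (20, 32), (40, 52)]


def workloads(instances):
    """Workload proxy: number of OTHER instances in progress at each start."""
    result = []
    for i, (start, _complete) in enumerate(instances):
        result.append(
            sum(
                1
                for j, (s, c) in enumerate(instances)
                if j != i and s <= start < c
            )
        )
    return result


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two distinct workload values
    return slope, mean_y - slope * mean_x


loads = workloads(INSTANCES)
durations = [complete - start for start, complete in INSTANCES]
slope, intercept = fit_line(loads, durations)
```

            A negative slope on real data would be consistent with people working faster under
            higher workload, but it would not by itself establish a causal effect.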


              Privacy and Anonymization
              Event logs may contain sensitive or private data. Events refer to actions and
              properties of customers, employees, etc. For instance, when applying process
              mining in a hospital it is important to ensure data privacy. It would be
              unacceptable if data about patients were used by unauthorized persons or
              if event data about treatments were used in a way not intended when the
              data were released. The challenge in process mining is to use event logs to
              improve processes and information systems while protecting personally iden-
              tifiable information and not revealing sensitive data. Therefore, most event
              logs contain anonymized attribute values. For example, the name of the cus-
              tomer or employee is often irrelevant for questions that need to be answered.
              To make an attribute anonymous, the original value is mapped onto a new
              value in a deterministic manner. This ensures that one can correlate attributes
              in one event to attributes in another event without knowing the actual values.
              For instance, all occurrences of the name “Wil van der Aalst” are mapped onto
              “Q2T4R5R7X1Y9Z”. The mapping of the original value onto the anonymized
              value should be such that it is hard (or even impossible) to compute the
              inverse of the mapping. Anonymous data can sometimes be de-anonymized
              by combining different data sources. For example, it is often possible to trace
              back an individual based on her birth date and the birth dates of her children.
              Therefore, even “anonymous data” should be handled carefully.
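
            The deterministic mapping described in the box can be sketched with a keyed hash.
            The text does not prescribe a particular mapping; HMAC-SHA256 with a secret key is
            an assumption chosen here because equal values map to equal pseudonyms while the
            inverse is hard to compute without the key. The function names, the attribute names,
            and the pseudonym length of 13 characters are likewise hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value, key, length=13):
    """Map a value onto a pseudonym deterministically using HMAC-SHA256."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncation shortens the pseudonym but raises the chance of collisions.
    return digest[:length].upper()


def anonymize_log(events, attributes, key):
    """Replace the listed attributes in every event by their pseudonyms."""
    return [
        {
            name: pseudonymize(value, key) if name in attributes else value
            for name, value in event.items()
        }
        for event in events
    ]


# Hypothetical log fragment; the key must be kept secret in practice.
KEY = b"replace-with-a-secret-key"
LOG = [
    {"case": "1", "activity": "register request", "resource": "Wil van der Aalst"},
    {"case": "1", "activity": "decide", "resource": "Wil van der Aalst"},
    {"case": "2", "activity": "register request", "resource": "Jane Doe"},
]
anonymized = anonymize_log(LOG, {"resource"}, KEY)
```

            Because the same name always yields the same pseudonym, events can still be
            correlated per resource. As the box notes, this does not prevent de-anonymization
            through other attributes or through combination with other data sources.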