102 4 Getting the Data
Fig. 4.4 Transactional events for five activity instances
Fig. 4.5 Two scenarios involving two activity instances leaving the same footprint in the log
Events can have many attributes. We often refer to an event by its activity name.
Technically, this is not correct: there may be many events that refer to the same
activity name. Within a case, these events may refer to the same activity instance
(e.g., start and complete events) or different activity instances (e.g., in a loop). This
distinction is particularly important when measuring service times, waiting times,
etc. Consider, for example, the scenario in which the same activity is started twice
for the same case, i.e., two activity instances are running in parallel, and then one of
them completes. Did the activity that was started first complete or the second one?
Figure 4.5 illustrates the dilemma. Given the footprint of two starts followed by two
completes of the same activity, there are two possible scenarios. In one scenario,
the durations of the two activity instances are 5 and 6. In the other scenario, the
durations of the activity instances are 9 and 2. Yet they leave the same footprint in
the event log.
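The ambiguity can be made concrete with a minimal sketch. The timestamps below are hypothetical, chosen only so that they reproduce the durations quoted above (5 and 6 versus 9 and 2); the actual values in Fig. 4.5 may differ. The helper name `candidate_pairings` is likewise an illustration, not part of any process mining library.

```python
from itertools import permutations

# Hypothetical timestamps (assumption): two starts followed by two completes
# of the same activity within one case.
starts = [0, 3]
completes = [5, 9]


def candidate_pairings(starts, completes):
    """Return the durations for every way of matching completes to starts.

    Only matchings in which no instance completes before it starts are kept.
    """
    results = []
    for perm in permutations(completes):
        durations = [c - s for s, c in zip(starts, perm)]
        if all(d >= 0 for d in durations):
            results.append(durations)
    return results


scenarios = candidate_pairings(starts, completes)
for durations in scenarios:
    print(durations, "total:", sum(durations))
```

Both matchings are consistent with the same sequence of events. The total time summed over the two instances is identical in either case, so only per-instance measurements such as individual service times are affected by the choice.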
This problem can be addressed by adding information to the log or by using
heuristics. It can be seen as a “secondary correlation problem”, i.e., relating two
events within the same case. The primary correlation problem is to relate events
to cases, i.e., process instances [39]. Figure 4.5 shows that even within one case
there may be the need to correlate events because they belong to the same activity
instance. When implementing systems, such information can easily be added to the
logs; just provide an activity instance attribute to keep track of this. When dealing
with existing systems, this is not as simple as it seems. For example, when correlating
messages between organizations there may be the need to scan the content of the
message to find a suitable identifier (e.g., address or name). It is also possible to
use heuristics to resolve most problems, e.g., in Fig. 4.5 one could just assume
a first-in-first-out order and pick the first scenario. Moreover, one may introduce
timeouts when the time between a start event and a complete event is too long. For