Fig. 3.12 A hidden Markov model with three states: s1, s2, and s3. The arcs have state transition
probabilities as shown, e.g., in state s2 the probability of moving to state s3 is 0.2 and the
probability of moving to state s1 is 0.8. Each visit to a state generates an observation. The
observation probabilities are also given. When visiting s2 the probability of observing b is 0.6 and
the probability of observing c is 0.4. Possible observation sequences are ⟨a,b,c,d⟩, ⟨a,b,b,c⟩,
and ⟨a,b,c,b,b,a,c,e⟩. For the observation sequence ⟨a,b,c,d⟩, it is clear what the hidden
sequence is: ⟨s1,s2,s2,s3⟩. For the other two observation sequences, multiple hidden sequences are
possible.
accurate models are typically large and even for small examples the interpretation
of the states is difficult. Clearly, hidden Markov models are at a lower abstraction
level than the notations discussed in Chap. 2.
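
The mechanics described in the caption of Fig. 3.12 (hidden states, transition probabilities, observation probabilities, and several hidden sequences explaining one observation sequence) can be illustrated with a minimal Python sketch. Only the s2 rows (moving to s1 with 0.8 and to s3 with 0.2, observing b with 0.6 and c with 0.4) are taken from the caption. All other probabilities, the initial distribution, and the function names are assumptions chosen for illustration, so the sketch does not reproduce the figure itself.

```python
from itertools import product

STATES = ("s1", "s2", "s3")

# ASSUMPTION: the model always starts in s1 (not stated in the text).
INITIAL = {"s1": 1.0, "s2": 0.0, "s3": 0.0}

TRANSITION = {
    "s1": {"s2": 0.7, "s3": 0.3},  # ASSUMPTION: illustrative values
    "s2": {"s1": 0.8, "s3": 0.2},  # from the caption of Fig. 3.12
    "s3": {"s1": 1.0},             # ASSUMPTION: illustrative values
}

EMISSION = {
    "s1": {"a": 0.5, "b": 0.5},    # ASSUMPTION: illustrative values
    "s2": {"b": 0.6, "c": 0.4},    # from the caption of Fig. 3.12
    "s3": {"c": 0.5, "d": 0.5},    # ASSUMPTION: illustrative values
}


def joint_probability(path, observations):
    """Probability of visiting `path` while emitting `observations`."""
    prob = INITIAL.get(path[0], 0.0)
    for i, (state, obs) in enumerate(zip(path, observations)):
        if i > 0:
            # Move from the previous hidden state to the current one.
            prob *= TRANSITION[path[i - 1]].get(state, 0.0)
        # Each visit to a state generates one observation.
        prob *= EMISSION[state].get(obs, 0.0)
    return prob


def explanations(observations):
    """All hidden sequences with non-zero probability (brute force)."""
    result = {}
    for path in product(STATES, repeat=len(observations)):
        prob = joint_probability(path, observations)
        if prob > 0:
            result[path] = prob
    return result


def forward_likelihood(observations):
    """Probability of the observation sequence via the forward algorithm."""
    alpha = {
        s: INITIAL.get(s, 0.0) * EMISSION[s].get(observations[0], 0.0)
        for s in STATES
    }
    for obs in observations[1:]:
        alpha = {
            s: EMISSION[s].get(obs, 0.0)
            * sum(alpha[r] * TRANSITION[r].get(s, 0.0) for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())


if __name__ == "__main__":
    for path, prob in explanations(("a", "c")).items():
        print(path, prob)
    print(forward_likelihood(("a", "c")))
```

Under these assumed parameters, the observation sequence ⟨a,c⟩ has two possible hidden sequences, whereas ⟨a,b⟩ has only one. The brute-force enumeration grows exponentially with the length of the sequence and is shown only for clarity; the forward algorithm computes the same total probability in time linear in the sequence length.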
3.6 Quality of Resulting Models
This chapter provided an overview of the mainstream data mining techniques most
relevant for process mining. Although some of these techniques can be exploited for
process mining, they cannot be used for important process mining tasks such as
process discovery, conformance checking, and process enhancement. However, there is
an additional reason for showing a variety of data mining techniques. As in data
mining, it is non-trivial to analyze the quality of process mining results. Here, one can
benefit from experience in the data mining field. Therefore, we discuss some of the
validation and evaluation techniques developed for the algorithms presented in this
chapter. First, we focus on the quality of classification results, e.g., obtained through
a decision tree. Second, we describe general techniques for cross-validation. Here,