Page 201 - Artificial Intelligence for Computational Modeling of the Heart

Chapter 5 Machine learning methods for robust parameter estimation 173

Figure 5.6. Probabilistic on-line personalization phase.

5.3.2 Parameter estimation using Reinforcement Learning
On-line personalization of unseen patients, as illustrated in
Fig. 5.6, can be seen as a two-step procedure. First, we apply a
data-driven initialization step to determine an initial guess for the
parameters to be personalized. Second, we rely on the computed
stochastic policy $\tilde{\pi}^*$ to perform the personalization.

5.3.2.1 Data-driven initialization
To initialize the personalization procedure, we again rely on the
knowledge obtained during exploration, stored in E. First, we look
for episode steps where the resulting model state y is similar to
the current patient’s measurements z. Then we identify the input
parameters x that led to these results, {x ∈ E | f(x) ≈ z}, and
analyze their distribution, which could be multi-modal due to
ambiguities induced by the different training patients, noise in the
data, modeling assumptions, etc. In particular, we cluster the
identified parameter vectors according to their similarity, compute
one representative per cluster, and rank the clusters by size in
descending order. The cluster representatives serve as
initialization candidates.
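
The initialization steps above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the text does not name a similarity measure or a clustering algorithm, so the Euclidean distance, the greedy radius-based clustering, the cluster mean as representative, and all names (`Step`, `initialization_candidates`, `output_tol`, `cluster_radius`) are hypothetical choices, and the episode data are toy values.

```python
import math
from collections import namedtuple

# Hypothetical record of one exploration step stored in E:
# input parameters x and the resulting model output y = f(x).
Step = namedtuple("Step", ["x", "y"])

def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed similarity measure)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def initialization_candidates(episodes, z, output_tol, cluster_radius):
    # 1. Keep the parameters whose model output is close to the measurements: f(x) ≈ z.
    matches = [s.x for s in episodes if dist(s.y, z) <= output_tol]

    # 2. Greedy clustering (an assumption, not the book's method): assign each vector
    #    to the first cluster whose current mean lies within cluster_radius,
    #    otherwise open a new cluster.
    clusters = []
    for x in matches:
        for members in clusters:
            if dist(x, mean(members)) <= cluster_radius:
                members.append(x)
                break
        else:
            clusters.append([x])

    # 3. Rank clusters by size, largest first, and return one representative
    #    (here the cluster mean) per cluster.
    clusters.sort(key=len, reverse=True)
    return [mean(members) for members in clusters]

# Toy exploration data: three parameter vectors near (1, 1), one near (5, 5),
# and one whose output does not match the measurements.
episodes = [
    Step(x=(1.0, 1.0), y=(10.0,)),
    Step(x=(1.2, 0.8), y=(10.1,)),
    Step(x=(5.0, 5.0), y=(9.9,)),
    Step(x=(9.0, 9.0), y=(20.0,)),
    Step(x=(0.8, 1.2), y=(10.05,)),
]
candidates = initialization_candidates(
    episodes, z=(10.0,), output_tol=0.2, cluster_radius=1.0
)
```

In this toy setting two distinct parameter regions reproduce the same measurement, which is the kind of ambiguity the multi-modal distribution reflects; the larger cluster is ranked first and would be tried first as the initial guess.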

5.3.2.2 Probabilistic personalization
As illustrated in Fig. 5.6, starting from the most likely initial
parameters $x_0 \in \{x_0\}$, the forward model $y_0 = f(x_0)$ is run
and the misfit between the model output and the patient’s
measurements, $c_0 = c(y_0, z)$, is computed to derive the first
state $s_0 = \phi(c_0)$. Given $s_0$, the policy is queried to decide
the first action to take, $a_0 = \tilde{\pi}^*(s_0)$.
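
The first personalization step described above (forward run, misfit, state, policy query) might be sketched as below. Everything concrete here is an assumption made for illustration: `forward_model` is a toy stand-in for f, the misfit is taken as a component-wise difference, `phi` quantizes the misfit into integer bins, actions are hypothetical (parameter index, increment) pairs, and `toy_policy` is a hand-written placeholder for the learned stochastic policy, from which the action is sampled.

```python
import random

# Hypothetical action set: (parameter index, increment).
ACTIONS = [(0, +0.5), (0, -0.5), (1, +0.5), (1, -0.5)]

def forward_model(x):
    """Toy stand-in for the computational model f: parameters -> outputs."""
    return (2.0 * x[0] + x[1], x[0] - x[1])

def misfit(y, z):
    """Assumed misfit c(y, z): component-wise difference between output and measurements."""
    return tuple(yi - zi for yi, zi in zip(y, z))

def phi(c, bin_width=1.0):
    """Assumed state mapping: quantize each misfit component into an integer bin."""
    return tuple(round(ci / bin_width) for ci in c)

def toy_policy(state):
    """Placeholder for the learned stochastic policy: returns {action: probability}."""
    # Hypothetical rule: prefer lowering parameter 0 when the first misfit bin
    # is positive, and raising it otherwise.
    preferred = (0, -0.5) if state[0] > 0 else (0, +0.5)
    return {a: (0.7 if a == preferred else 0.1) for a in ACTIONS}

def first_step(x0, z, rng):
    y0 = forward_model(x0)   # y_0 = f(x_0)
    c0 = misfit(y0, z)       # c_0 = c(y_0, z)
    s0 = phi(c0)             # s_0 = phi(c_0)
    probs = toy_policy(s0)   # query the policy for state s_0
    actions = list(probs)
    # Sample a_0 from the action distribution returned by the policy.
    a0 = rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]
    return y0, c0, s0, a0

x0 = (3.0, 1.0)   # most likely initialization candidate (toy value)
z = (5.0, 1.0)    # patient measurements (toy value)
y0, c0, s0, a0 = first_step(x0, z, random.Random(0))
```
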

The process runs through state-action-state sequences to
personalize the computational model f by iteratively updating the
model parameters through MDP actions. A bad initialization could
lead to oscillations between states, as reported in the literature
[385,386]. Such oscillations can be detected by monitoring the
parameter traces for recurring sets of parameter values. If that
happens, the per-
