Page 201 - Artificial Intelligence for Computational Modeling of the Heart

Chapter 5 Machine learning methods for robust parameter estimation  173

                     Figure 5.6. Probabilistic on-line personalization phase.


                     5.3.2 Parameter estimation using Reinforcement Learning
                        On-line personalization of unseen patients, as illustrated in
                     Fig. 5.6, can be seen as a two-step procedure. First, we apply a
                     data-driven initialization step to determine an initial guess for the
                     parameters to be personalized. Second, we rely on the computed
                     stochastic policy $\tilde{\pi}^*$ to perform the personalization.

                     5.3.2.1 Data-driven initialization
                        To initialize the personalization procedure, we rely again on the
                     knowledge obtained during exploration, stored in $E$. First, we look
                     for episode steps where the resulting model state $y = f(x)$ is
                     similar to the current patient's measurements $z$. Then we identify
                     the input parameters $x$ that led to these results,
                     $\{x \in E \mid f(x) \approx z\}$, and analyze their distribution,
                     which could be multi-modal due to ambiguities induced by the
                     different training patients, noise in the data, modeling
                     assumptions, etc. In particular, we cluster the identified
                     parameter vectors according to their similarity, compute one
                     representative per cluster, and rank the clusters based on their
                     size in descending order. The cluster representatives serve as
                     initialization candidates.
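
                     The initialization steps above can be sketched in a few lines of
                     Python. This is only an illustrative sketch under stated
                     assumptions: the exploration set is taken to be a list of (x, y)
                     pairs, similarity is measured with the Euclidean distance against
                     the hypothetical thresholds `tol` and `radius`, the clustering is
                     a simple greedy scheme, and the representative is the cluster
                     mean. The text does not specify any of these choices.

```python
import math


def initialization_candidates(episodes, z, tol, radius):
    """Return cluster representatives, largest cluster first.

    episodes: list of (x, y) pairs, x = parameter vector, y = f(x)
    z:        measurement vector of the current patient
    tol:      assumed threshold deciding whether f(x) is "similar" to z
    radius:   assumed threshold deciding whether two x belong together
    """
    # Step 1: keep the parameters whose model output is close to z.
    matches = [x for x, y in episodes if math.dist(y, z) <= tol]

    # Step 2: greedy clustering (an assumption, not the book's method).
    # Each x joins the first cluster whose seed lies within `radius`.
    clusters = []
    for x in matches:
        for cluster in clusters:
            if math.dist(x, cluster[0]) <= radius:
                cluster.append(x)
                break
        else:
            clusters.append([x])

    # Step 3: rank the clusters by size, in descending order.
    clusters.sort(key=len, reverse=True)

    # Step 4: one representative per cluster (component-wise mean).
    return [
        tuple(sum(column) / len(cluster) for column in zip(*cluster))
        for cluster in clusters
    ]
```

                     The first element of the returned list plays the role of the most
                     likely initial parameters; the remaining elements are fallback
                     candidates.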

                     5.3.2.2 Probabilistic personalization
                        As illustrated in Fig. 5.6, from the most likely initial
                     parameters $x_0 \in \{x_0\}$, the forward model $y_0 = f(x_0)$ is
                     run and the misfit between the model output and the patient's
                     measurements, $c_0 = c(y_0, z)$, is computed to derive the first
                     state $s_0 = \phi(c_0)$. Given $s_0$, the policy is queried to
                     decide the first action to take, $a_0 = \tilde{\pi}^*(s_0)$.

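                     The step just described (forward run, misfit, state, action) can
                     be sketched as a small loop. Everything concrete here is a
                     hypothetical stand-in chosen for illustration: the one-parameter
                     model `f`, the signed misfit, the three-valued state mapping
                     `phi`, the table-based stochastic policy, the additive parameter
                     update, and the use of state 0 as a stopping condition are
                     assumptions, not the chapter's actual definitions.

```python
import random


def f(x):
    """Toy forward model (assumption): one parameter, one output."""
    return 2.0 * x


def misfit(y, z):
    """Toy signed misfit c(y, z) between model output and measurement."""
    return y - z


def phi(c, tol=0.25):
    """Toy state mapping: 0 = within tolerance, +1 = too high, -1 = too low."""
    if abs(c) <= tol:
        return 0
    return 1 if c > 0 else -1


# Toy stochastic policy: state -> (candidate actions, probabilities).
# An action is an increment applied to the parameter.
POLICY = {
    1: ([-0.1, +0.1], [0.9, 0.1]),
    -1: ([+0.1, -0.1], [0.9, 0.1]),
}


def personalize(f, misfit, phi, policy, x0, z, max_steps=50, seed=0):
    """Run state-action-state steps from x0; return (x, trace, converged)."""
    rng = random.Random(seed)
    x, trace = x0, [x0]
    for _ in range(max_steps):
        c = misfit(f(x), z)   # c_t = c(y_t, z) with y_t = f(x_t)
        s = phi(c)            # s_t = phi(c_t)
        if s == 0:            # assumed success state: stop here
            return x, trace, True
        actions, probs = policy[s]
        a = rng.choices(actions, weights=probs, k=1)[0]  # sample a_t
        x = x + a             # the action updates the parameter
        trace.append(x)       # keep the trace for later inspection
    return x, trace, False


x_hat, trace, converged = personalize(f, misfit, phi, POLICY, x0=0.0, z=1.0)
```

                     Keeping the full parameter trace is a deliberate choice in this
                     sketch, since it allows the run to be inspected after the fact.
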
                     The process runs through state-action-state sequences to
                     personalize the computational model $f$ by iteratively updating
                     the model parameters through MDP actions. A poor initialization
                     could lead to oscillations between states, as reported in the
                     literature [385,386]. Such oscillations can be detected by
                     monitoring the parameter traces for recurring sets of parameter
                     values. If that happens, the per-