
Chapter 5 Machine learning methods for robust parameter estimation

Figure 5.4. Measured and computed ECG traces for one representative case (estimation errors of 1.6 ms for QRS duration and 0.5° for electrical axis).

Figure 5.5. Framework overview: self-taught artificial model personalization agent.


Unlike supervised learning, where the objective is to compute a direct mapping from a given input to a prediction, RL aims to learn how to perform tasks. This is achieved by computing an optimal strategy, called a “policy”, to solve a given problem. A policy describes a mapping from states, i.e., the current “situation” the agent finds itself in, to actions, which allow the agent to interact with the environment. Rewards allow the agent to judge the outcome of its actions.
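To make these ingredients concrete, the sketch below shows a generic tabular Q-learning agent: a policy maps states to actions, and rewards drive the value update. This is our illustration of the standard technique, not the book's implementation; the action names are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ["increase_param", "decrease_param"]   # hypothetical action set

q_table = defaultdict(float)             # Q(s, a) values, default 0.0
alpha, gamma, epsilon = 0.1, 0.9, 0.2    # learning rate, discount, exploration rate

def policy(state):
    """Map the current state to an action (epsilon-greedy)."""
    if random.random() < epsilon:        # explore occasionally
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])  # exploit best value

def update(state, action, reward, next_state):
    """One Q-learning step: nudge Q(s, a) toward reward + discounted future value."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
```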
The past few years have seen tremendous breakthroughs in RL for complex, real-world problems, with applications to medical imaging [383]. One instance is presented in chapter 3, where an RL agent learns how to detect landmarks in 3D medical images while being faster, more accurate, and more generic than alternative algorithms.
Motivated by these recent successes, we proposed an RL-based personalization approach called Vito [384], a class of artificial agents that learn by themselves how to estimate model parameters from clinical data while being model-independent. First, in an off-line, data-driven exploration phase, Vito assimilates the behavior of physiological models (Fig. 5.5). Based on the gathered knowledge, Vito then learns an optimal personalization strategy using RL. Once the off-line phase has converged, given a new, unseen dataset, Vito sequentially chooses actions that maximize future rewards, i.e., actions that bring the agent into a state that represents the solution of the personalization problem.
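To make the on-line phase concrete, the following sketch loops policy-selected parameter updates until the computed observations match the measured ones within a tolerance. It is our illustration, not the implementation of [384]; forward_model, learned_policy, encode_state, and apply_action are hypothetical callables.

```python
import numpy as np

def personalize(measured, params, learned_policy, forward_model,
                encode_state, apply_action, tol=1e-2, max_steps=100):
    """Apply policy-selected actions until the model reproduces the measurements."""
    for _ in range(max_steps):
        computed = forward_model(params)                 # run the physiological model
        misfit = np.asarray(measured) - np.asarray(computed)
        if np.max(np.abs(misfit)) < tol:                 # objectives matched: converged
            return params
        state = encode_state(misfit)                     # state derived from the misfit
        action = learned_policy(state)                   # action maximizing expected reward
        params = apply_action(params, action)            # perturb the model parameters
    return params                                        # best effort after max_steps
```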
To set up the algorithm, the user has to define the observations to be matched,