Page 142 - Artificial Intelligence for Computational Modeling of the Heart

114  Chapter 3 Learning cardiac anatomy




                                         tation of all these tracking methods comes from the fact that the
                                         kernels or representations are ‘engineered’ and may not capture
                                         sufficiently deep information from the images.
                                            In recent years, significant attention has been focused on the
                                         development of deep learning based tracking models. Compared
                                         with conventional methods, deep neural network models can ex-
                                         tract more informative features and have shown superior perfor-
                                         mance in various applications. Based on the network topology
                                         and methodology types, we organize them into three different cat-
                                         egories.
                                            Tracking with Convolutional Neural Networks (CNNs). To
                                         address the aforementioned limitations of the ‘engineered’
                                         representations of target objects, leveraging high-level features
                                         from CNNs serves as a natural remedy. The Siamese network [286] is
                                         one of the most commonly used architectures for similarity-based
                                         tracking. It passes two different inputs through the same network,
                                         with shared weights, and provides a similarity score based on the
                                         extracted features. One of the early works is due to Bertinetto et al.
                                         [287], who propose a fully convolutional Siamese network to find
                                         a target object in consecutive frames with a region-wise similarity
                                         measure. Similar strategies have been widely developed, including
                                         the GOTURN tracker with box regression on targets [288], the DSiam
                                         tracker with online Siamese network updating [289], variants of
                                         CFNET with add-on correlation filters [290,291], and different
                                         variants of SiamRPN with region proposals after feature extraction
                                         [292–294]. Besides similarity-learning-based Siamese networks,
                                         different models considering domain and appearance changes
                                         have been studied. Nam et al. [295] proposed MDNet, which learns
                                         a domain-independent representation that encodes the moving
                                         object and uses it for detection in subsequent frames. CREST
                                         [296] represents the discriminative correlation filter (DCF) [297,
                                         298] as a convolution and applies residual learning to accommodate
                                         appearance changes. Zhu et al. [299] also took optical flow
                                         information into account and proposed a correlation tracking
                                         model with spatial-temporal attention. The application of such
                                         models in cardiac imaging is under development. Recently,
                                         Parajuli et al. [300] applied a flow network to left ventricle motion
                                         analysis, where the motion is modeled as flow through graphs and
                                         the similarities between graph nodes are learned by a Siamese
                                         network.
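
The Siamese similarity idea described above (one shared embedding applied to two inputs, followed by a region-wise similarity score) can be illustrated with a minimal NumPy sketch. This is not the implementation of any of the cited trackers. The function names `embed`, `similarity_map` and `locate` are hypothetical, the filter bank is random and untrained (a stand-in for learned CNN weights), and the cosine normalization is a simplification chosen here so that the toy example has a well-defined maximum.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def embed(image, filters):
    """Shared embedding: valid 2D correlation with a filter bank, then ReLU.

    image:   (H, W) array
    filters: (C, k, k) array, a stand-in for learned CNN weights
    returns: (C, H-k+1, W-k+1) feature map
    """
    k = filters.shape[-1]
    windows = sliding_window_view(image, (k, k))  # (H-k+1, W-k+1, k, k)
    features = np.einsum("ijuv,cuv->cij", windows, filters)
    return np.maximum(features, 0.0)


def similarity_map(exemplar, search, filters, eps=1e-12):
    """Cosine similarity between exemplar features and each search window."""
    fz = embed(exemplar, filters)  # same weights for both inputs
    fx = embed(search, filters)
    _, h, w = fz.shape
    # All (h, w) windows of the search features: (C, H', W', h, w)
    windows = sliding_window_view(fx, (h, w), axis=(1, 2))
    dots = np.einsum("cijuv,cuv->ij", windows, fz)
    window_norms = np.sqrt(np.einsum("cijuv,cijuv->ij", windows, windows))
    return dots / (np.linalg.norm(fz) * window_norms + eps)


def locate(exemplar, search, filters):
    """Return the (row, col) offset with the highest similarity score."""
    scores = similarity_map(exemplar, search, filters)
    best = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(v) for v in best)


# Toy usage: plant the exemplar patch in an empty search image and find it.
rng = np.random.default_rng(0)
filters = rng.standard_normal((4, 3, 3))  # random, untrained weights (assumption)
exemplar = rng.random((8, 8))
search = np.zeros((32, 32))
search[10:18, 15:23] = exemplar
print(locate(exemplar, search, filters))
```

In a trained tracker the embedding would be a deep CNN learned from data, and the score map would typically be computed per frame around the previous target position; the sketch only shows the weight sharing and the sliding similarity computation.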
                                            Tracking with Recurrent Neural Networks. Recurrent neural
                                         networks encode temporal state information well and are thus
                                         effective for sequential data. Cui et al. [301]
                                         proposed a Recurrently Target-attending Tracking (RTT) model.
                                         It estimates a confidence map for object motion using a multi-