Page 142 - Artificial Intelligence for Computational Modeling of the Heart


114 Chapter 3 Learning cardiac anatomy

tation of all these tracking methods comes from the fact that the
kernels or representations are ‘engineered’ and may not capture
the deeper structure of the images.
In recent years, significant attention has been focused on the
development of deep-learning-based tracking models. Compared
with conventional methods, deep neural network models can extract
more informative features and have shown superior performance
in various applications. Based on network topology and
methodology, we organize them into three categories.
Tracking with Convolutional Neural Networks (CNNs). To address
the aforementioned limitations of ‘engineered’ representations
of target objects, leveraging high-level features from
CNNs serves as a natural remedy. The Siamese network [286] is one of
the most commonly used architectures for similarity-based tracking.
It processes two different inputs through the same network
computations and provides a similarity score based on the extracted
features. One of the early works is due to Bertinetto et al.
[287], who proposed a fully convolutional Siamese network to find
a target object in consecutive frames with a region-wise similarity
measure. Similar strategies have been widely developed, including
the GOTURN tracker with box regression on targets [288], the DSiam
tracker with online Siamese network updating [289], variants of
CFNet with add-on correlation filters [290,291], and different variants
of SiamRPN with region proposals after feature extraction
[292–294]. Besides similarity-learning-based Siamese networks,
different models considering domain and appearance changes
have been studied. Nam et al. [295] proposed MDNet, which learns
a domain-independent representation that encodes the moving
object and uses it for detection in the next frames. CREST
[296] represents the discriminative correlation filter (DCF) [297,
298] as a convolution and applies residual learning to accommodate
appearance changes. Zhu et al. [299] also took optical flow
information into account and proposed a model for correlation
tracking with spatial-temporal attention. The application of such
models in cardiac imaging is under development. Recently, Parajuli
et al. [300] applied a flow network to left ventricle motion
analysis, where the motion is modeled as flow through graphs and
the similarities between graph nodes are learned by a Siamese
network.
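The similarity-based search described above for Siamese trackers can be illustrated with a small sketch. It is not the implementation of any cited tracker: the shared branch is a single random convolution followed by a ReLU (a toy stand-in for a trained CNN), the names `correlate_valid`, `embed` and `similarity_map` are hypothetical, and the scores are cosine-normalized, an assumption made here so that an exact match scores highest. The template and the search region pass through the same weights, and the template features are then slid over the search features to produce a region-wise similarity map.

```python
import numpy as np


def correlate_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def embed(image, weights):
    """Shared branch: one convolution plus ReLU (toy stand-in for a CNN)."""
    return np.maximum(correlate_valid(image, weights), 0.0)


def similarity_map(template, search, weights, eps=1e-12):
    """Cosine similarity between template features and every search window."""
    z = embed(template, weights)  # template features
    x = embed(search, weights)    # search features, same weights
    raw = correlate_valid(x, z)   # sliding inner product
    # Norm of each search-feature window, for the cosine normalization.
    window_norm = np.sqrt(correlate_valid(x ** 2, np.ones_like(z)))
    return raw / (window_norm * np.linalg.norm(z) + eps)


# Synthetic data: the template is a crop of the search image, so the
# correct displacement is known in advance.
rng = np.random.default_rng(0)
weights = rng.standard_normal((3, 3))
search = rng.standard_normal((32, 32))
row, col = 9, 14
template = search[row:row + 12, col:col + 12]

scores = similarity_map(template, search, weights)
peak = np.unravel_index(np.argmax(scores), scores.shape)
print("estimated offset:", peak, "true offset:", (row, col))
```

Because the branch is fully convolutional, the features of the crop coincide with one window of the search features, so the peak of the map should fall at the crop's offset. In a trained tracker the weights would be learned from image pairs, and the new target position would be read off the peak in the same way.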
Tracking with Recurrent Neural Networks. Recurrent neural
networks encode temporal state information well and are thus
effective for sequential data. Cui et al. [301]
proposed a Recurrently Target-attending Tracking (RTT) model.
It estimates a confidence map for object motion using a multi-
