Page 48 - Handbook of Deep Learning in Biomedical Engineering Techniques and Applications
36 Chapter 2 Deep convolutional neural network in medical image processing
analysis. One of them is multiscale image analysis, and another is
2.5D classification.
Context is often an important cue when detecting abnormalities.
One way to increase context is to feed larger patches to the
network, but doing so can increase the memory requirement or
even the number of parameters of the network. Consequently,
architectures have been studied in which context is added in a
downscaled representation alongside the high-resolution local
information. The multistream multiscale architecture was first
studied by Farabet et al. [51], who used it for segmentation in
natural images. This architecture has also been used in several
medical applications [50,52,53].
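The trade-off described above can be sketched in a few lines of
plain Python. This is only an illustration under stated
assumptions, not the method of any cited paper: the helper names
(crop, downscale, multiscale_streams), the patch size of 8, the
scale factor of 4, and the use of average pooling for downscaling
are all hypothetical choices made for the example.

```python
def crop(image, row, col, size):
    """Return a size x size patch around (row, col); assumes it fits inside the image."""
    half = size // 2
    return [r[col - half: col - half + size]
            for r in image[row - half: row - half + size]]


def downscale(patch, factor):
    """Shrink a square patch by averaging each factor x factor block (assumed scheme)."""
    n = len(patch) // factor
    return [
        [
            sum(patch[i * factor + di][j * factor + dj]
                for di in range(factor) for dj in range(factor)) / factor ** 2
            for j in range(n)
        ]
        for i in range(n)
    ]


def multiscale_streams(image, row, col, size=8, factor=4):
    """Build two equally sized inputs for a hypothetical two-stream network."""
    # Stream 1: small field of view at full resolution (local detail).
    local = crop(image, row, col, size)
    # Stream 2: field of view `factor` times wider, downscaled to the same size (context).
    context = downscale(crop(image, row, col, size * factor), factor)
    return local, context


# Toy 64 x 64 image whose pixel value encodes its position.
image = [[float(r * 64 + c) for c in range(64)] for r in range(64)]
local, context = multiscale_streams(image, 32, 32)

# Each stream holds 8 * 8 = 64 values, while the raw 32 x 32 context
# patch would hold 1024 values.
values_per_stream = len(local) * len(local[0])
```

Both streams have the same shape, so the wider context is obtained
without enlarging the input that each stream has to process.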
In early applications of CNNs to large volumetric data, full 3D
convolutions and the resulting large number of parameters were
avoided by dividing the volume of interest into slices that are
fed as separate streams to a network. Prasoon et al. [54] were the
first to implement this concept, for the segmentation of knee
cartilage. Likewise, the network can be fed with several angled
patches from 3D space in a multistream fashion. This concept has
been applied by different authors in the context of medical image
analysis [55,56], and these approaches are also known as 2.5D
classification.
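As a minimal sketch of the slicing idea, the function below
extracts three 2D patches through one voxel of a 3D volume. The
choice of three orthogonal planes, the [z][y][x] indexing, and the
name orthogonal_patches are assumptions made for illustration; the
cited works may select and orient their slices differently.

```python
def orthogonal_patches(volume, z, y, x, size):
    """Return three size x size patches through voxel (z, y, x).

    Assumes volume[z][y][x] indexing and that all patches fit inside the volume.
    """
    half = size // 2
    zs = range(z - half, z - half + size)
    ys = range(y - half, y - half + size)
    xs = range(x - half, x - half + size)
    xy_plane = [[volume[z][j][i] for i in xs] for j in ys]  # z held fixed
    zx_plane = [[volume[k][y][i] for i in xs] for k in zs]  # y held fixed
    yz_plane = [[volume[k][j][x] for j in ys] for k in zs]  # x held fixed
    return xy_plane, zx_plane, yz_plane


# Toy 9 x 9 x 9 volume whose voxel value encodes its (z, y, x) position.
volume = [[[100 * z + 10 * y + x for x in range(9)] for y in range(9)]
          for z in range(9)]
xy_plane, zx_plane, yz_plane = orthogonal_patches(volume, 4, 4, 4, size=5)

# Three 5 x 5 patches hold 75 values; the full 5 x 5 x 5 cube holds 125.
values_2p5d = 3 * 5 * 5
values_3d = 5 ** 3
```

Each patch would be fed to its own stream, so every stream uses 2D
convolutions while the combination still sees some 3D context.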
3.1.3 Segmentation architectures
A common task in medical as well as natural image processing
is segmentation. To handle this task, a CNN can be used to
classify each pixel in the image individually by presenting it
with a patch extracted around that pixel. The major drawback of
this so-called “sliding-window” approach is that the input patches
of neighboring pixels overlap heavily, so the same convolutions
are evaluated many times. Fortunately, the convolution and the dot
product are both linear operators, so inner products can be
written as convolutions and vice versa. By representing the fully
connected layers as convolutions, the CNN can accept input images
larger than those it was trained on and generate a likelihood map
instead of an output for a single pixel. The resulting network can
be applied to an entire image efficiently.
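The equivalence between a fully connected layer and a convolution
can be checked on a toy case. The sketch below is deliberately
reduced and rests on assumptions: the "network" is a single fully
connected unit acting directly on a 3 x 3 patch, with no preceding
convolutional layers, and the function names, weights, and image
values are made up for illustration.

```python
def dense_on_patch(patch, weights, bias):
    """Fully connected unit: inner product of the flattened patch and weights."""
    flat_patch = [v for row in patch for v in row]
    flat_weights = [w for row in weights for w in row]
    return sum(p * w for p, w in zip(flat_patch, flat_weights)) + bias


def dense_as_convolution(image, weights, bias):
    """Slide the same weights over the image as a k x k kernel.

    Uses cross-correlation without padding, the usual CNN "convolution".
    """
    k = len(weights)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * weights[i][j]
                for i in range(k) for j in range(k)) + bias
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical weights "trained" on 3 x 3 patches, and a larger 5 x 6 image.
weights = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
bias = 1
image = [[(r * 7 + c * 3) % 5 for c in range(6)] for r in range(5)]

# One pass over the whole image yields a 3 x 4 map of scores.
score_map = dense_as_convolution(image, weights, bias)

# Sliding-window reference: crop every patch and apply the dense unit to it.
sliding = [
    [dense_on_patch([row[c:c + 3] for row in image[r:r + 3]], weights, bias)
     for c in range(4)]
    for r in range(3)
]
matches = score_map == sliding
```

In this toy case both routes give identical scores. In a deeper
network the convolutional form also shares the intermediate
feature maps between neighboring pixels, which is where the
efficiency gain over the sliding-window approach comes from.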
Because of the pooling layers, however, the resulting output can
have a far lower resolution than the input. The method proposed
to prevent this loss of resolution is