Page 47 - Handbook of Deep Learning in Biomedical Engineering Techniques and Applications

Chapter 2 Deep convolutional neural network in medical image processing  35




   After 2012, the study of innovative models took off, and in recent years interest has shifted toward deeper architectures. Instead of using a single layer of kernels with a large receptive field, a similar function can be represented with fewer parameters by stacking smaller kernels. These deeper designs usually have a lower memory footprint during inference, which is helpful when deploying on mobile smart computing devices. Simonyan and Zisserman [43] were among the first to study much deeper networks, employing small, fixed-size kernels in each layer. In 2014, their 19-layer model achieved top results in the ImageNet challenge and is generally referred to as OxfordNet or VGG19.
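The parameter saving from stacking smaller kernels can be made concrete with a quick count. The sketch below is illustrative only (the channel count C = 64 and the no-bias simplification are assumptions, not values from the text): two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution, yet use fewer weights.

```python
# Illustrative parameter count: stacked small kernels vs. one large kernel.
# Assumptions (not from the text): C input and C output channels at every
# layer, and no bias terms.

def conv_params(k, c_in, c_out):
    """Number of weights in one k x k convolution layer."""
    return k * k * c_in * c_out

C = 64
single_5x5 = conv_params(5, C, C)       # one layer with a 5x5 kernel
stacked_3x3 = 2 * conv_params(3, C, C)  # two stacked layers, 3x3 kernels each

print(single_5x5)   # 102400
print(stacked_3x3)  # 73728 -> ~28% fewer parameters, same 5x5 receptive field
```

The stacked variant also interleaves an extra nonlinearity between the two 3x3 layers, which is part of why deeper stacks of small kernels became the dominant design.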
   More complex blocks have been built on top of these deeper networks to improve training performance and to reduce the number of parameters. The authors of Ref. [44] designed a 22-layer architecture named GoogLeNet, also known as Inception, which used the so-called inception blocks [45]. He et al. [46] introduced the ResNet architecture, which won the ImageNet challenge in 2015 and is composed of ResNet blocks. The main purpose of a residual block is to learn a residual rather than the full function, which preconditions each layer toward learning a mapping close to the identity. In this way, even deeper architectures can be trained effectively.
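The identity-preconditioning idea can be sketched in a few lines. This is a minimal toy version of a residual connection, not the exact block from [46] (the layer sizes, weights, and two-layer residual branch are assumptions for illustration): the block computes y = relu(F(x) + x), so when the learned residual F(x) is near zero the block behaves like the identity.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual block: y = relu(F(x) + x), with F a small
    two-layer transform (hypothetical shapes, for illustration)."""
    f = relu(x @ w1) @ w2  # residual branch F(x)
    return relu(f + x)     # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))

# With zero weights the residual branch outputs F(x) = 0, so the block
# reduces to relu(x): the layer starts out close to the identity mapping.
y = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
print(np.allclose(y, relu(x)))  # True
```

Because each layer only needs to learn a small correction on top of the identity, gradients propagate through the skip connections and very deep stacks remain trainable.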
   Since 2014, performance on the ImageNet benchmark has saturated, and it is now hard to assess whether small gains in accuracy can be attributed to more complex and better models. The lower memory footprint these models provide is typically not so important for medical imaging applications. AlexNet and other simple architectures such as VGG are still popular for clinical data, although the version of GoogLeNet called Inception v3 has recently been used by many researchers [47-49].


               3.1.2 Multistream architectures
   The traditional CNN model can accommodate several sources or representations of the input without difficulty, in the form of channels presented to the input layer. This basic idea can be extended further: channels can be merged at any point in the network. Based on the idea that different tasks may call for different ways of merging, several multistream models have been proposed. These architectures, sometimes also termed dual-pathway models [50], have two major applications that are appropriate for medical image