• Consider a convolutional layer that is composed entirely of 3×3
filters. The total number of parameters in this layer is:
(number of input channels) × (number of filters) × (3 × 3).
• We can decrease the number of input channels to the 3×3 filters
using squeeze layers, mentioned in Section 2 (see the sketch
below).
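As a rough illustration (the channel counts below are hypothetical, not taken from the text), a few lines of Python show how much a 1×1 squeeze layer can shrink the parameter count of the 3×3 filters that follow it:

```python
def conv3x3_params(in_channels: int, num_filters: int) -> int:
    """Weight count of a 3x3 convolutional layer (biases ignored)."""
    return in_channels * num_filters * 3 * 3

# Without a squeeze layer: 64 input channels feed 64 3x3 filters.
print(conv3x3_params(64, 64))  # 36864

# A 1x1 squeeze layer first reduces the input to 16 channels, so
# the same 64 3x3 filters need a quarter of the parameters.
print(conv3x3_params(16, 64))  # 9216
```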
Strategy 3. Downsample late in the network so that convolution
layers have large activation maps
• The intuition is that large activation maps (due to delayed
downsampling) can lead to higher classification accuracy; a toy
comparison follows this list.
• Strategies 1 and 2 are about carefully reducing the number of
parameters in the CNN while trying to preserve accuracy.
• Strategy 3 is about maximizing accuracy on a restricted budget
of parameters.
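The effect of pooling placement on activation-map size is easy to see in a minimal PyTorch sketch (the layer sizes here are illustrative, not from the text):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 224, 224)  # a typical ImageNet-sized input
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)

# Early downsampling: pooling comes first, so the convolution
# operates on 112x112 activation maps.
print(conv(pool(x)).shape)  # torch.Size([1, 8, 112, 112])

# Late downsampling: the convolution operates on full 224x224
# maps; pooling is deferred to later in the network.
print(conv(x).shape)        # torch.Size([1, 8, 224, 224])
```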
2. Fire module
• A Fire module is composed of a squeeze convolutional layer
(which has only 1×1 filters), feeding into an expand layer that
has a mix of 1×1 and 3×3 convolutional filters.
• There are three tunable dimensions (hyperparameters) in a Fire
module: s1×1, e1×1, and e3×3.
• s1×1: the number of 1×1 filters in the squeeze layer.
• e1×1 and e3×3: the numbers of 1×1 and 3×3 filters in the
expand layer.
• When we use Fire modules, we set s1×1 to be less than
(e1×1 + e3×3), so the squeeze layer helps to limit the number of
input channels to the 3×3 filters, as per strategy 2 in the
previous section; a sketch of the module follows this item.
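A minimal PyTorch sketch of a Fire module as described above (the ReLU activations and channel-wise concatenation follow common SqueezeNet implementations; the class and argument names are our own):

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """Fire module: a 1x1 squeeze layer feeding an expand layer
    that mixes 1x1 and 3x3 filters."""

    def __init__(self, in_channels: int, s1x1: int, e1x1: int, e3x3: int):
        super().__init__()
        # Keep s1x1 < e1x1 + e3x3 so that only a few channels
        # reach the 3x3 filters (strategy 2).
        self.squeeze = nn.Conv2d(in_channels, s1x1, kernel_size=1)
        self.expand1x1 = nn.Conv2d(s1x1, e1x1, kernel_size=1)
        self.expand3x3 = nn.Conv2d(s1x1, e3x3, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.relu(self.squeeze(x))
        # Concatenate the two expand branches along the channel
        # axis, so the module emits e1x1 + e3x3 output channels.
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)
```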
• The number of filters per Fire module is gradually increased
from the beginning to the end of the network.
• SqueezeNet (left): Begins with a standalone convolution layer
(conv1), followed by eight Fire modules (fire2–9), ending with a
final convolutional layer (conv10).
• Max-pooling with a stride of 2 is performed after layers conv1,
fire4, fire8, and conv10; a layer-by-layer sketch follows this
item.
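Assembling the pieces, here is a minimal sketch of the plain SqueezeNet stack in PyTorch, reusing the Fire class sketched above (the filter counts follow the original SqueezeNet paper rather than this text; dropout and weight initialization are omitted for brevity):

```python
import torch.nn as nn

def squeezenet(num_classes: int = 1000) -> nn.Sequential:
    # Pooling is placed relatively late, per strategy 3, and the
    # filter counts per Fire module grow toward the end of the net.
    return nn.Sequential(
        nn.Conv2d(3, 96, kernel_size=7, stride=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        Fire(96, 16, 64, 64),     # fire2
        Fire(128, 16, 64, 64),    # fire3
        Fire(128, 32, 128, 128),  # fire4
        nn.MaxPool2d(kernel_size=3, stride=2),
        Fire(256, 32, 128, 128),  # fire5
        Fire(256, 48, 192, 192),  # fire6
        Fire(384, 48, 192, 192),  # fire7
        Fire(384, 64, 256, 256),  # fire8
        nn.MaxPool2d(kernel_size=3, stride=2),
        Fire(512, 64, 256, 256),  # fire9
        nn.Conv2d(512, num_classes, kernel_size=1),  # conv10
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )
```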
• SqueezeNet with simple bypass (middle) and SqueezeNet with
complex bypass (right): The use of bypass is inspired by the
architecture of ResNet; a bypass wrapper is sketched below.
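For the simple-bypass variant, a small wrapper around the Fire class above is enough (in the paper, such bypasses are placed around fire3, fire5, fire7, and fire9, where the input and output channel counts already match; the wrapper name is our own):

```python
import torch
import torch.nn as nn

class FireWithBypass(nn.Module):
    """Simple bypass: add the module input to the Fire output,
    as in ResNet. Valid only where the Fire module's input and
    output channel counts match."""

    def __init__(self, fire: nn.Module):
        super().__init__()
        self.fire = fire

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fire(x) + x
```

The complex-bypass variant instead places a 1×1 convolution on the bypass path, so it can also connect Fire modules whose input and output channel counts differ.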
• With the use of Fire modules, the model size can be reduced
while maintaining prediction accuracy (Figs. 1.3–1.6).
• With the architecture of SqueezeNet, we achieve a 50× reduction
in model size compared with AlexNet, while meeting or exceeding
the top-1 and top-5 accuracy of AlexNet.