5  Deep learning based methods

Since 2017, a number of deep learning methods have been described for retinal layer prediction in OCT. Two major approaches to this task can be distinguished. The first treats every location in the OCT as a prediction target and assigns each pixel directly to a retinal layer class; this is the canonical segmentation task. The second approach identifies the boundaries between layers without identifying the layer classes themselves. These methods need an additional step to, on the one hand, extract the actual boundary from a probability map and, on the other hand, identify the classes of the layers separated by it.
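As an illustration of that extra step, the sketch below converts column-wise boundary probability maps into a layer label map by assigning each pixel to the region between consecutive boundaries. The array shapes and the function name are illustrative assumptions, not taken from any of the published methods.

```python
import numpy as np

def boundaries_to_labels(boundary_probs: np.ndarray) -> np.ndarray:
    """Convert boundary probability maps into a per-pixel layer label map.

    boundary_probs: array of shape (num_boundaries, H, W), one map per layer
    boundary, ordered from the innermost to the outermost boundary.
    Returns an (H, W) integer label map with num_boundaries + 1 regions.
    """
    num_boundaries, height, width = boundary_probs.shape

    # For every A-scan (column), take the most probable row as the boundary.
    boundary_rows = boundary_probs.argmax(axis=1)               # (num_boundaries, W)

    # Enforce anatomical ordering: each boundary lies at or below the previous one.
    boundary_rows = np.maximum.accumulate(boundary_rows, axis=0)

    # A pixel's label is the number of boundaries lying at or above it.
    rows = np.arange(height)[None, :, None]                     # (1, H, 1)
    labels = (rows >= boundary_rows[:, None, :]).sum(axis=0)    # (H, W)
    return labels
```

With, for example, three boundary maps this yields four regions, which still have to be mapped to anatomical layer classes in a separate step.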
The number and types of segmented retinal layers vary significantly across published works (see Fig. 1 for an overview of possible layers). While the definitions of the layers remain the same, many works bundle several layers together, most commonly NFL to BM as total retinal thickness, or RPE and PR.



5.1  Preprocessing and augmentation
To reduce variability in OCT data, some methods apply image preprocessing. A common preprocessing step is flattening of the B-scan: the Bruch's membrane is identified and each A-scan is rolled to a predetermined vertical position [20–23].
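A minimal sketch of such flattening is given below, assuming the Bruch's membrane row has already been estimated for every A-scan; the function name and the target row are illustrative, not taken from the cited methods.

```python
import numpy as np

def flatten_bscan(bscan: np.ndarray, bm_rows: np.ndarray, target_row: int) -> np.ndarray:
    """Vertically align all A-scans of a B-scan on the Bruch's membrane.

    bscan:      (H, W) B-scan, one A-scan per column.
    bm_rows:    (W,) estimated Bruch's membrane row per A-scan.
    target_row: row at which the membrane should lie after flattening.
    """
    flattened = np.empty_like(bscan)
    for col in range(bscan.shape[1]):
        shift = target_row - int(bm_rows[col])
        # Circularly roll each column so the membrane ends up at the same
        # depth in every A-scan.
        flattened[:, col] = np.roll(bscan[:, col], shift)
    return flattened
```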
As is common with medical data, images with an annotated ground truth are scarce. Authors therefore propose augmenting the images to make the trained networks more robust to variability: rolling along a parabola, rotation around the center, changes in illumination, vertical and horizontal translation, scale changes, horizontal flipping, mild shearing, additive noise and Gaussian blur. Augmentations are kept realistic with respect to how an OCT device might deteriorate or otherwise change an acquired image (Fig. 5).
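The sketch below illustrates a few of these augmentations on a single B-scan using NumPy and SciPy; the parameter ranges are illustrative assumptions, not values reported in the cited works.

```python
import numpy as np
from scipy import ndimage

def augment_bscan(bscan: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random subset of OCT-plausible augmentations to a B-scan."""
    out = bscan.astype(np.float32)

    # Small rotation around the image center.
    if rng.random() < 0.5:
        out = ndimage.rotate(out, angle=rng.uniform(-5, 5), reshape=False, mode="nearest")

    # Vertical and horizontal translation.
    if rng.random() < 0.5:
        out = ndimage.shift(out, shift=(rng.uniform(-10, 10), rng.uniform(-10, 10)), mode="nearest")

    # Illumination change (global intensity scaling).
    if rng.random() < 0.5:
        out = out * rng.uniform(0.9, 1.1)

    # Horizontal flip.
    if rng.random() < 0.5:
        out = out[:, ::-1]

    # Additive Gaussian noise and mild Gaussian blur.
    if rng.random() < 0.5:
        out = out + rng.normal(0.0, 2.0, size=out.shape)
    if rng.random() < 0.5:
        out = ndimage.gaussian_filter(out, sigma=rng.uniform(0.5, 1.0))

    return out
```

For segmentation, the geometric transformations would of course have to be applied to the label map as well, using nearest-neighbour interpolation so that class labels are not mixed.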

5.2  Pixelwise semantic segmentation methods

Pixelwise semantic image segmentation methods assign each pixel/voxel of an OCT image a layer class. Most proposed deep learning methods make use of a U-Net [24] or variations of it, differing in the kinds of convolutions, up- and downsampling operations, depth, dropout, residual connections and/or batch normalization. In the following, we present recent advances.
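As a point of reference before the individual methods, the sketch below shows a minimal U-Net-style encoder–decoder for per-pixel layer classification in PyTorch; the depth, channel widths and normalization choices are illustrative assumptions and differ between the published variants.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions, each followed by batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    """Two-level U-Net producing one logit map per retinal layer class."""

    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                                  # full resolution
        e2 = self.enc2(self.pool(e1))                      # 1/2 resolution
        b = self.bottleneck(self.pool(e2))                 # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                               # (N, num_classes, H, W)
```

A full B-scan (or a vertical band of it) with height and width divisible by four can be fed in directly; for instance, MiniUNet()(torch.randn(1, 1, 256, 512)) yields a (1, 8, 256, 512) logit map.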
The ReLayNet by Roy et al. [25] modifies the U-Net by replacing the deconvolutions in the decoder branch with unpooling layers. These reuse the indices recorded by the corresponding max-pooling layers in the encoder to place values at those positions during upsampling, while filling the remaining gaps with zeros. Instead of training on full B-scans, the authors propose training on vertical slices of constant width (bands), which allows larger mini-batch sizes at the cost of reduced context. A weighted multi-class logistic loss combined with a smooth Dice loss is used to optimize the network; layer boundaries receive a higher weight to focus training on hard-to-identify tissue transitions and borders.
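A minimal sketch of such a combined loss is shown below, assuming a precomputed per-pixel weight map that is increased near layer boundaries; the weighting scheme and smoothing constant are illustrative, not the exact formulation used by ReLayNet.

```python
import torch
import torch.nn.functional as F

def combined_loss(logits: torch.Tensor, target: torch.Tensor,
                  pixel_weights: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Weighted multi-class cross-entropy plus smooth Dice loss.

    logits:        (N, C, H, W) raw network outputs.
    target:        (N, H, W) integer layer labels.
    pixel_weights: (N, H, W) per-pixel weights, larger near layer boundaries.
    """
    # Weighted multi-class logistic (cross-entropy) term.
    ce = F.cross_entropy(logits, target, reduction="none")        # (N, H, W)
    ce = (pixel_weights * ce).mean()

    # Smooth Dice term, averaged over classes.
    probs = logits.softmax(dim=1)                                  # (N, C, H, W)
    one_hot = F.one_hot(target, num_classes=logits.shape[1])      # (N, H, W, C)
    one_hot = one_hot.permute(0, 3, 1, 2).float()                 # (N, C, H, W)
    intersection = (probs * one_hot).sum(dim=(0, 2, 3))
    union = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2.0 * intersection + eps) / (union + eps)).mean()

    return ce + dice
```

The weight map would typically be derived from the ground-truth labels, for example by assigning pixels adjacent to a label change a larger weight than pixels inside a layer.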
As a preprocessing step, the RPE is extracted using traditional methods in order to vertically align all A-scans (flattening). ReLayNet segments seven retinal layers as well as fluid regions. Ben-Cohen et al. [20]