Page 112 - Computational Retinal Image Analysis
3 Vessel segmentation
Instead of segmenting vessel pixels directly, a radial projection method is utilized
in Ref. [84] to locate vessel centerlines, based on the steerable wavelet-filtered
output of a fundus image. Meanwhile, a semisupervised learning step extracts the remaining
vessel structures. Similarly, an iterative algorithm is introduced by Gu and Cheng
[85]: starting from the main vessel trunk, small and thin vessel branches and
difficult regions around the vessel boundaries are iteratively refined by retrieving
and applying the appropriate local gradient boosting classifiers, which are stored
in a latent classification tree model constructed with the Chow-Liu method.
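The coarse-to-fine idea behind such iterative schemes can be loosely illustrated as follows. This is a toy sketch, not the authors' algorithm: a simple intensity threshold stands in for the trained local classifiers, and the mask is grown outward from a confident main-trunk seed.

```python
import numpy as np

def local_score(image, y, x):
    """Stand-in for a trained local classifier: a vessel-likeness score.
    Here it is simply the pixel intensity; in the cited work this role is
    played by local gradient boosting classifiers."""
    return image[y, x]

def grow_vessels(image, seed_mask, threshold=0.5, max_iters=50):
    """Iteratively add pixels adjacent to the current mask whose local
    score exceeds a threshold, starting from the main-trunk seed."""
    mask = seed_mask.copy()
    for _ in range(max_iters):
        grew = False
        ys, xs = np.nonzero(mask)
        for y, x in zip(ys, xs):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < image.shape[0] and 0 <= nx < image.shape[1]
                        and not mask[ny, nx]
                        and local_score(image, ny, nx) >= threshold):
                    mask[ny, nx] = True
                    grew = True
        if not grew:
            break
    return mask

# A faint horizontal "vessel" with a single confident trunk pixel as seed.
img = np.zeros((5, 7))
img[2, :] = 0.8
seed = np.zeros_like(img, dtype=bool)
seed[2, 3] = True
print(grow_vessels(img, seed).sum())  # 7: the whole vessel row is recovered
```

The refinement loop recovers thin branches that a single global pass would miss, which is the intuition the text describes for boundary regions.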
3.3 Deep learning
Within the supervised segmentation paradigm, a number of end-to-end deep
learning approaches have been developed for this problem, with notable
performance improvements. The N4-field [86], for example, combines a CNN
with nearest neighbor search to detect local patches of natural edges as well as
thin and long objects. A CNN model is also trained in Ref. [87] on 400,000
preprocessed image patches drawn from the training sets of the DRIVE, STARE, and
CHASE_DB1 datasets.
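The patch-based setup can be sketched as below: small patches are sampled around pixels of a fundus image, each paired with the center pixel's label from the reference standard. The sampling routine and patch size are assumptions for illustration; only the dataset names and patch count come from the text.

```python
import numpy as np

def sample_patches(image, label_mask, n_patches, patch_size=27, rng=None):
    """Sample square patches at random valid centers, labeling each patch
    with the reference-standard label of its center pixel."""
    rng = np.random.default_rng(rng)
    half = patch_size // 2
    h, w = image.shape
    patches, labels = [], []
    for _ in range(n_patches):
        y = int(rng.integers(half, h - half))
        x = int(rng.integers(half, w - half))
        patches.append(image[y - half:y + half + 1, x - half:x + half + 1])
        labels.append(label_mask[y, x])
    return np.stack(patches), np.array(labels)

# Toy image and a toy "manual annotation" standing in for the gold standard.
img = np.random.default_rng(0).random((128, 128))
gold = img > 0.7
X, y = sample_patches(img, gold, n_patches=1000, rng=0)
print(X.shape, y.dtype)  # (1000, 27, 27) bool
```

A CNN classifier would then be trained on `(X, y)` pairs to predict the vessel label of each patch center, which is the scheme Ref. [87] applies at much larger scale.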
Similarly, a deep neural network is used by Li et al. [88] to model the cross-modality
data transform for vessel segmentation in retinal images. Fu et al. [89] propose to
incorporate both CNN and CRF models for segmenting the full input fundus image.
The deep learning approach of Yan et al. [90] incorporates both conventional pixel-
wise loss and segment-level loss terms to achieve a globally plausible solution. In
addition, as demonstrated by the impressive performance reported in Ref. [91], it is
possible to address multiple tasks jointly, for example, vessel segmentation together
with optic disk detection.
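The joint objective of Yan et al. [90] can be loosely illustrated with numpy: a conventional pixel-wise cross-entropy term is combined with a term that first averages the error within each vessel segment, so that thin segments carry as much weight as thick ones. The segment partition, the error measure, and the weighting `alpha` are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def pixel_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))

def segment_loss(pred, target, segment_ids):
    """Average the per-pixel error within each vessel segment first,
    then average over segments, so short/thin segments are not swamped."""
    losses = []
    for s in np.unique(segment_ids):
        m = segment_ids == s
        losses.append(np.mean(np.abs(pred[m] - target[m])))
    return float(np.mean(losses))

def joint_loss(pred, target, segment_ids, alpha=0.5):
    """Combined objective: pixel-wise term plus weighted segment-level term."""
    return pixel_loss(pred, target) + alpha * segment_loss(pred, target,
                                                           segment_ids)

pred = np.array([0.9, 0.8, 0.2, 0.6])     # predicted vessel probabilities
target = np.array([1.0, 1.0, 0.0, 1.0])   # reference-standard labels
segs = np.array([0, 0, 1, 2])             # three toy vessel segments
print(round(joint_loss(pred, target, segs), 3))  # → 0.391
```

In the toy example, the poorly predicted single-pixel segment (pred 0.6 vs target 1.0) contributes a full third of the segment-level term, which is how the segment loss pushes the model toward globally plausible, unbroken vessels.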
There exists, however, one major issue with the current segmentation task. DRIVE
(40 annotated images) and STARE (20 annotated images) datasets have been the de
facto benchmark in empirical examination of retinal image segmentation techniques.
Meanwhile, the annotations in DRIVE and STARE come from multiple medical experts,
resulting in different reference (“gold”) standards. The disagreement among
experts may be attributed to the inevitable subjectivity and variability of expert
judgment; it also reflects the difficulty of the task and the reliability of the gold
standard in this context. As algorithms approach human-level performance, it becomes
increasingly difficult to validate new methods quantitatively: the vessels they
produce on test images agree more closely with the particular reference standard
they were trained on than with the other reference standards. This suggests an
overall performance saturation of the current benchmark,
most possibly due to the small dataset size, and stresses the need for larger benchmark
datasets. Some steps have been taken toward addressing this concern. Among them,
it is shown in Ref. [92] that the retinal segmentation performance can be improved
by incorporating (potentially infinitely many) synthesized images into the typically
small training set. An even more challenging problem is performing fundus image
segmentation on a new and distinct fundus image dataset in the absence of manual