Page 172 - Computational Retinal Image Analysis

4  Annotations and data, annotations as data  167




                     The difference between deciding the consensus type during protocol design and
                     leaving it to the MIA team (i.e., after annotations are provided) is that the former
                     encapsulates consensus-achieving procedures in the protocol design stage, in which
                     all decisions regarding the generation of the ground truth take place, promoting
                     consistency and awareness across the whole interdisciplinary team. Consensus
                     measurements are best obtained by discussion among the annotators, typically in
                     cases where the disagreement exceeds a level defined as acceptable (a
                     problem-specific decision). Other, simpler consensus measurements include the average (for
                     numerical values) and majority (for categorical or ordinal labels, given at least
                     three annotators). Descriptive statistics characterizing the disagreement among
                     annotators may also be provided, in the form of basic statistics (mean and standard
                     deviation of the signed and absolute differences), histograms or others.
                  •  Software tool. An appropriate software tool must be provided to the annotators,
                     selected or designed and developed to make the annotation task as efficient and
                     unambiguous as possible. It is again important to involve clinicians in the choice
                     or design of the software annotation tool.
                  •  Training sessions. Once a protocol has been agreed and a software tool
                     identified or created, the technical team should run training sessions to ensure
                     that the annotators follow the protocol consistently. Experience indicates that
                     such training sessions are valuable to avoid inconsistencies in the data which
                     may weaken the subsequent validation of the MIA algorithm.
                  •  How much detail? An annotation protocol must support the consistent
                     generation of a set of measurements by different annotators. We stress that it is
                     the procedure that must be consistent, not the measurements: there is important
                     information in the variability among annotators, assuming that they followed
                     the same procedure. If annotators make independent decisions or depart
                     from the protocol in various ways, random variations not related to the target
                     measurements are introduced in their annotations, weakening the validation of
                     MIA algorithms. It is critical to discuss these aspects with the clinical team.
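
The simple consensus measurements and disagreement statistics mentioned in the list above (average, majority, mean and standard deviation of signed and absolute differences) can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, "majority" is assumed here to mean a strict majority (more than half of the annotators), and the disagreement statistics are assumed to be computed between two annotators over paired measurements.

```python
from collections import Counter
from statistics import mean, stdev


def average_consensus(values):
    """Consensus for numerical annotations: the mean across annotators."""
    return mean(values)


def majority_consensus(labels):
    """Consensus for categorical or ordinal labels by strict majority.

    Returns None when no label is chosen by more than half of the
    annotators (assumption: such cases are referred to discussion).
    """
    if len(labels) < 3:
        raise ValueError("majority voting needs at least three annotators")
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def disagreement_stats(a, b):
    """Mean and standard deviation of the signed and absolute differences
    between two annotators' paired measurements (needs >= 2 pairs)."""
    if len(a) != len(b):
        raise ValueError("annotators must measure the same items")
    signed = [x - y for x, y in zip(a, b)]
    absolute = [abs(d) for d in signed]
    return {
        "signed_mean": mean(signed),
        "signed_sd": stdev(signed),
        "abs_mean": mean(absolute),
        "abs_sd": stdev(absolute),
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(average_consensus([10.0, 12.0, 14.0]))
    print(majority_consensus(["present", "present", "absent"]))
    print(disagreement_stats([1.0, 2.0, 3.0], [1.0, 1.0, 5.0]))
```

A non-zero mean signed difference suggests a systematic bias between the two annotators, whereas the absolute differences summarize the overall size of the disagreement; comparing the latter with a problem-specific threshold is one possible way to decide which cases go to discussion.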


                  4.2  Reducing the need for manual annotations
                  An important trend in contemporary research addresses techniques for limiting the
                  volume of annotations needed to validate a medical image analysis system while
                  maintaining its accuracy and related performance parameters. Research aimed at
                  reducing the number of annotations needed is particularly important for achieving
                  all-round automation on a large scale, given the unabating proliferation of deep
                  learning systems (artificial intelligence), in which training a typical network
                  means fitting millions of parameters. We refer the reader to recent papers [35–38]
                  and to the related literature on automatic annotations in computer vision [39, 40].
                     We note that validation on outcome (Section 3.3) can be regarded as a paradigm
                  for limiting the burden, and so to some extent the volume, of annotations, as it aims
                  to use information that is recorded anyway when seeing patients, instead of asking
                  clinicians for additional work such as tracing contours on images.