Page 195 - Computational Retinal Image Analysis
190 CHAPTER 10 Statistics in ophthalmology
Another possibility is to encourage authors to use online supplements (if
necessary) as a way of publishing both the details of the missing data in their study
and the details of the methods used to deal with the missing data. A cohesive
summary of this approach to missing data is the flowchart in Ref. [18], which was
designed with a clinical trial in mind but can be applied to observational studies too.
6 Designing an ophthalmic study
6.1 Study designs, sample size calculation and power analysis
There is a large body of literature on study designs and the pyramid of evidence;
see, for example, Refs. [4, 38]. In what follows, we focus on a related and
challenging question: how to do sample size calculations.
Why do we do sample size calculations, or in other words, why do we need to know how
many patients to recruit? If we recruit too few patients, then our data
will not allow us to draw conclusions, which is not a good use of resources and patients
and hence not ethical. If we recruit too many patients, then we do not use resources
economically and may unnecessarily expose patients to harmful
medication (e.g. in a safety study), which is not ethical either [39].
To determine the sample size means to find the minimum number of patients so that
the data collected will provide enough evidence to support our investigation in the
presence of the uncertainty that surrounds it. This complex statement consists
of several points. First, it explicitly states that the sample size depends on what we
believe would be a sufficient level of evidence, i.e. the level of significance, often
denoted as α. Secondly, the sample size depends on the research question (i.e. the null
and alternative hypotheses, or whether we are doing a diagnostic study, etc.). Thirdly,
the sample size depends on the amount of uncertainty affecting our investigation (often
quantified via the standard deviation). Fourthly, the null and alternative hypotheses need
to be testable, and they will be tested against each other using an appropriate statistical
test. Since each statistical test has its own statistical properties, each statistical test
also has its own sample size calculation procedure. Some calculations can be
done explicitly (such as for a t-test [40]); others need to be done via simulations.
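As an illustration of an explicit calculation, the following is a minimal sketch of the widely used normal-approximation formula for comparing the means of two equally sized groups. It is not taken from Ref. [40]; the function name `n_per_group`, the target power of 0.80 and the example inputs are assumptions made for illustration only. The sketch shows how the level of significance α, the difference to be detected and the standard deviation enter the calculation.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate number of patients per group, two-sided comparison of two means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta) ** 2
    delta: difference in means we want to detect (assumed, from clinical input)
    sd:    common standard deviation in both groups (assumed, e.g. from a pilot)
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # quantile for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)  # round up to a whole number of patients


if __name__ == "__main__":
    # Hypothetical inputs: detect a 5-unit difference when the SD is 10 units.
    print(n_per_group(delta=5, sd=10))
```

Because the formula uses normal rather than t quantiles, an exact t-test calculation typically gives a slightly larger number; halving the detectable difference roughly quadruples the required sample size.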
The simplest sample size calculation is for a t-test, which is the simplest of the
group comparison tests. The strategy for calculating the sample size differs between
hypothesis testing (e.g. group comparison, descriptive modeling) and prediction
(e.g. disease detection for an individual eye, diagnosis, discrimination,
classification into disease groups).
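When no explicit formula is available, the sample size for a group comparison can be explored by simulation. The sketch below uses the two-group comparison of means as the simplest case: it generates many hypothetical studies of a given size and counts how often the test rejects the null hypothesis. All names (`simulated_power`, `smallest_n`), the normally distributed data, the target power of 0.80 and the use of a normal critical value in place of the t critical value are assumptions made to keep the example short; a real study would use the planned statistical test.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulated_power(n_per_group, delta, sd, alpha=0.05, n_sim=2000, seed=1):
    """Estimate power by simulating many two-group studies of a given size."""
    rng = random.Random(seed)
    # Normal approximation to the t critical value (an assumption of this sketch).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(delta, sd) for _ in range(n_per_group)]
        # Test statistic: difference in means divided by its standard error.
        se = math.sqrt(variance(control) / n_per_group
                       + variance(treated) / n_per_group)
        if abs(mean(treated) - mean(control)) / se > z_crit:
            rejections += 1
    return rejections / n_sim


def smallest_n(delta, sd, target_power=0.80, start=5, step=5, max_n=1000, **kwargs):
    """Increase the group size in steps until the simulated power reaches the target."""
    for n in range(start, max_n + 1, step):
        if simulated_power(n, delta, sd, **kwargs) >= target_power:
            return n
    return None  # target not reached within max_n


if __name__ == "__main__":
    # Hypothetical inputs: a 5-unit difference with an SD of 10 units.
    print(simulated_power(n_per_group=60, delta=5, sd=10))
```

The same loop can be adapted to other tests or data distributions by replacing the data-generating lines and the test statistic, which is where simulation becomes useful.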
Main points to consider in the sample size calculations:
• First, it is important to consider the study design and the analytical method that
will be used to analyze the data. These are the main determinants of the sample size
calculation. Further determinants (if relevant) can be uncertainty, correlations,
and the distribution of the data.
• Second, it is crucial to know that the sample size is the number of units of
analysis (e.g. patients, or eyes, or tissues) that we need to recruit for our study.