A solution would be to quantify the uncertainty in VA in the presence of factors such
as age, treatment, diet, etc., and to see which of them are relevant. This brings us to the
third kind of uncertainty: we may be uncertain about the data analysis itself. The uncertainty
in the data-analytic process involves the numerical data obtained on patients or eyes,
it can involve expert judgment (e.g. in Bayesian analysis), and it involves assumptions
(e.g. normality) that can be checked but are seldom known to hold exactly (more in Section 3.2).
To quantify uncertainty, statistics relies on the theory of probability. Probability
is a rich and beautiful subject, a scientific discipline unto itself. Here, however, we
assume that the reader is familiar with the main concepts of probability and
how they are used in statistics, as these are taught in most statistics, data science,
and machine learning courses. One excellent reference is [3], which is suitable for
graduate students of computer science and honors undergraduates in mathematics, statistics,
and computer science, as well as graduate students of statistics who need to learn
mathematical statistics. The book teaches probability, the formal language of uncertainty,
and shows how it is used in the process of learning from data, that is, in statistics.
3.2 The problem of estimation, P-values and confidence intervals
As mentioned above, in statistics we aim to quantify uncertainty. We assume
that we are studying a population of patients, from which we choose a sample
(often a random sample of several patients or eyes) and collect
measurements on the sampled patients or eyes to form the data. Using the data we perform
estimation, and from this estimation we draw inferences about the whole population.
This process is called statistical inference, and it follows the essence
of inductive reasoning, see e.g. Ref. [3].
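As a concrete, purely illustrative sketch of this loop, the following Python code simulates a hypothetical population of visual acuity scores, draws a random sample from it, and uses the sample mean as an estimate of the population mean; all numerical values are invented and the "population" is only observable here because it is simulated.

# A minimal sketch of the sample-to-population inference loop described above.
# The "population" of visual acuity (VA) scores is simulated purely for
# illustration; in practice it is unknown and only the sample is observed.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: VA scores (letters read) for 10,000 eyes.
population_va = rng.normal(loc=70.0, scale=10.0, size=10_000)

# Step 1: draw a random sample of n eyes from the population.
n = 50
sample = rng.choice(population_va, size=n, replace=False)

# Step 2: estimation -- use the sample mean as an estimate of the population mean.
estimate = sample.mean()

# Step 3: inference -- the estimate (together with its uncertainty, treated
# formally in this section) is used to draw conclusions about the population.
print(f"sample mean (estimate): {estimate:.2f}")
print(f"population mean (normally unknown): {population_va.mean():.2f}")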
In quantifying uncertainty, most of the time we use methods that involve free
parameters. For example, the linear regression model has the form
Y_i = β_0 + β_1 X_i + ε_i.
The parameters β_0 and β_1 need to be estimated; in other words, the model is fitted
to the data. Following a convention in the statistical literature, we use θ to denote a
generic parameter vector. In our discussion we focus on the case of a single, scalar
parameter, but in most real-world problems θ becomes a vector, e.g. θ = (β_0, β_1). The
problem of parameter estimation is to determine a method of estimating θ from the
data. To constitute a well-defined estimation method we must have an explicit procedure,
that is, a formula or a rule by which a set of data values x_1, x_2, …, x_n produces
an estimate of θ. We consider an estimator of θ to have the form T = T(X_1, X_2, …, X_n),
i.e., the estimator is a random variable derived from the random sample X_1, X_2, …,
X_n. The properties of an estimator may be described in terms of its probabilistic
behavior.
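To make this concrete, the following Python sketch (using simulated data and hypothetical "true" parameter values) estimates β_0 and β_1 by ordinary least squares, and then repeats the sampling many times to illustrate that the estimator T(X_1, …, X_n), recomputed on fresh random samples, varies as a random variable.

# A minimal sketch, with simulated data, of estimating theta = (beta_0, beta_1)
# in the model Y_i = beta_0 + beta_1 * X_i + eps_i by ordinary least squares,
# and of the estimator's probabilistic behaviour under repeated sampling.
import numpy as np

rng = np.random.default_rng(1)
beta_0_true, beta_1_true, sigma = 2.0, 0.5, 1.0  # hypothetical true values
n = 100

def draw_sample_and_estimate():
    """Draw one random sample and return the least-squares estimates (b0, b1)."""
    x = rng.uniform(0.0, 10.0, size=n)
    y = beta_0_true + beta_1_true * x + rng.normal(0.0, sigma, size=n)
    design = np.column_stack([np.ones(n), x])      # design matrix [1, x]
    b0, b1 = np.linalg.lstsq(design, y, rcond=None)[0]
    return b0, b1

# One data set gives one estimate of theta ...
print("single-sample estimate:", draw_sample_and_estimate())

# ... while repeated sampling shows the variability of the estimator.
estimates = np.array([draw_sample_and_estimate() for _ in range(1000)])
print("mean of estimates:", estimates.mean(axis=0))
print("std. dev. of estimates:", estimates.std(axis=0))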
It is important to make two comments on statistical notation for estimation. First,
when we write T = T(X_1, X_2, …, X_n) we are using capital letters to indicate clearly that
we are considering the estimator, T, to be a random variable, and the terminology