Page 185 - Computational Retinal Image Analysis

180    CHAPTER 10  Statistics in ophthalmology




                         A solution would be to quantify the uncertainty in VA in the presence of factors such
                         as age, treatment, diet, etc., and see which of them are relevant. This brings us to the
                         third uncertainty: we may be uncertain about the data analysis itself. The data
                         analytical process involves numerical data obtained on patients or eyes; it can
                         involve expert judgment (e.g., in Bayesian analysis); and it involves assumptions
                         (e.g., normality) that can be checked but are seldom known to hold exactly (more in
                         Section 3.2).
                            To quantify uncertainties, statistics relies on the theory of probability. Probability
                         is a rich and beautiful subject, a scientific discipline unto itself. Here, however, we
                         assume that the reader is familiar with the main concepts of probability and
                         how they are used in statistics, as this is taught in most statistics, data science,
                         and machine learning courses. One excellent reference is [3], which is suitable for
                         graduate students of computer science and honors undergraduates in math, statistics,
                         and computer science, as well as graduate students of statistics who need to learn
                         mathematical statistics. The book teaches probability, the formal language of
                         uncertainty, and shows how it is used in the process of learning from data, that is,
                         in statistics.


                         3.2  The problem of estimation, P-values and confidence intervals
                         As mentioned above, in statistics we aim to quantify uncertainty. We assume that
                         there is a population of patients that we are studying; we choose a sample
                         from the population (often a random sample of several patients or eyes) and collect
                         measurements on those patients or eyes to build the data. Then, using the data, we
                         perform estimation, and from this estimation we draw inferences about the whole
                         population. This process is called statistical inference, and it follows the essence
                         of inductive reasoning; see, e.g., Ref. [3].
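
                         The chain from population to sample to estimate can be illustrated with a minimal
                         sketch. Everything in it is an assumption made for illustration only: the population
                         is simulated (hypothetical logMAR visual acuity values, normally distributed), and
                         the helper name draw_sample, the sample size, and the seeds are our own choices. The
                         population mean is known here only because we generated the population ourselves;
                         in a real study it is the unknown quantity.

```python
import random
import statistics


def draw_sample(population, n, seed=0):
    """Draw a simple random sample of size n without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: simulated logMAR visual acuity for 10,000 eyes.
# The mean (0.3) and standard deviation (0.2) are illustrative assumptions.
pop_rng = random.Random(42)
population = [pop_rng.gauss(0.3, 0.2) for _ in range(10_000)]

# Step 1: choose a random sample from the population.
sample = draw_sample(population, n=50, seed=1)

# Step 2: estimate the population mean from the sample.
estimate = statistics.mean(sample)

# Step 3: compare with the population value, which is available
# only because the population was simulated.
truth = statistics.mean(population)

print(f"sample mean = {estimate:.3f}, population mean = {truth:.3f}")
```

                         The sample mean will generally differ from the population mean; quantifying how far
                         off it is likely to be is the task of statistical inference.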
                            In quantifying uncertainty, most of the time we use methods that involve free
                         parameters. For example, the linear regression model has the form
                                                    Y_i = β_0 + β_1 X_i + ε_i .
                            The parameters β_0, β_1 need to be estimated; in other words, the model is fitted
                         to the data. Following a convention in the statistical literature, we use θ to denote a
                         generic parameter vector. In our discussion we focus on the case of a single, scalar
                         parameter, but in most real-world problems θ is a vector, e.g., θ = (β_0, β_1). The
                         problem of parameter estimation is to determine a method of estimating θ from the
                         data. To constitute a well-defined estimation method, we must have an explicit
                         procedure, that is, a formula or a rule by which a set of data values x_1, x_2, …, x_n
                         produces an estimate of θ. We consider an estimator of θ to have the form
                         T = T(X_1, X_2, …, X_n), i.e., the estimator is a random variable derived from the
                         random sample X_1, X_2, …, X_n. The properties of an estimator may be described in
                         terms of its probabilistic behavior.
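
                         To make concrete the idea that an estimator is a random variable, the following
                         sketch fits the linear regression model by ordinary least squares to many simulated
                         samples and collects the slope estimates. Least squares is one common choice of
                         estimator, used here as an assumption; the text above does not prescribe a method.
                         The function names ols_fit and simulate_sample, the "true" parameter values, the
                         range of X, and the noise level are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def ols_fit(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1 * x + error."""
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def simulate_sample(n, beta0, beta1, sigma, rng):
    """Simulate n pairs (x_i, y_i) from the linear model with normal errors."""
    x = [rng.uniform(40, 80) for _ in range(n)]  # e.g. age in years (assumed)
    y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
    return x, y


# Hypothetical "true" parameters; in practice these are unknown.
BETA0, BETA1, SIGMA = 0.1, 0.005, 0.1

rng = random.Random(0)
slopes = []
for _ in range(1000):
    # Each new random sample gives a different realized value of the estimator.
    x, y = simulate_sample(30, BETA0, BETA1, SIGMA, rng)
    _, b1 = ols_fit(x, y)
    slopes.append(b1)

# The spread of the estimates describes the estimator's probabilistic behavior.
print(f"mean of slope estimates = {statistics.mean(slopes):.5f}")
print(f"std. dev. of slope estimates = {statistics.stdev(slopes):.5f}")
```

                         Each pass through the loop plays the role of one realization x_1, x_2, …, x_n of the
                         random sample, and the list of slopes approximates the sampling distribution of the
                         estimator of β_1.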
                            It is important to make two comments on statistical notation for estimation. First,
                         when we write T = T(X_1, X_2, …, X_n) we are using capital letters to indicate clearly that
                         we are considering the estimator, T, to be a random variable, and the terminology