Page 190 - Computational Retinal Image Analysis

4  On choosing the right statistical analysis method




                     What is the response of the statistician to the question of choosing the right sta-
                  tistical method? A good statistician will respond with clarifying questions: “Can you
                  give me some clinical background? What is the goal of your research study and do
                  you have a hypothesis?” Here the statistician will aim to find out whether your study is
                  exploratory (i.e. hypothesis generating), confirmatory (i.e. inferential or hypothesis
                  testing), or diagnostic (including prognostic or predictive). Then the statistician will
                  follow with questions on how you designed the study and how you collected the
                  data. Ideally, however, a statistician would be part of the study already and would
                  have been involved in the decision-making process when the study design was being
                  developed, and hence would not have to ask all these questions.


                  4.3  Words of caution in the data analysis method selection

                  Visualizing data is very important and underrated. In his famous Exploratory
                  Data Analysis, Tukey [16] wrote: “The greatest value of a picture is when it
                  forces us to notice what we never expected to see.” A common misconception is
                  that we do not need exploratory analysis if we are doing a confirmatory
                  (inferential) study. Exploratory analyses (such as examining means and medians,
                  histograms, and pie charts) are crucial for research. They are often termed
                  descriptive data analysis methods (see Table 3). There are several reasons why
                  we need them. They help us to understand the data and to check for outliers and
                  for any expected or unexpected patterns. They help us to create the demographics
                  tables and summaries for reports. They help to verify the distribution of the
                  data so that we can make an informed decision about which data analysis method
                  to select (Section 4). Furthermore, when using a complex data analysis method
                  (e.g. adjusted logistic regression), it is essential to understand how its
                  results agree, or disagree, with those from simpler methods (e.g. unadjusted
                  logistic regression). Therefore, when we write a research report, we are obliged
                  to include both the results from the simple data analysis methods and those from
                  the complex methods, so that reviewers can judge the consistency between them.
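
As a minimal sketch of the descriptive checks discussed in this paragraph (means, medians, outliers), the Python snippet below summarizes a small set of values. The readings, the `describe` helper and the 1.5 x IQR outlier rule are illustrative assumptions and are not taken from the text.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics and IQR-based outlier flags."""
    data = sorted(values)
    # Quartiles by linear interpolation between data points ("inclusive" method).
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),
        "q1": q1,
        "q3": q3,
        # Values outside the 1.5 x IQR fences (an assumed, conventional rule).
        "outliers": [v for v in data if v < lower or v > upper],
    }


# Hypothetical intraocular pressure readings in mmHg; 41 is a deliberate outlier.
iop = [14, 15, 16, 15, 17, 13, 16, 18, 15, 41]
summary = describe(iop)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the mean is pulled well above the median by the single extreme reading. A gap like that is the sort of pattern a descriptive summary or a histogram is meant to expose before any inferential method is chosen.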
                     Sophisticated statistical analytic techniques are rarely able to compensate for
                  deficiencies in data collection. A common misconception is that a flaw in data
                  collection or study design can be corrected afterwards by a sufficiently complex
                  statistical data analysis method. It is true that an alternative data analysis
                  technique may help avoid some difficulties, for example by adjusting for
                  confounders when analyzing data from an observational study rather than
                  controlling for confounders via randomization or careful selection of subjects.
                  However, there are many scenarios where a complex statistical method will not
                  rectify the flaws. For example, in a study of the association between intraocular
                  pressure and diabetes, a potential confounder is systolic blood pressure. If we
                  collect this confounder as dichotomous data (or do not collect it at all), then
                  the estimated association between diabetes and intraocular pressure will be
                  underpowered and/or biased [17].
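
The bias part of this example can be illustrated with a small simulation. The sketch below is hypothetical throughout: the data-generating model, the effect sizes, the 140 mmHg cut-off and all function names are assumptions chosen for illustration and do not come from the study cited as [17]. Systolic blood pressure drives both diabetes status and intraocular pressure, while diabetes itself has no effect, so any non-zero estimate is bias.

```python
import math
import random
import statistics


def mean_difference(y, x):
    """Difference in mean outcome between exposed (x=1) and unexposed (x=0)."""
    exposed = [yi for yi, xi in zip(y, x) if xi == 1]
    unexposed = [yi for yi, xi in zip(y, x) if xi == 0]
    return statistics.fmean(exposed) - statistics.fmean(unexposed)


def residuals(v, z):
    """Residuals of v after a least-squares straight-line fit on z."""
    vbar, zbar = statistics.fmean(v), statistics.fmean(z)
    slope = (sum((zi - zbar) * (vi - vbar) for zi, vi in zip(z, v))
             / sum((zi - zbar) ** 2 for zi in z))
    return [vi - vbar - slope * (zi - zbar) for vi, zi in zip(v, z)]


def simulate(n=20000, seed=0):
    rng = random.Random(seed)
    sbp, diabetes, iop = [], [], []
    for _ in range(n):
        z = rng.gauss(130, 15)                         # systolic blood pressure
        p = 1 / (1 + math.exp(-(z - 130) / 15))        # higher SBP, more diabetes
        x = 1 if rng.random() < p else 0
        y = 15 + 0.1 * (z - 130) + rng.gauss(0, 2)     # no diabetes term: true effect is 0
        sbp.append(z)
        diabetes.append(x)
        iop.append(y)

    # 1. Confounder ignored (not collected).
    unadjusted = mean_difference(iop, diabetes)

    # 2. Confounder collected only as dichotomous data (SBP < 140 vs >= 140):
    #    stratify, then average the stratum-specific differences by stratum size.
    adjusted_dichotomous = 0.0
    for in_stratum in (lambda z: z < 140, lambda z: z >= 140):
        idx = [i for i in range(n) if in_stratum(sbp[i])]
        diff = mean_difference([iop[i] for i in idx], [diabetes[i] for i in idx])
        adjusted_dichotomous += diff * len(idx) / n

    # 3. Confounder collected as a continuous value: the coefficient of diabetes
    #    in a linear regression on diabetes and SBP, computed by regressing
    #    residualized IOP on residualized diabetes (Frisch-Waugh-Lovell).
    ry, rx = residuals(iop, sbp), residuals(diabetes, sbp)
    adjusted_continuous = sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)

    return unadjusted, adjusted_dichotomous, adjusted_continuous


unadjusted, adjusted_dichotomous, adjusted_continuous = simulate()
print(f"unadjusted:              {unadjusted:.2f} mmHg")
print(f"adjusted (dichotomous):  {adjusted_dichotomous:.2f} mmHg")
print(f"adjusted (continuous):   {adjusted_continuous:.2f} mmHg")
```

Under these assumptions one would expect the unadjusted estimate to be clearly above zero, the estimate adjusted for dichotomized blood pressure to be smaller but still above zero (residual confounding), and the estimate adjusted for continuous blood pressure to be close to the true value of zero. No analysis applied afterwards can recover the information discarded when the confounder was recorded as a yes/no value.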
                     Often case-control studies involve matching at the study design stage so as to
                  make comparator groups more similar to each other. It is important to note that if this