
90     CHAPTER 4  Statistical analysis




                           correlated does not necessarily mean that the changes in one variable cause the
                         changes in the other variable. In some cases, there is a causal relationship between the
                         two variables. In other cases, there is a hidden variable (also called the “intervening”
                         variable, which is one type of confounding variable) that serves as the underlying
                         cause of the change.
                            For example, in an experiment that studies how users interact with an e-commerce
                         website, you may find a significant correlation between income and performance.
                         More specifically, participants with higher incomes spend more time finding a
                         specific item and make more errors during the navigation process. Can you claim
                         that earning a higher income causes people to spend more time retrieving online
                         items and make more errors? The answer is obviously no. The truth might be that
                         people who earn a higher income tend to be older than those who earn a lower
                         income. People in the older age group do not use computers as intensively as those
                         in the younger age group, especially when it comes to activities such as online
                         shopping. Consequently, they may take longer to find items and make more errors.
                         In this case, age is the intervening variable that is hidden behind the two variables
                         examined in the correlation. Although income and performance are significantly
                         correlated, there is no causal relationship between them. A correct interpretation
                         of the relationship between the variables is shown in Figure 4.2.


                         [Figure: Age → Income; Age → Less experience in online purchase → Lower performance]
                         FIGURE 4.2
                         Relationship between correlated variables and an intervening variable.
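
The income example can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are synthetic, the coefficients and noise levels are invented, and the first-order partial correlation used to "control for" age is one common technique, not a method prescribed by this chapter. Age is generated first and drives both income and task time; income has no direct effect on task time.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 2000                 # hypothetical number of participants

# Hypothetical data-generating process (all numbers are assumptions):
# age drives both variables; income never appears in the task-time formula.
age = [rng.uniform(20, 65) for _ in range(n)]
income = [1000 * a + rng.gauss(0, 10000) for a in age]     # older -> higher income
task_time = [30 + 0.8 * a + rng.gauss(0, 8) for a in age]  # older -> slower search

r_raw = pearson(income, task_time)
r_partial = partial_corr(income, task_time, age)

print(f"income vs. task time:       r = {r_raw:.2f}")
print(f"same, controlling for age:  r = {r_partial:.2f}")
```

In this synthetic setting the raw correlation between income and task time is clearly positive, yet it shrinks toward zero once age is held constant, which matches the interpretation in Figure 4.2.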



                            This example demonstrates the danger of claiming a causal relationship based
                         on a significant correlation. In data analysis, it is not uncommon for researchers to
                         conduct pairwise correlation tests on all variables involved and then claim that
                         “variable A has a significant impact on variable B” or “the changes in variable
                         A cause variable B to change.” Such claims can be spurious in many cases. To avoid
                         this mistake, you should keep in mind that empirical studies should be driven
                         by hypotheses, not data. That is, your analysis should be based on a predefined
                         hypothesis, not the other way around. In the earlier example, you are unlikely to
                         develop a hypothesis that “income has a significant impact on online purchasing
                         performance” since it does not make much sense. If your study is hypothesis
                         driven, you will not be fooled by correlation analysis results. On the other hand,
                         if you do not have a clearly defined hypothesis before the study, you will derive
                         hypotheses driven by the data analysis, making it more likely that you will draw
                         false conclusions.
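
The risk of running pairwise correlation tests on every variable can also be sketched in code. The example below is a hedged illustration, not a procedure from this chapter: it assumes a hypothetical study with 30 participants and 20 measured variables, all of which are independent random noise, and it uses the standard two-tailed critical value of |r| for α = 0.05 with 28 degrees of freedom (about 0.361). Any pair that passes the threshold is "significant" by chance alone.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(7)                 # fixed seed for reproducibility
n_participants, n_variables = 30, 20   # hypothetical study size

# Every variable is independent noise, so no real relationship exists.
data = [[rng.gauss(0, 1) for _ in range(n_participants)]
        for _ in range(n_variables)]

# Two-tailed critical |r| at alpha = 0.05 for n = 30 (df = 28).
R_CRITICAL = 0.361

# Test every pair of variables, as a data-driven analysis would.
pairs = list(itertools.combinations(range(n_variables), 2))
spurious = [(i, j) for i, j in pairs
            if abs(pearson(data[i], data[j])) > R_CRITICAL]

print(f"{len(spurious)} of {len(pairs)} pairs look significant by chance alone")
```

With 190 pairs tested at the 0.05 level, roughly 5% of them would be expected to cross the threshold even though no variable influences any other. A hypothesis defined before the study limits the analysis to the comparisons that matter.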