

              As a final comment, we draw the reader’s attention to the fact that correlation is
           by no means synonymous with causality.  As a matter of fact, when two variables
           X and Y are correlated, one of the following situations can happen:

           –  One of the variables is the cause and the other is the effect. For instance, if
              X = “number of forest fires per year” and Y = “area of burnt forest per year”, then
              one usually finds that X is correlated with Y, since Y is the effect of X.
           –  Both variables are effects of a common cause. For instance, if X = “% of persons daily
              arriving at a hospital with yellow-stained fingers” and Y = “% of persons daily
              arriving at the same hospital with pulmonary carcinoma”, one finds that X is
              correlated with Y, but neither is the cause of the other. Instead, a third variable
              causes both: the volume of inhaled tobacco smoke.

           –  The correlation is fortuitous and there is no causal link. For instance, one may
              happen to find a correlation between X = “% of persons with blue eyes per
              household” and Y = “% of persons preferring radio to TV per household”. It
              would, however, be meaningless to infer causality between the two variables.


           4.4.2 Comparing Two Variances


           4.4.2.1  The F Test

           In some comparison problems to be described later, one needs to decide whether or
           not two independent data samples A and B, with sample variances $s_A^2$ and $s_B^2$ and
           sample sizes $n_A$ and $n_B$, were obtained from normally distributed populations with
           the same variance.
              Using Property 6 of B.2.9, we know that:

              $\dfrac{s_A^2 / \sigma_A^2}{s_B^2 / \sigma_B^2} \sim F_{n_A-1,\,n_B-1}$.                                        4.7

              Under the null hypothesis “$H_0$: $\sigma_A^2 = \sigma_B^2$”, we then use the test statistic:

              $F^* = s_A^2 / s_B^2 \sim F_{n_A-1,\,n_B-1}$.                                        4.8
              Note that, given the asymmetry of the F distribution, one needs to compute the
           α/2 and (1−α/2) percentiles of F for a two-tailed test, and reject the null hypothesis if
           the observed F value is unusually large or unusually small. Note also that applying
           the F test does not require the assumption that the populations have equal means.
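              As an illustrative sketch (with made-up data, not the book's Table 4.4), the following
           R commands compute the $F^*$ statistic of 4.8, compare it with the α/2 and (1−α/2)
           percentiles of the F distribution, and cross-check the decision with R's built-in
           var.test function, which performs the same two-tailed variance-ratio test and also
           reports a p-value:

              set.seed(1)                          # made-up data, for illustration only
              xA <- rnorm(12, mean = 10, sd = 2)   # sample A, n_A = 12
              xB <- rnorm(15, mean = 10, sd = 2)   # sample B, n_B = 15
              alpha <- 0.05

              Fstar <- var(xA) / var(xB)           # test statistic F* = s_A^2 / s_B^2
              df1 <- length(xA) - 1; df2 <- length(xB) - 1
              f.lo <- qf(alpha/2, df1, df2)        # lower (alpha/2) percentile
              f.hi <- qf(1 - alpha/2, df1, df2)    # upper (1 - alpha/2) percentile
              reject <- (Fstar < f.lo) | (Fstar > f.hi)

              var.test(xA, xB, conf.level = 1 - alpha)   # same test, with p-value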

           Example 4.6

           Q: Consider the two independent samples of normally distributed random variables
           shown in Table 4.4. Test whether or not one should reject, at a 5% significance
           level, the hypothesis that the corresponding populations have equal variances.