Page 149 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
4.4 Inference on Two Populations
As a final comment, we draw the reader’s attention to the fact that correlation is
by no means synonymous with causality. As a matter of fact, when two variables
X and Y are correlated, one of the following situations can happen:
– One of the variables is the cause and the other is the effect. For instance, if
X = “number of forest fires per year” and Y = “area of burnt forest per year”,
then one usually finds that X is correlated with Y, since Y is the effect of X.
– Both variables share a common cause. For instance, if X = “% of persons daily
arriving at a hospital with yellow-stained fingers” and Y = “% of persons daily
arriving at the same hospital with pulmonary carcinoma”, one finds that X is
correlated with Y, but neither is the cause or the effect of the other. Instead,
there is another variable that is the cause of both: the volume of inhaled
tobacco smoke.
– The correlation is fortuitous and there is no causal link. For instance, one may
happen to find a correlation between X = “% of persons with blue eyes per
household” and Y = “% of persons preferring radio to TV per household”. It
would, however, be meaningless to infer causality between the two variables.
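The common-cause situation can be illustrated with a small simulation. The sketch below is not taken from the text: it assumes a hypothetical common cause `z` that drives two variables `x` and `y` through made-up linear relations with independent Gaussian noise, and it uses a hand-written helper (`pearson`) for the correlation coefficient.

```python
import random

def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / (s_uu * s_vv) ** 0.5

random.seed(0)  # fixed seed so the run is reproducible
n = 5000

# Hypothetical common cause (e.g. a standardized "smoke exposure" score).
z = [random.gauss(0.0, 1.0) for _ in range(n)]

# x and y both depend on z, but never on each other (assumed linear model).
x = [zi + random.gauss(0.0, 1.0) for zi in z]
y = [zi + random.gauss(0.0, 1.0) for zi in z]

# Correlation between x and y; under this model its theoretical value is 0.5.
r_xy = pearson(x, y)

# Remove the contribution of z. Its coefficient is 1 by construction here;
# with real data it would have to be estimated, e.g. by regression.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson(x_resid, y_resid)

print(f"corr(x, y)                = {r_xy:.3f}")
print(f"corr(x, y) with z removed = {r_resid:.3f}")
```

Under these assumptions x and y are clearly correlated, yet the correlation essentially vanishes once the common cause is accounted for, which is what one would expect when neither variable acts on the other.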
4.4.2 Comparing Two Variances
4.4.2.1 The F Test
In some comparison problems to be described later, one needs to decide whether or
not two independent data samples A and B, with sample variances $s_A^2$ and $s_B^2$
and sample sizes $n_A$ and $n_B$, were obtained from normally distributed
populations with the same variance.
Using Property 6 of B.2.9, we know that:
$$\frac{s_A^2/\sigma_A^2}{s_B^2/\sigma_B^2} \sim F_{n_A-1,\,n_B-1}.$$  (4.7)
Under the null hypothesis “$H_0$: $\sigma_A^2 = \sigma_B^2$”, we then use the test statistic:
$$F^* = \frac{s_A^2}{s_B^2} \sim F_{n_A-1,\,n_B-1}.$$  (4.8)
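The distributional claim behind the test statistic can be checked numerically. The following Monte Carlo sketch is an illustration, not part of the original text: it repeatedly draws two independent normal samples with the same standard deviation (sample sizes, means and seed are arbitrary choices), computes the ratio of sample variances, and compares the simulated mean of the ratio with the known mean of an F distribution with $(d_1, d_2)$ degrees of freedom, $d_2/(d_2-2)$ for $d_2 > 2$.

```python
import random

def sample_variance(xs):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_ratio(n_a, n_b, sigma=1.0):
    """Draw two independent normal samples with a common sigma; return s_A^2 / s_B^2."""
    a = [random.gauss(0.0, sigma) for _ in range(n_a)]
    b = [random.gauss(5.0, sigma) for _ in range(n_b)]  # different mean on purpose
    return sample_variance(a) / sample_variance(b)

random.seed(1)            # fixed seed so the run is reproducible
n_a, n_b = 8, 12          # arbitrary sample sizes chosen for the illustration
n_rep = 20000             # number of simulated sample pairs

ratios = [variance_ratio(n_a, n_b) for _ in range(n_rep)]
sim_mean = sum(ratios) / n_rep

# Mean of F with (n_a - 1, n_b - 1) degrees of freedom: d2 / (d2 - 2), d2 = n_b - 1.
theory_mean = (n_b - 1) / (n_b - 3)

print(f"simulated mean of s_A^2 / s_B^2: {sim_mean:.3f}")
print(f"theoretical mean of F:           {theory_mean:.3f}")
```

The two samples are deliberately given different means, since the variance ratio depends only on the spread of each sample around its own mean.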
Note that, given the asymmetry of the F distribution, one needs to compute both
the α/2 and the (1−α/2) percentiles of F for a two-tailed test, and reject the
null hypothesis if the observed F value is unusually large or unusually small,
i.e., if it falls outside the interval delimited by these two percentiles. Note
also that for applying the F test it is not necessary to assume that the
populations have equal means.
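The two-tailed decision rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the book's own procedure: it relies on SciPy's `scipy.stats.f` distribution object for percentiles and probabilities, the function name `f_test_two_tailed` is hypothetical, and the two samples are made-up numbers (they are not the data of Table 4.4).

```python
from statistics import variance  # unbiased sample variance (divisor n - 1)
from scipy.stats import f        # F distribution: ppf = percentile, cdf = probability

def f_test_two_tailed(sample_a, sample_b, alpha=0.05):
    """Two-tailed F test of H0: equal population variances (normality assumed)."""
    dfn, dfd = len(sample_a) - 1, len(sample_b) - 1
    f_obs = variance(sample_a) / variance(sample_b)

    # Lower and upper critical values: the alpha/2 and (1 - alpha/2) percentiles.
    lower = f.ppf(alpha / 2, dfn, dfd)
    upper = f.ppf(1 - alpha / 2, dfn, dfd)

    # Two-tailed p-value: twice the smaller of the two tail probabilities.
    cdf = f.cdf(f_obs, dfn, dfd)
    p_value = 2 * min(cdf, 1 - cdf)

    reject = bool(f_obs < lower or f_obs > upper)
    return f_obs, (lower, upper), p_value, reject

# Made-up samples, for illustration only.
sample_a = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]
sample_b = [7.2, 3.1, 9.4, 5.0, 2.2, 8.1, 6.3, 4.0, 10.1, 1.9]

f_obs, (lower, upper), p_value, reject = f_test_two_tailed(sample_a, sample_b)
print(f"F* = {f_obs:.3f}, acceptance interval = [{lower:.3f}, {upper:.3f}]")
print(f"p = {p_value:.4f}, reject H0 at the 5% level: {reject}")
```

Computing both percentiles explicitly mirrors the rule stated above; an equivalent decision is obtained by comparing `p_value` with `alpha`.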
Example 4.6
Q: Consider the two independent samples shown in Table 4.4 of normally
distributed random variables. Test whether or not one should reject at a 5%