Page 152 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
4 Parametric Tests of Hypotheses
4.4.3 Comparing Two Means
4.4.3.1 Independent Samples and Paired Samples
Deciding whether two samples came from normally distributed populations with
the same mean or with different means is a common requirement in data
analysis. The test is formalised as:
$H_0$: $\mu_A = \mu_B$ (or $\mu_A - \mu_B = 0$, whence the name “null hypothesis”),
$H_1$: $\mu_A \neq \mu_B$, for a two-sided test;
$H_0$: $\mu_A \leq \mu_B$, $H_1$: $\mu_A > \mu_B$, or
$H_0$: $\mu_A \geq \mu_B$, $H_1$: $\mu_A < \mu_B$, for a one-sided test.
In tests of hypotheses involving two or more samples one must first clarify if the
samples are independent or paired, since this will radically influence the methods
used.
Imagine that two measurement devices, A and B, performed repeated and
normally distributed measurements on the same object:
$x_1, x_2, \ldots, x_n$ with device A;
$y_1, y_2, \ldots, y_n$ with device B.
The sets $\mathbf{x} = [x_1\ x_2\ \ldots\ x_n]'$ and $\mathbf{y} = [y_1\ y_2\ \ldots\ y_n]'$ constitute independent samples
generated according to $N_{\mu_A,\sigma_A}$ and $N_{\mu_B,\sigma_B}$, respectively. Assuming that device B
introduces a systematic deviation $\Delta$, i.e., $\mu_B = \mu_A + \Delta$, our statistical model has 4
parameters: $\mu_A$, $\Delta$, $\sigma_A$ and $\sigma_B$.
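The four-parameter model can be made concrete with a small simulation. The following is only an illustrative sketch in Python (standard library only); the function name `simulate_independent` and all numeric values (mean 50, deviation 0.5, standard deviations 1.0 and 1.5, sample size 10 000) are assumptions chosen for illustration, not values from the text.

```python
import random
import statistics


def simulate_independent(n, mu_a, delta, sigma_a, sigma_b, seed=0):
    """Draw n readings per device, both devices measuring the same object.

    Device A follows N(mu_a, sigma_a); device B follows N(mu_a + delta, sigma_b).
    All parameter values passed below are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = [rng.gauss(mu_a, sigma_a) for _ in range(n)]          # device A
    y = [rng.gauss(mu_a + delta, sigma_b) for _ in range(n)]  # device B
    return x, y


x, y = simulate_independent(n=10_000, mu_a=50.0, delta=0.5,
                            sigma_a=1.0, sigma_b=1.5)

# With independent samples, the natural estimate of the systematic
# deviation is the difference of the two sample means.
delta_hat = statistics.fmean(y) - statistics.fmean(x)
print(f"estimated deviation: {delta_hat:.3f}")
```

Because both devices measure the same object, the only sources of variation are the two measurement errors, so the difference of sample means should settle near the chosen deviation as n grows.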
Now imagine that the n measurements were performed by A and B on a set of n
different objects. We have a radically different situation, since now we must take
into account the differences among the objects together with the systematic
deviation $\Delta$. For instance, the measurement of object $i$ is described in
probabilistic terms by $N_{\mu_{Ai},\sigma_A}$ when measured by A and by $N_{\mu_{Ai}+\Delta,\sigma_B}$ when
measured by B. The statistical model now has $n + 3$ parameters: $\mu_{A1}, \mu_{A2}, \ldots, \mu_{An}$,
$\Delta$, $\sigma_A$ and $\sigma_B$. The first $n$ parameters reflect, of course, the differences among the $n$
objects. Since our interest is the systematic deviation $\Delta$, we apply the following
trick. We compute the paired differences: $d_1 = y_1 - x_1$, $d_2 = y_2 - x_2$, …, $d_n = y_n - x_n$.
In this paired samples approach, we may now consider the differences $d_i$ as
values of a random variable, $D$, described in probabilistic terms by $N_{\Delta,\sigma_D}$.
Therefore, the statistical model now has only two parameters: $\Delta$ and $\sigma_D$.
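A short simulation can illustrate why the differencing trick works. This is a sketch under stated assumptions, not a prescribed procedure: the function name `simulate_paired`, the uniform spread of object means between 10 and 100, and all numeric values are hypothetical. It further assumes that the two devices' measurement errors are independent, in which case the standard deviation of the differences equals the square root of the sum of the two device variances.

```python
import math
import random
import statistics


def simulate_paired(n, delta, sigma_a, sigma_b, seed=1):
    """Measure n different objects once with each device.

    Object i has its own mean mu_i (hypothetical values, drawn uniformly);
    device A reads N(mu_i, sigma_a), device B reads N(mu_i + delta, sigma_b).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    mu = [rng.uniform(10.0, 100.0) for _ in range(n)]  # object-to-object differences
    x = [rng.gauss(m, sigma_a) for m in mu]            # device A
    y = [rng.gauss(m + delta, sigma_b) for m in mu]    # device B
    d = [yi - xi for xi, yi in zip(x, y)]              # paired differences
    return x, y, d


x, y, d = simulate_paired(n=5_000, delta=0.5, sigma_a=1.0, sigma_b=1.5)

d_mean = statistics.fmean(d)   # estimates the systematic deviation
d_sd = statistics.stdev(d)     # spread of the differences
x_sd = statistics.stdev(x)     # spread of the raw readings (dominated by the objects)
expected_sd = math.sqrt(1.0**2 + 1.5**2)  # valid only if device errors are independent

print(f"mean of d: {d_mean:.3f}, sd of d: {d_sd:.3f}, sd of x: {x_sd:.3f}")
```

The object means cancel in each difference, so the spread of the differences reflects only the two measurement errors, while the raw readings remain dominated by the variation among objects.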
The measurement device example we have been describing is a simple one,
since the objects are assumed to be characterised by only one variable. Often the
situation is more complex because several variables, known as factors, effects or
grouping variables, influence the objects. The central idea in the “independent
samples” study is that the cases are randomly drawn such that all the factors,
except the one we are interested in, average out. For the “paired samples” study