Page 207 - Computational Statistics Handbook with MATLAB

them to travel to work. He uses the sample mean to help determine whether
there is sufficient evidence to reject the null hypothesis and conclude that the
mean travel time has increased. The sample mean that he calculates is 47.2
minutes. This is slightly higher than the mean of 45 minutes for the null
hypothesis. However, the sample mean is a random variable and has some
variation associated with it. If the variance of the sample mean under the null
hypothesis is large, then the observed value of $\bar{x} = 47.2$ minutes might not
be inconsistent with $H_0$. This is explained further in Example 6.1.

Example 6.1
We continue with the transportation example. We need to determine whether
or not the value of the statistic obtained from a random sample drawn from
the population is consistent with the null hypothesis. Here we have a random
sample comprised of n = 100 commute times. The sample mean of these
observations is $\bar{x} = 47.2$ minutes. If the transportation official assumes that
the travel times to work are normally distributed with $\sigma = 15$ minutes (one
might know a reasonable value for $\sigma$ based on previous experience with the
population), then we know from Chapter 3 that $\bar{x}$ is approximately normally
distributed with mean $\mu_X$ and standard deviation
$\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$. Standardizing the observed value of the
sample mean, we have

$$z_o = \frac{\bar{x} - \mu_0}{\sigma_X / \sqrt{n}} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{X}}} = \frac{47.2 - 45}{15 / \sqrt{100}} = \frac{2.2}{1.5} = 1.47, \qquad (6.1)$$

where $z_o$ is the observed value of the test statistic, and $\mu_0$ is the mean under
the null hypothesis. Thus, the value $\bar{x} = 47.2$ minutes is 1.47 standard
deviations away from the mean, if the null hypothesis is really true. (This is
why we use $\mu_0$ in Equation 6.1.) We know that approximately 95% of
normally distributed random variables fall within two standard deviations
either side of the mean. Thus, $\bar{x} = 47.2$ minutes is not inconsistent with the
null hypothesis.
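
The arithmetic in Equation 6.1 can be checked with a few lines of MATLAB. This
is only a minimal sketch: the variable names (n, xbar, mu0, sigma, se, zobs, p2)
are our own choices for illustration, and the values are those given in
Example 6.1. We use the base function erf rather than a toolbox function to
recover the "approximately 95%" figure quoted above.

```matlab
% Values taken from Example 6.1.
n     = 100;     % sample size
xbar  = 47.2;    % observed sample mean (minutes)
mu0   = 45;      % mean under the null hypothesis (minutes)
sigma = 15;      % assumed population standard deviation (minutes)

% Standard error of the sample mean and the standardized
% test statistic of Equation 6.1.
se   = sigma / sqrt(n);
zobs = (xbar - mu0) / se;

% Probability that a standard normal random variable falls within
% two standard deviations of its mean, written with base erf.
p2 = erf(2 / sqrt(2));

fprintf('se = %.2f, z_o = %.2f\n', se, zobs);   % se = 1.50, z_o = 1.47
fprintf('P(|Z| <= 2) = %.4f\n', p2);            % P(|Z| <= 2) = 0.9545
```

Since the observed statistic of 1.47 is less than 2, it lies inside the band
that holds roughly 95% of the probability under the null hypothesis, which
matches the conclusion of the example.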

In hypothesis testing, the rule that governs our decision might be of the
form: if the observed statistic is within some region, then we reject the null
hypothesis. The critical region is an interval for the test statistic over which we
would reject $H_0$. This is sometimes called the rejection region. The critical
value is that value of the test statistic that divides the domain of the test
statistic into a region where $H_0$ will be rejected and one where $H_0$ will be
accepted. We need to know the distribution of the test statistic under the null
hypothesis to find the critical value(s).
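
To make the idea of a critical value concrete, the following sketch finds the
critical value for the transportation example and applies the decision rule.
Several things here are assumptions made for illustration and are not stated in
the text: the tolerated error alpha = 0.05, the use of an upper-tail region
(chosen because the alternative is that the mean travel time has increased),
and the variable names. The critical value is computed with the base function
erfcinv; with the Statistics Toolbox, norminv(1 - alpha, 0, 1) should give the
same number.

```matlab
alpha = 0.05;    % ASSUMED amount of error we tolerate; not given in the text

% Observed test statistic from Equation 6.1.
zobs = (47.2 - 45) / (15 / sqrt(100));

% Upper-tail critical value of the standard normal distribution:
% the point with area alpha to its right. Since
% P(Z > z) = 0.5*erfc(z/sqrt(2)), we solve erfc(z/sqrt(2)) = 2*alpha.
zcrit = sqrt(2) * erfcinv(2 * alpha);

% Decision rule: reject H0 only if the statistic is in the critical region.
rejectH0 = zobs > zcrit;

fprintf('critical value = %.3f, z_o = %.3f\n', zcrit, zobs);
fprintf('reject H0: %d\n', rejectH0);
```

Under these assumptions the critical value is about 1.645, so the observed
statistic of 1.47 falls outside the critical region and the null hypothesis is
not rejected.
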
The critical region depends on the distribution of the statistic under the
null hypothesis, the alternative hypothesis, and the amount of error we are
willing to tolerate. Typically, the critical regions are areas in the tails of the
distribution of the test statistic when $H_0$ is true. It could be in the lower tail,
