Page 207 - Computational Statistics Handbook with MATLAB

                             them to travel to work. He uses the sample mean to help determine whether
                             there is sufficient evidence to reject the null hypothesis and conclude that the
                             mean travel time has increased. The sample mean that he calculates is 47.2
                             minutes. This is slightly higher than the mean of 45 minutes for the null
                             hypothesis. However, the sample mean is a random variable and has some
                             variation associated with it. If the variance of the sample mean under the null
                             hypothesis is large, then the observed value of $\bar{x} = 47.2$ minutes might not
                             be inconsistent with $H_0$. This is explained further in Example 6.1.

                             Example 6.1
                             We continue with the transportation example. We need to determine whether
                             or not the value of the statistic obtained from a random sample drawn from
                             the population is consistent with the null hypothesis. Here we have a random
                             sample comprised of $n = 100$ commute times. The sample mean of these
                             observations is $\bar{x} = 47.2$ minutes. If the transportation official assumes that
                             the travel times to work are normally distributed with $\sigma = 15$ minutes (one
                             might know a reasonable value for $\sigma$ based on previous experience with the
                             population), then we know from Chapter 3 that $\bar{x}$ is approximately normally
                             distributed with mean $\mu_X$ and standard deviation $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$.
                             Standardizing the observed value of the sample mean, we have

                             $$ z_o = \frac{\bar{x} - \mu_0}{\sigma_X / \sqrt{n}} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{X}}} = \frac{47.2 - 45}{15 / \sqrt{100}} = \frac{2.2}{1.5} = 1.47, \tag{6.1} $$

                             where $z_o$ is the observed value of the test statistic, and $\mu_0$ is the mean under
                             the null hypothesis. Thus, we have that the value of $\bar{x} = 47.2$ minutes is 1.47
                             standard deviations away from the mean, if the null hypothesis is really true.
                             (This is why we use $\mu_0$ in Equation 6.1.) We know that approximately 95% of
                             normally distributed random variables fall within two standard deviations
                             either side of the mean. Thus, $\bar{x} = 47.2$ minutes is not inconsistent with the
                             null hypothesis.
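
The arithmetic in Equation 6.1 can be reproduced with a few lines of MATLAB. This is only a minimal sketch using the numbers from Example 6.1. The variable names (xbar, mu0, sigma, n, se, zobs, withinTwoSd) are ours and are chosen purely for illustration.

```matlab
% Values taken from Example 6.1.
xbar  = 47.2;   % observed sample mean (minutes)
mu0   = 45;     % mean under the null hypothesis (minutes)
sigma = 15;     % assumed population standard deviation (minutes)
n     = 100;    % sample size

% Standard error of the sample mean, sigma / sqrt(n).
se = sigma / sqrt(n);

% Observed value of the test statistic, as in Equation 6.1.
zobs = (xbar - mu0) / se;

fprintf('standard error = %.2f, z_o = %.2f\n', se, zobs);

% Informal check used in the example: is the observed statistic
% within two standard deviations of the mean under the null hypothesis?
withinTwoSd = abs(zobs) < 2;
```

Here withinTwoSd evaluates to true, which matches the conclusion that the observed sample mean is not inconsistent with the null hypothesis.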

                              In hypothesis testing, the rule that governs our decision might be of the
                             form: if the observed statistic is within some region, then we reject the null
                             hypothesis. The critical region is an interval for the test statistic over which we
                             would reject $H_0$. This is sometimes called the rejection region. The critical
                             value is that value of the test statistic that divides the domain of the test
                             statistic into a region where $H_0$ will be rejected and one where $H_0$ will be
                             accepted. We need to know the distribution of the test statistic under the null
                             hypothesis to find the critical value(s).
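
As an illustration of how a critical value defines the decision rule, the following sketch finds an upper-tail critical value for the standardized statistic of Example 6.1. The significance level alpha = 0.05 is our assumption and is not given in the text. The upper tail is chosen because the alternative of interest is that the mean travel time has increased. The names alpha, zcrit, and rejectH0 are hypothetical.

```matlab
alpha = 0.05;   % ASSUMPTION: significance level, not specified in the text

% Observed test statistic from Equation 6.1.
zobs = (47.2 - 45) / (15 / sqrt(100));

% Upper-tail critical value of the standard normal distribution.
% Written with the base MATLAB function erfcinv so that no toolbox is
% needed; with the Statistics Toolbox, norminv(1 - alpha, 0, 1) should
% return the same value.
zcrit = -sqrt(2) * erfcinv(2 * (1 - alpha));

% Reject H0 only if the observed statistic falls in the critical region.
rejectH0 = zobs > zcrit;

fprintf('critical value = %.3f, reject H0 = %d\n', zcrit, rejectH0);
```

Under this assumed alpha, the critical value is roughly 1.645, so the observed value of 1.47 does not fall in the critical region and the null hypothesis is not rejected.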
                              The critical region depends on the distribution of the statistic under the
                             null hypothesis, the alternative hypothesis, and the amount of error we are
                             willing to tolerate. Typically, the critical regions are areas in the tails of the
                             distribution of the test statistic when $H_0$ is true. It could be in the lower tail,


                             © 2002 by Chapman & Hall/CRC