Page 187 - Statistics for Dummies

                                         Chapter 11: Sampling Distributions and the Central Limit Theorem



Case 1: The distribution of X is normal

If X has a normal distribution, then X̄ does too, no matter what the sample size n is. In the example regarding the amount of time (X) for a clerical worker to complete a task (refer to the section “Sample size and standard error”), you knew X had a normal distribution (refer to the lowest curve in Figure 11-2). If you refer to the other curves in Figure 11-2, you see that the average times for samples of n = 10 and n = 50 clerical workers, respectively, also have normal distributions.

When X has a normal distribution, the sample means also always have a normal distribution, no matter what size samples you take, even if you take samples of only 2 clerical workers at a time.

The difference between the curves in Figure 11-2 is not their means or their shapes, but rather their amount of variability (how close the values in the distribution are to the mean). Results based on large samples vary less and will be more concentrated around the mean than results from small samples or results from the individuals in the population.

Case 2: The distribution of X is not normal; enter the Central Limit Theorem

If X has any distribution that is not normal, or if its distribution is unknown, you can’t automatically say the sample mean (X̄) has a normal distribution. But incredibly, you can use a normal distribution to approximate the distribution of X̄, if the sample size is large enough. This momentous result is due to what statisticians know and love as the Central Limit Theorem.

The Central Limit Theorem (abbreviated CLT) says that if X does not have a normal distribution (or its distribution is unknown and hence can’t be deemed to be normal), the shape of the sampling distribution of X̄ is approximately normal, as long as the sample size, n, is large enough. That is, you get an approximate normal distribution for the means of large samples, even if the distribution of the original values (X) is not normal.

Most statisticians agree that if n is at least 30, this approximation will be reasonably close in most cases, although different distribution shapes for X have different values of n that are needed. The larger the sample size (n), the closer the distribution of the sample means will be to a normal distribution.

Averaging a fair die is approximately normal

Consider the die rolling example from the earlier section “Defining a Sampling Distribution.” Notice in Figure 11-1a, the distribution of X (the population of outcomes based on millions of single rolls) is flat; the individual outcomes of each roll go from 1 to 6, and each outcome is equally likely.

Things change when you look at averages. When you roll a die a large number of times (say a sample of 50 times) and look at your outcomes, you’ll probably find about the same number of 6s as 1s (note that 6 and 1 average out to 3.5); 5s as 2s (5 and 2 also average out to 3.5); and 4s as 3s (which also average out to 3.5; do you see a pattern here?). So if you roll a die 50 times, you have a high probability of getting an overall average that’s close to 3.5. Sometimes just by chance things won’t even out as well, but that won’t happen very often with 50 rolls.

Getting an average at the extremes with 50 rolls is a very rare event. To get an average of 1 on 50 rolls, you need all 50 rolls to be 1. How likely is that? (If it happens to you, buy a lottery ticket right away; it’s the luckiest day of your life!) The same is true for getting an average near 6.

So the chance that your average of 50 rolls is close to the middle (3.5) is highest, and the chance of it being at or close to the extremes (1 or 6) is extremely low. As for averages between 1 and 6, the probabilities get smaller as you move farther from 3.5, and the probabilities get larger as you move closer to 3.5; in particular, statisticians show that the shape of the sampling distribution of sample means in Figure 11-1b is approximately normal as long as the sample size is large enough. (See Chapter 9 for particulars on the shape of the normal distribution.)

Note that if you roll the die even more times, the chance of the average being close to 3.5 increases, and the sampling distribution of the sample means looks more and more like a normal distribution.

Averaging an unfair die is still approximately normal

However, sometimes the values of X don’t occur with equal probability like they do when you roll a fair die. What happens then? For example, say the die isn’t fair, and the average value for many individual rolls turns out to be 2 instead of 3.5. This means the distribution of X is skewed right (more low values like 1, 2, and 3, and fewer high values like 4, 5, and 6). But if the distribution of X (millions of individual rolls of this unfair die) is skewed right, how does the distribution of X̄ (average of 50 rolls of this unfair die) end up with an approximate normal distribution?
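If you want to see the die examples in action, a short simulation can help. The Python sketch below is only an illustration, not something from the book: it computes the chance that all 50 rolls come up 1, then simulates many samples of 50 rolls for a fair die and for a made-up unfair die. The unfair weights (49, 26, 11, 7, 4, 3 out of 100) are an assumption, picked only because they give a right-skewed die whose single rolls average exactly 2. The function names, the seed, and the choice of 5,000 simulated samples are likewise hypothetical.

```python
import random
import statistics

FACES = [1, 2, 3, 4, 5, 6]
FAIR_WEIGHTS = [1, 1, 1, 1, 1, 1]
# Hypothetical unfair die (an assumption, not from the text): weights out of 100
# chosen so single rolls average exactly 2 and the distribution is skewed right.
UNFAIR_WEIGHTS = [49, 26, 11, 7, 4, 3]


def population_mean(weights):
    """Long-run average of a single roll for the given face weights."""
    return sum(f * w for f, w in zip(FACES, weights)) / sum(weights)


def population_sd(weights):
    """Standard deviation of a single roll for the given face weights."""
    mu = population_mean(weights)
    var = sum(w * (f - mu) ** 2 for f, w in zip(FACES, weights)) / sum(weights)
    return var ** 0.5


def simulate_sample_means(weights, n_rolls=50, n_samples=5000, seed=0):
    """Roll the die n_rolls times, average the rolls, and repeat n_samples times."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(FACES, weights=weights, k=n_rolls))
        for _ in range(n_samples)
    ]


def share_within_one_se(means, weights, n_rolls=50):
    """Fraction of sample means within one standard error of the population mean."""
    mu = population_mean(weights)
    se = population_sd(weights) / n_rolls ** 0.5
    return sum(abs(m - mu) <= se for m in means) / len(means)


# Chance that all 50 rolls of a fair die are 1 (the only way to average exactly 1).
P_ALL_ONES = (1 / 6) ** 50

fair_means = simulate_sample_means(FAIR_WEIGHTS)
unfair_means = simulate_sample_means(UNFAIR_WEIGHTS)

print("P(all 50 rolls are 1):", P_ALL_ONES)
for label, means, weights in [
    ("fair", fair_means, FAIR_WEIGHTS),
    ("unfair", unfair_means, UNFAIR_WEIGHTS),
]:
    print(label, "die: single-roll mean =", population_mean(weights))
    print("  average of the sample means =", statistics.fmean(means))
    print("  spread of the sample means  =", statistics.pstdev(means))
    print("  share within one standard error =", share_within_one_se(means, weights))
```

For a roughly normal shape, you would expect about two-thirds of the sample means to fall within one standard error of the population mean. Because averages of 50 die rolls can take only certain values, the simulated share may land a little above or below that, for both the fair die and the skewed one.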














