Page 189 - Statistics for Dummies

Chapter 11: Sampling Distributions and the Central Limit Theorem

Say that one person, Bob, is doing 50 rolls. What will the distribution of
Bob’s outcomes look like? Bob is more likely to get low outcomes (like 1 and 2)
and less likely to get high outcomes (like 5 and 6), so the distribution of
Bob’s outcomes will be skewed right as well.

In fact, because Bob rolled his die a large number of times (50), the
distribution of his individual outcomes has a good chance of matching the
distribution of X (the outcomes from millions of rolls). However, if Bob had
only rolled his die a few times (say, 6 times), he would be unlikely to even
get the higher numbers like 5 and 6, and hence his distribution wouldn’t look
as much like the distribution of X.

If you run through the results of each of a million people like Bob who rolled
this unfair die 50 times, each of their million distributions will look very
similar to each other and very similar to the distribution of X. The more rolls
they make each time, the closer their distributions get to the distribution of
X and to each other. And here is the key: If their distributions of outcomes
have a similar shape, no matter what that similar shape is, then their averages
will be similar as well. Some people will get higher averages than 2 by chance,
and some will get lower averages by chance, but these types of averages get
less and less likely the farther you get from 2. This means you’re getting an
approximate normal distribution centered at 2.

The big deal is, it doesn’t matter if you started out with a skewed
distribution, or some totally wacky distribution for X. Because each of them
had a large sample size (number of rolls), the distributions of each person’s
sample results end up looking similar, so their averages will be similar, close
together, and close to a normal distribution. In fancy lingo, the distribution
of x̄ is approximately normal as long as n is large enough. This is all due to
the Central Limit Theorem.

In order for the CLT to work when X does not have a normal distribution, each
person needs to roll their die enough times (that is, n must be large enough)
so they have a good chance of getting all possible values of X, especially
those outcomes that won’t occur as often. If n is too small, some folks will
not get the outcomes that have low probabilities and their means will differ
from the rest by more than they should. As a result, when you put all the means
together, they may not congregate around a single value. In the end, the
approximate normal distribution may not show up.

Clarifying three major points about the CLT

I want to alert you to a few sources of confusion about the Central Limit
Theorem before they happen to you:

✓ The CLT is needed only when the distribution of X is not a normal
  distribution or is unknown. It is not needed if X started out with a normal
  distribution.
✓ The formulas for the mean and standard error of x̄ are not due to the CLT.
  These are just mathematical results that are always true. To see these
  formulas, check out the sections “The Mean of a Sampling Distribution” and
  “Measuring Standard Error,” earlier in this chapter.
✓ The n stated in the CLT refers to the size of the sample you take each
  time, not the number of samples you take. Bob rolling a die 50 times is one
  sample of size 50, so n = 50. If 10 people do it, you have 10 samples, each
  of size 50, and n is still 50.

Finding Probabilities for the Sample Mean

After you’ve established through the conditions addressed in case 1 or case 2
(see the previous sections) that x̄ has a normal or approximately normal
distribution, you’re in luck. The normal distribution is a very friendly
distribution that has a table for finding probabilities and anything else you
need. For example, you can find probabilities for x̄ by converting the x̄-value
to a z-value and finding probabilities using the Z-table (provided in the
appendix). (See Chapter 9 for all the details on the normal and
Z-distributions.)

The general conversion formula from x̄-values to z-values is:

$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$$

Substituting the appropriate values of the mean and standard error of x̄, the
conversion formula becomes:

$$z = \frac{\bar{x} - \mu_X}{\sigma_X / \sqrt{n}}$$
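The conversion from a sample mean to a z-value described in this section can
be sketched in a few lines of Python. This is only an illustration under
stated assumptions: the function names are made up for this example, the
population standard deviation of 1.2 is a hypothetical value (only the mean of
2 comes from the die example), and the standard library's NormalDist stands in
for looking up the Z-table by hand.

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Convert a sample mean to a z-value: z = (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)  # standard error of the sample mean
    return (x_bar - mu) / standard_error


def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar), assuming the sample mean is (approximately) normal."""
    z = z_for_sample_mean(x_bar, mu, sigma, n)
    return NormalDist().cdf(z)  # standard normal CDF, replacing the Z-table lookup


# Hypothetical inputs: population mean 2 (as in the die example),
# an assumed population standard deviation of 1.2, and n = 50 rolls.
mu, sigma, n = 2.0, 1.2, 50
z = z_for_sample_mean(2.3, mu, sigma, n)
p = prob_sample_mean_below(2.3, mu, sigma, n)
print(f"z = {z:.2f}, P(sample mean < 2.3) = {p:.4f}")
```

Notice that n here is the size of one sample (50 rolls), not the number of
samples, which matches the third point about the CLT.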

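The die-rolling argument in this section can also be checked with a small
simulation. The sketch below is an illustration, not the book's own example:
the face probabilities are hypothetical, chosen only so that the die is skewed
right with a mean of 2, and the function names are invented for this sketch.
It has many simulated people each roll the die n times, records each person's
average, and compares n = 6 with n = 50.

```python
import random
from statistics import mean, stdev

FACES = [1, 2, 3, 4, 5, 6]
# Hypothetical probabilities for an unfair, right-skewed die with mean 2.0.
WEIGHTS = [0.46, 0.28, 0.14, 0.06, 0.04, 0.02]


def sample_mean(n, rng):
    """Roll the unfair die n times (one sample of size n) and return the average."""
    return mean(rng.choices(FACES, weights=WEIGHTS, k=n))


def simulate_sample_means(n, num_samples, seed=0):
    """Return the averages from num_samples people, each rolling n times."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [sample_mean(n, rng) for _ in range(num_samples)]


for n in (6, 50):
    means = simulate_sample_means(n, num_samples=10_000)
    print(f"n = {n}: mean of averages = {mean(means):.3f}, "
          f"spread of averages = {stdev(means):.3f}")
```

With these assumed probabilities, the averages should center near 2 for both
sample sizes, but they cluster much more tightly when n = 50 than when n = 6.
A histogram of the averages for n = 50 should also look closer to a bell
shape, which is the behavior the CLT describes.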