Page 182 - Statistics for Dummies

166 Part III: Distributions and the Central Limit Theorem

This result is no coincidence! In general, the mean of the population of all
possible sample means is the same as the mean of the original population.
(Notationally speaking, you write $\mu_{\bar{X}} = \mu_X$.) It’s a mouthful, but it makes sense
that the average of the averages from all possible samples is the same as
the average of the population that the samples came from. In the die rolling
example, the average of the population of all 50-roll averages equals the
average of the population of all single rolls (3.5).
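
As a quick check of this claim, here is a minimal Python sketch that lists every possible sample of die rolls and averages the sample means exactly. Enumerating all 50-roll samples is not feasible, so the sketch assumes much smaller sample sizes (1 to 4 rolls); the function name `mean_of_sample_means` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # the six faces of a fair die

# Mean of the population of single rolls: (1 + 2 + ... + 6) / 6 = 7/2
population_mean = Fraction(sum(FACES), len(FACES))


def mean_of_sample_means(n):
    """Average the sample mean over every equally likely sample of n rolls."""
    sample_means = [Fraction(sum(rolls), n) for rolls in product(FACES, repeat=n)]
    return sum(sample_means) / len(sample_means)


# Small sample sizes only: there are 6**n possible samples to list.
for n in (1, 2, 3, 4):
    print(n, mean_of_sample_means(n), mean_of_sample_means(n) == population_mean)
```

Exact fractions are used instead of floating-point numbers so the equality holds without rounding error.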

Using subscripts on $\mu$, you can distinguish which mean you’re talking
about: the mean of X (all individuals in a population) or the mean of
$\bar{X}$ (all sample means from the population).

Measuring Standard Error

The values in any population deviate from their mean; for instance, people’s
heights differ from the overall average height. Variability in a population of
individuals (X) is measured in standard deviations (see Chapter 5 for details
on standard deviation). Sample means vary because you’re not sampling the
whole population, only a subset; and as samples vary, so will their means.
Variability in the sample mean ($\bar{X}$) is measured in terms of standard errors.
Error here doesn’t mean there’s been a mistake — it means there is a gap
between the population and sample results.

The standard error of the sample mean is denoted by $\sigma_{\bar{X}}$ (sigma sub-x-bar). Its
formula is $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$, where $\sigma_X$ is the population standard deviation (sigma sub-x) and
n is the size of each sample. In the next sections you see the effect each of these
two components has on the standard error.
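
The formula is simple enough to write as a small Python function. This is only a sketch: the name `standard_error` and the input checks are choices made for this illustration, and the sample size of 9 in the usage line is an arbitrary value.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: population SD divided by sqrt(n)."""
    if sigma < 0:
        raise ValueError("the population standard deviation cannot be negative")
    if n < 1:
        raise ValueError("the sample size must be at least 1")
    return sigma / math.sqrt(n)


# Illustrative values: a population SD of 3 and samples of size 9.
print(standard_error(3, 9))  # prints 1.0
```

With a sample size of 1 the function returns the population standard deviation itself, because a "sample mean" of one value is just an individual value.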

Sample size and standard error

The first component of standard error is the sample size, n. Because n is in
the denominator of the standard error formula, the standard error decreases
as n increases. It makes sense that having more data gives less variation (and
more precision) in your results.
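
To see this effect in action, the following Python sketch simulates many samples at several sample sizes and compares the spread of their means with the formula. It assumes a normal population with mean 10.5 and standard deviation 3; the sample sizes (4, 16, 64), the 5,000 repetitions, and the random seed are arbitrary choices for this illustration.

```python
import math
import random
import statistics

MU = 10.5            # assumed population mean
SIGMA = 3.0          # assumed population standard deviation
REPLICATIONS = 5000  # number of samples drawn for each sample size


def simulated_standard_error(n, rng):
    """Standard deviation of REPLICATIONS sample means, each from n draws."""
    means = [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(REPLICATIONS)
    ]
    return statistics.pstdev(means)


rng = random.Random(11)  # fixed seed so reruns give the same numbers
results = {n: simulated_standard_error(n, rng) for n in (4, 16, 64)}

for n, simulated in results.items():
    formula = SIGMA / math.sqrt(n)
    print(f"n={n:>2}  simulated={simulated:.3f}  formula={formula:.3f}")
```

Because n sits under a square root, each fourfold increase in sample size should cut the standard error roughly in half; the simulated values should land close to 1.5, 0.75, and 0.375, though not exactly on them.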

Suppose X is the time it takes for a clerical worker to type and send one letter of
recommendation, and say X has a normal distribution with mean 10.5 minutes
and standard deviation 3 minutes. The bottom curve in Figure 11-2 shows the
picture of the distribution of X, the individual times for all clerical workers in the
population. According to the Empirical Rule (see Chapter 9), almost all of the values (about 99.7 percent)
are within 3 standard deviations of the mean (10.5), that is, between 1.5 and 19.5.
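
The arithmetic behind these endpoints can be checked with a short Python sketch. It uses the mean (10.5) and standard deviation (3) from the example; the helper name `empirical_rule_interval` is made up for this illustration, and the coverage figures come from Python's built-in `statistics.NormalDist` rather than from the book.

```python
from statistics import NormalDist

MEAN = 10.5  # minutes, from the typing-time example
SD = 3.0     # minutes, from the typing-time example


def empirical_rule_interval(mean, sd, k):
    """Endpoints of the interval within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)


times = NormalDist(mu=MEAN, sigma=SD)
intervals = {k: empirical_rule_interval(MEAN, SD, k) for k in (1, 2, 3)}

# Share of a normal population that falls inside each interval.
coverage = {k: times.cdf(hi) - times.cdf(lo) for k, (lo, hi) in intervals.items()}

for k in (1, 2, 3):
    lo, hi = intervals[k]
    print(f"within {k} SD: {lo} to {hi} minutes, about {coverage[k]:.1%} of values")
```

The three intervals correspond to the familiar 68, 95, and 99.7 percent figures of the Empirical Rule.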
