Page 73 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics
In the case of the population mean, you use the sample mean to estimate it.
The sample mean has a standard error of $\sigma/\sqrt{n}$. In this formula, you can see the
population standard deviation (σ) and the sample size (n).
If you think about it though, why would you know the standard deviation of
the population, σ, when you don’t even know the mean (recall that the mean
is what you’re trying to estimate)? To handle this additional unknown, do
what statisticians always do — estimate it and move on. So you estimate σ,
the population standard deviation, using (what else?) the standard deviation
of the sample, denoted by s. So you replace σ by s in the formula for the stan-
dard error of the mean.
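To see the substitution of s for σ in action, here is a minimal Python sketch that computes the estimated standard error, s divided by the square root of n, from raw data. The function name `standard_error` and the five listening times in `hours` are made up for illustration; they aren't from the MP3 player survey.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    n = len(sample)
    return s / math.sqrt(n)

# Hypothetical listening times in hours, invented for this sketch.
hours = [2.1, 2.8, 2.4, 3.0, 2.2]

print(f"sample mean:    {statistics.mean(hours):.2f}")
print(f"standard error: {standard_error(hours):.4f}")
```

Note that `statistics.stdev` divides by n – 1, which is the usual definition of the sample standard deviation s.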
To estimate the population mean by using a confidence interval when σ is
unknown, you use the formula $\bar{x} \pm t_{n-1}\left(\frac{s}{\sqrt{n}}\right)$. This formula contains the sample
standard deviation (s), the sample size (n), and a t-value representing how
many standard errors you want to add and subtract to get the confidence
you need. To get the margin of error for the mean, you see the standard error,
$s/\sqrt{n}$, is being multiplied by a factor of t. Notice that t has n – 1 as a subscript
to indicate which of the myriad t-distributions you use for your confidence
interval. The n – 1 is called degrees of freedom, where n is the sample size.
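The confidence interval formula for an unknown σ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `t_confidence_interval` and the data in `hours` are hypothetical, and the t-value is passed in by hand after being looked up in a t-table (2.776 is the standard table value for 95 percent confidence with 4 degrees of freedom).

```python
import math
import statistics

def t_confidence_interval(sample, t_value):
    """Return (low, high) for the mean: x-bar +/- t * s / sqrt(n).

    t_value must come from the t-distribution with n - 1 degrees
    of freedom at the confidence level you want.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    margin = t_value * std_error  # margin of error
    return mean - margin, mean + margin

# Hypothetical listening times in hours, invented for this sketch.
hours = [2.1, 2.8, 2.4, 3.0, 2.2]

# n = 5, so degrees of freedom = 4; the 95 percent t-table value is 2.776.
T_95_DF4 = 2.776
low, high = t_confidence_interval(hours, T_95_DF4)
print(f"95% confidence interval: {low:.2f} to {high:.2f} hours")
```

With only five observations, the t-value is noticeably larger than 2, so the interval is wider than the "about two standard errors" rule of thumb would suggest.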
The value of t in this case represents the number of standard errors you add
to and subtract from the sample mean to get the confidence you want. If you
want to be 95 percent confident, for example, you add and subtract about two
of those standard errors (when the sample is reasonably large). If you want to
be 99.7 percent confident, you add and subtract about three of them. (Table A-1
in the Appendix presents the t-distribution, from which you can find t-values
for any confidence level you want.)
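You can check the "about two" and "about three" rules of thumb against the standard normal distribution, which is what the t-distribution approaches for large samples. The sketch below uses `NormalDist` from Python's standard library (available in Python 3.8 and later); the helper name `z_multiplier` is just an illustrative choice.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Number of standard errors to add and subtract for a given
    confidence level, using the standard normal distribution."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.95, 0.997):
    print(f"{level:.1%} confidence: about {z_multiplier(level):.2f} standard errors")
```

The 95 percent multiplier comes out close to 1.96 and the 99.7 percent multiplier close to 2.97, which is where "about two" and "about three" come from.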
If you do know the population standard deviation for some reason, you would
certainly use it. In that case, you use the corresponding number from the
Z-distribution (standard normal distribution) in the confidence interval for-
mula. (The Z-distribution from your intro stat book can give you the numbers
you need.) Or if you know σ and have a large sample size, you can simply use
the bottom line of the t-distribution, because a t-distribution with a large
number of degrees of freedom gives very similar values to the Z-distribution.
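To see how t-values settle toward the Z-value as the degrees of freedom grow, here is a rough sketch that finds t-values numerically using only Python's standard library. It is not how a t-table or a statistics package does the job; the helper names `t_pdf`, `t_cdf`, and `t_quantile` are hypothetical, the integration uses Simpson's rule, and the search assumes at least 2 degrees of freedom and a confidence level that keeps the answer below 50.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule over [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Find t with P(T <= t) = p by bisection (assumes 0.5 < p < 1, df >= 2)."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)  # 95 percent confidence, two-sided
for df in (5, 30, 1000):
    print(f"df = {df:>4}: t = {t_quantile(0.975, df):.3f}  (Z = {z:.3f})")
```

By 1,000 degrees of freedom the t-value and the Z-value agree to about two decimal places, which is why the bottom line of a t-table can stand in for the Z-distribution.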
For the MP3 player example from the preceding section, a random sample of
1,000 OSU students spends an average of 2.5 hours using their MP3 players to
listen to music. The standard deviation is 0.5 hours. Plugging this information
into the formula for a confidence interval, you get
$2.5 \pm 1.96\left(\frac{0.5}{\sqrt{1{,}000}}\right) = 2.5 \pm 0.03$ hours.
You can conclude with 95 percent confidence that OSU MP3-player owners spend
an average of between 2.47 and 2.53 hours listening to music on their players. (The value
for t in this example came from the last line of Table A-1 in the Appendix,
because this line represents the situation where n is large.)
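The arithmetic in the MP3 player example is easy to verify. The sketch below plugs the summary numbers from the example (mean 2.5, standard deviation 0.5, n = 1,000, multiplier 1.96) into the confidence interval formula; the function name `interval_from_summary` is an illustrative choice.

```python
import math

def interval_from_summary(mean, sd, n, multiplier):
    """Confidence interval from summary statistics:
    mean +/- multiplier * sd / sqrt(n)."""
    margin = multiplier * sd / math.sqrt(n)
    return mean - margin, mean + margin, margin

low, high, margin = interval_from_summary(mean=2.5, sd=0.5, n=1000, multiplier=1.96)
print(f"margin of error: {margin:.2f} hours")
print(f"interval: {low:.2f} to {high:.2f} hours")
```

Rounded to two decimal places, the margin of error is 0.03 hours and the interval runs from 2.47 to 2.53 hours, matching the result in the text.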