Page 75 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics
customers are constantly using the gas pumps, so you basically have no time
between customers, and that model holds day after day. At gas station #2,
customers sometimes come all at once, and sometimes you don’t see a
single person for an hour or more. So the time between customers varies
quite a bit.
For which gas station would it be easier to estimate the average time
between customers? Gas station #1 is much more consistent, which
corresponds to a smaller standard deviation of times between customers.
Gas station #2 has much more variability in its times between customers,
so its average is harder to pin down. That means σ for gas station #1 is
smaller than σ for gas station #2.
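
To make the comparison concrete, here is a minimal Python sketch. The waiting times (in minutes) are made-up numbers chosen only to mimic the two stations described above; they are not data from the text. The sample standard deviation computed here serves as an estimate of σ.

```python
import statistics

# Hypothetical times between customers, in minutes (invented for illustration).
station_1 = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]     # steady stream of customers
station_2 = [0, 0, 1, 0, 75, 0, 2, 0, 90, 1]   # bursts, then long gaps

# Sample standard deviation of each station's times between customers.
sd_1 = statistics.stdev(station_1)
sd_2 = statistics.stdev(station_2)

for name, times, sd in [("Station #1", station_1, sd_1),
                        ("Station #2", station_2, sd_2)]:
    mean = statistics.mean(times)
    print(f"{name}: mean = {mean:.1f} min, standard deviation = {sd:.1f} min")
```

With numbers like these, station #2 shows a far larger standard deviation, which is why its average is harder to estimate from the same amount of data.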
Sample size and margin of error
Sample size affects margin of error in a very intuitive way. Suppose you’re
trying to estimate the average number of pets per household in your city.
Which sample size would give you better information: 10 homes or 100
homes? You’d agree that 100 homes would give more precise information
(as long as the data on those 100 homes was collected properly).
If you have more data to base your conclusions on, and that data is collected
properly, your results will be more precise. Precision is measured by margin
of error; so as the sample size increases, the margin of error of your estimate
goes down. That’s why you typically see an n (sample size) in the
denominator of margin of error formulas. In the formula for the margin of
error of the sample mean, z* · σ/√n, the sample size n appears (under a
square root) in the denominator.
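
As a rough illustration of how sample size drives precision, the sketch below computes the margin of error of a sample mean for 10 and 100 homes using the standard formula z* times σ divided by the square root of n. The value of σ (1.5 pets) and the 95 percent critical value of 1.96 are assumptions made for the example, not figures from the text.

```python
import math

def margin_of_error(sigma, n, z_star=1.96):
    """Margin of error for a sample mean: z* times sigma / sqrt(n)."""
    return z_star * sigma / math.sqrt(n)

sigma = 1.5  # assumed standard deviation of pets per household (hypothetical)

moe_10 = margin_of_error(sigma, 10)
moe_100 = margin_of_error(sigma, 100)

print(f"n = 10:  margin of error = {moe_10:.2f} pets")
print(f"n = 100: margin of error = {moe_100:.2f} pets")
```

Because n sits under a square root, a sample ten times larger shrinks the margin of error by a factor of about 3.2 (the square root of 10), not by a factor of 10.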
A bigger sample is better only if the data is collected properly.
That is, you should find no bias in the way the members of the sample were
selected or in the way the data was collected on those subjects. If the quality
of the data can’t be maintained with a larger sample size, the larger sample
does no good.
Confidence level and margin of error
The amount of confidence you need to have differs from problem to problem.
Suppose you’re estimating the mean weight that an elevator can hold. You
would want to be pretty confident about your results, right? But, if you
wanted to estimate the percentage of females that may come to your party on
Saturday night, you may not need to be so confident (despite the desperation
you see in your single buddies’ eyes). For each problem at hand, you have to
address how confident you need to be in your results over the long term,
and, of course, more confidence comes with a price in the margin of error
formula. This level of confidence in your results over the long term is reflected
in a number called the confidence level, reported as a percentage. In general,
more confidence requires a wider range of likely values. Ninety-five percent
is the most common confidence level statisticians use.
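
The price of extra confidence can be sketched numerically. The example below assumes a normal-based margin of error for a sample mean with a known σ, and it uses hypothetical values for σ and n. The helper name z_star is introduced here for illustration only; it looks up the critical value for a given confidence level using Python's standard library.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value leaving (1 - level) / 2 in each tail of the standard normal."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

sigma, n = 1.5, 100  # hypothetical standard deviation and sample size

for level in (0.80, 0.90, 0.95, 0.99):
    z = z_star(level)
    moe = z * sigma / n ** 0.5  # margin of error for the sample mean
    print(f"{level:.0%} confidence: z* = {z:.2f}, margin of error = {moe:.3f}")
```

With σ and n held fixed, each step up in confidence level raises the critical value and therefore widens the margin of error.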