Page 30 - Statistics II for Dummies

14       Part I: Tackling Data Analysis and Model-Building Basics



                                Sample statistic


                                Typically you can’t determine population parameters exactly; you can only
                                estimate them. But all is not lost; by taking a sample (a subset of individuals)
                                from the population and studying it, you can come up with a good estimate
                                of the population parameter. A sample statistic is a single number that
                                summarizes that subset.

                                For example, in the cellphone scenario from the previous section, you select
                                a sample of teenagers and measure the duration of their cellphone calls over
                                a period of time (or look at their cellphone records if you can gain access
                                legally). You then take the average of the call durations. The average
                                duration of 100 cellphone calls may be, say, 12.2 minutes; this average
                                is a statistic. This particular statistic is called the sample mean because it’s
                                the average value from your sample data.
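
As a rough illustration of the sample mean, the short Python sketch below averages a handful of call durations. The five values and the names call_durations and sample_mean are made up for this example (they are not data from the cellphone scenario); they were simply chosen so the average matches the 12.2 minutes mentioned above.

```python
from statistics import mean

# Hypothetical call durations in minutes (illustrative values, not real data).
call_durations = [10.5, 14.0, 9.8, 15.2, 11.5]


def sample_mean(values):
    """Return the arithmetic average of the sampled values."""
    return mean(values)


sample_mean_minutes = sample_mean(call_durations)
print(f"Sample mean: {sample_mean_minutes:.1f} minutes")
```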

                                Many different statistics are available to study different characteristics of a
                                sample, such as the proportion, the median, and standard deviation.
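
The sketch below shows, under assumed data, how the other statistics named here could be computed from one sample. The durations and the "longer than 12 minutes" cutoff are hypothetical choices made only to have something to count for the proportion.

```python
from statistics import median, stdev

# Hypothetical call durations in minutes (illustrative values only).
call_durations = [10.5, 14.0, 9.8, 15.2, 11.5]

# Hypothetical yes/no characteristic: was the call longer than 12 minutes?
long_call_flags = [duration > 12 for duration in call_durations]

# Median: the middle value once the durations are sorted.
sample_median = median(call_durations)

# Sample standard deviation (uses the n - 1 denominator).
sample_sd = stdev(call_durations)

# Proportion: the share of calls that have the characteristic.
sample_proportion = sum(long_call_flags) / len(long_call_flags)

print(f"Median: {sample_median} minutes")
print(f"Standard deviation: {sample_sd:.2f} minutes")
print(f"Proportion of long calls: {sample_proportion:.0%}")
```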



                                Confidence interval

                                A confidence interval is a range of likely values for a population parameter. A
                                confidence interval is based on a sample and the statistics that come from
                                that sample. The main reason you want to provide a range of likely values
                                rather than a single number is that sample results vary.

                                For example, suppose you want to estimate the percentage of people who eat
                                chocolate. According to the Simmons Research Bureau, 78 percent of adults
                                reported eating chocolate, and of those, 18 percent admitted eating sweets
                                frequently. What’s missing in these results? These numbers are only from
                                a single sample of people, and those sample results are guaranteed to vary
                                from sample to sample. You need some measure of how much you can expect
                                those results to move if you were to repeat the study.
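
To see how sample results move from one sample to the next, here is a small simulation sketch. It assumes, purely for illustration, a population in which exactly 78 percent of people eat chocolate and draws several independent samples of 1,000 people each; the function name simulate_sample_proportions and the fixed seed are choices made for this example, not anything taken from the survey.

```python
import random


def simulate_sample_proportions(true_proportion, sample_size, num_samples, seed=0):
    """Draw repeated samples and return the proportion of 'yes' answers in each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Each person says 'yes' with probability true_proportion.
        yes_count = sum(rng.random() < true_proportion for _ in range(sample_size))
        proportions.append(yes_count / sample_size)
    return proportions


# Assumed population value of 78 percent, 20 samples of 1,000 people each.
proportions = simulate_sample_proportions(0.78, 1000, 20)
for number, proportion in enumerate(proportions, start=1):
    print(f"Sample {number}: {proportion:.1%}")
```

Each printed percentage should land near 78 percent without matching it exactly, which is the sample-to-sample variation a confidence interval is meant to account for.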

                                This expected variation in your statistic from sample to sample is measured
                                by the margin of error: the number of standard deviations of your statistic
                                that you add and subtract to reach a chosen level of confidence in your
                                results (see Chapter 3 for more on margin of error). If the chocolate-eater
                                results were based on 1,000 people, the margin of error would be
                                approximately 3 percent. This means the actual percentage of people who eat
                                chocolate in the entire population is estimated to be 78 percent ± 3 percent
                                (that is, between 75 percent and 81 percent).
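
The arithmetic behind a margin of error like this one can be sketched as follows. The passage does not state a confidence level or a formula, so this example assumes a 95 percent confidence level and the usual normal-approximation formula for a proportion, z * sqrt(p(1 - p) / n); the function name margin_of_error is likewise just a label for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_proportion, sample_size, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    # z is the number of standard errors needed for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(sample_proportion * (1 - sample_proportion) / sample_size)
    return z * standard_error


# Figures from the chocolate example; the 95 percent level is an assumption.
p_hat = 0.78
n = 1000

moe = margin_of_error(p_hat, n)
lower, upper = p_hat - moe, p_hat + moe

# Conservative version that uses p = 0.5, the value giving the largest margin.
conservative_moe = margin_of_error(0.5, n)

print(f"Margin of error: {moe:.1%}")
print(f"Interval: {lower:.1%} to {upper:.1%}")
print(f"Conservative margin of error: {conservative_moe:.1%}")
```

Under these assumptions the margin comes out at roughly 2.6 percent, and the conservative version at roughly 3.1 percent; both round to the approximately 3 percent quoted above.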












