Page 44 - Statistics II for Dummies

28       Part I: Tackling Data Analysis and Model-Building Basics



                                There is no rule of thumb regarding how large or small the margin of error
                                should be for a quantitative variable; it depends on what the variable is
                                counting or measuring. For example, if you want average household income
                                for the state of New York, a margin of error of plus or minus $5,000 is not
                                unreasonable. If the variable is the average number of steps from the first
                                floor to the second floor of a two-story home in the U.S., the margin of error
                                will be much smaller. Estimates for categorical variables, on the other hand,
                                are percentages; most people want the margin of error for those estimates
                                to be within plus or minus 2 to 3 percentage points.
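
To make the plus-or-minus 2 to 3 percent target concrete, here is a minimal sketch using the standard large-sample margin-of-error formula for a proportion, z * sqrt(p(1 - p) / n). The function names, the 95 percent z-value of 1.96, and the worst-case proportion of 0.5 are assumptions chosen for illustration; they do not come from the text.

```python
import math

def proportion_margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample that is large enough for the
    normal approximation; z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def sample_size_for_margin(margin, p_hat=0.5, z=1.96):
    """Smallest n whose margin of error is at most `margin`.

    p_hat=0.5 is the most conservative (largest-variance) guess.
    """
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

if __name__ == "__main__":
    # Hypothetical poll: 1,000 respondents, 50% answering "yes".
    print(proportion_margin_of_error(0.5, 1000))
    # Respondents needed for a margin of 3 percentage points.
    print(sample_size_for_margin(0.03))
```

Under these assumptions, a margin of about 3 percentage points takes roughly a thousand respondents, which is why national polls so often use samples of about that size.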


                                Making comparisons


                                Suppose you want to look at income (a quantitative variable) and how it
                                relates to a categorical variable, such as gender or region of the country.
                                Your first question may be: Do males still make more money than females?
                                In this case, you can compare the mean incomes of two populations — males
                                and females. This assessment requires a hypothesis test of two means (often
                                called a t-test for independent samples). I present more information on this
                                technique in Chapter 3.
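
As a rough preview of what a two-means comparison computes, the sketch below calculates a t statistic for two independent samples. It uses the unequal-variance (Welch) form, which is one common variant and not necessarily the version presented in Chapter 3. The function name and the income figures are hypothetical, made up purely for illustration.

```python
import math
import statistics

def welch_t_statistic(sample_a, sample_b):
    """Return (t, df) for two independent samples.

    Uses the Welch form, which does not assume equal variances;
    df is the Welch-Satterthwaite approximation.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator), each divided by n.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

if __name__ == "__main__":
    # Hypothetical annual incomes, in thousands of dollars.
    group_1 = [52, 61, 58, 49, 66, 55]
    group_2 = [48, 50, 57, 45, 53, 47]
    t, df = welch_t_statistic(group_1, group_2)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

The t statistic is the difference in sample means measured in standard-error units; turning it into a p-value requires the t-distribution with the reported degrees of freedom.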

                                When comparing the means of more than two groups, don’t simply run every
                                possible t-test on the pairs of means, because you have to control the
                                overall error rate of your analysis. Each extra test is another chance for a
                                false positive, and those chances add up. For example, if you conduct 100
                                hypothesis tests, each at a 5 percent significance level, then on average
                                5 of those 100 tests will come out statistically significant just by chance,
                                even when no real relationship exists in any of them.
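
The arithmetic behind this warning can be sketched in a few lines. The function names are hypothetical, and the familywise calculation assumes the tests are independent and that no real effect exists in any of them, which is a simplification.

```python
from math import comb

def expected_false_positives(num_tests, alpha=0.05):
    """Average number of tests that come out significant by chance."""
    return num_tests * alpha

def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across all tests.

    Assumes independent tests with every null hypothesis true.
    """
    return 1 - (1 - alpha) ** num_tests

def pairwise_comparisons(num_groups):
    """Number of distinct pairs of group means that could be t-tested."""
    return comb(num_groups, 2)

if __name__ == "__main__":
    print(expected_false_positives(100))   # the 100-test example
    print(familywise_error_rate(100))
    pairs = pairwise_comparisons(4)        # four groups
    print(pairs, familywise_error_rate(pairs))
```

With 100 tests, at least one false positive is close to certain (about 0.994 under these assumptions). Even with four groups, the six pairwise t-tests push the overall chance of a false positive to about 0.26, well above the nominal 0.05.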

                                If you want to compare the average wage in different regions of the country
                                (the East, the Midwest, the South, and the West, for example), this comparison
                                requires a more sophisticated analysis because you’re looking at four
                                groups rather than just two. The procedure for comparing more than two
                                means is called analysis of variance (ANOVA, for short), and I discuss this
                                method in detail in Chapters 9 and 10.
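
For a sense of what ANOVA calculates, the following sketch computes the one-way F statistic by hand: the variation between group means divided by the variation within groups. The function name and the wage figures for the four regions are hypothetical, invented for illustration only.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Between-group sum of squares: how far each group mean sits
    # from the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: spread of values around their
    # own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

if __name__ == "__main__":
    # Hypothetical hourly wages (dollars) sampled in four regions.
    east = [24.0, 27.5, 22.0, 26.0, 25.5]
    midwest = [21.0, 23.5, 20.0, 22.5, 24.0]
    south = [19.5, 21.0, 18.0, 22.0, 20.5]
    west = [26.0, 28.5, 25.0, 27.0, 29.5]
    f_stat, df1, df2 = one_way_anova_f([east, midwest, south, west])
    print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

A single F test replaces the six separate pairwise t-tests, so the overall error rate stays at the chosen significance level.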


                                Exploring relationships


                                One of the most common reasons data is collected is to look for relationships
                                between variables. With quantitative variables, the most common type of
                                relationship people look for is a linear relationship; that is, as one variable
                                increases, does the other increase/decrease along with it in a similar way?
                                Relationships between variables are examined using specialized plots
                                and statistics. Because linear relationships are so common, they have their
                                own statistic, called correlation, which measures the strength and direction
                                of a linear relationship. In this section, you find out how statisticians use
                                graphs and statistics to explore relationships, paying particular attention
                                to linear relationships.
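
As a small illustration of the correlation statistic, here is a minimal sketch of the Pearson correlation coefficient computed from its definition. The function name and the data are hypothetical.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length sequences.

    Returns a value between -1 and +1: positive for an uphill
    linear trend, negative for a downhill one.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant data")
    return s_xy / math.sqrt(s_xx * s_yy)

if __name__ == "__main__":
    # Hypothetical data: years of experience and income (thousands).
    experience = [1, 3, 5, 7, 9, 11]
    income = [38, 45, 51, 60, 64, 75]
    print(round(pearson_correlation(experience, income), 3))
```

Correlation captures only linear association, so a plot of the data is still needed to check that a straight-line pattern is a sensible description.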





