Page 33 - Statistics II for Dummies

Chapter 1: Beyond Number Crunching: The Art and Science of Data Analysis


                                Interaction effects can come up in statistical models that use two or more
                                variables to explain or compare outcomes. An interaction means that the
                                effect of one variable on the outcome depends on the level of another, so
                                you can't automatically study the effect of each variable separately; you
                                first have to examine whether an interaction effect is present.

                                For example, suppose medical researchers are studying a new drug for
                                depression and want to know how a low dose versus a high dose affects the
                                change in blood pressure. They also compare the effects for children versus
                                adults. It could be that dosage level affects the blood pressure of adults
                                differently than it affects the blood pressure of children; that would be an
                                interaction effect. This type of model is called a two-way ANOVA model, with
                                a possible interaction effect between the two factors (age group and dosage
                                level). Chapter 11 covers this subject in depth.
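
As a rough illustration of what an interaction looks like in a 2x2 design, the sketch below compares the dose effect in each age group. The cell means, the group labels, and the function names `dose_effect` and `interaction_contrast` are all invented for this example; they are not data or methods from the study described above. It only describes the pattern and is not a two-way ANOVA, which would also test whether the difference is larger than chance variation would explain.

```python
# Hypothetical mean changes in blood pressure for each combination of
# age group and dose. These numbers are invented purely for illustration.
cell_means = {
    ("child", "low"): 2.0,
    ("child", "high"): 3.0,
    ("adult", "low"): 2.0,
    ("adult", "high"): 8.0,
}


def dose_effect(means, age_group):
    """Change in the mean response when going from the low to the high dose."""
    return means[(age_group, "high")] - means[(age_group, "low")]


def interaction_contrast(means):
    """Adult dose effect minus child dose effect.

    A value near zero suggests the dose effect is about the same in both
    age groups; a value far from zero suggests a possible interaction.
    """
    return dose_effect(means, "adult") - dose_effect(means, "child")


child_effect = dose_effect(cell_means, "child")
adult_effect = dose_effect(cell_means, "adult")
contrast = interaction_contrast(cell_means)

print("Dose effect in children:", child_effect)
print("Dose effect in adults:  ", adult_effect)
print("Difference of effects:  ", contrast)
```

With these made-up means, raising the dose changes the adult response much more than the child response, so describing "the" effect of dosage without mentioning age group would be misleading.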


                                Correlation


                                The term correlation is often misused. Statistically speaking, correlation
                                measures the strength and direction of the linear relationship between two
                                quantitative variables (variables that represent only counts or
                                measurements).
                                You aren’t supposed to use correlation to talk about relationships unless the
                                variables are quantitative. For example, it’s wrong to say that a correlation
                                exists between eye color and hair color. (In Chapter 14, you explore associa-
                                tions between two categorical variables.)

                                Correlation is a number between –1.0 and +1.0. A correlation of +1.0
                                indicates a perfect positive relationship; as one variable increases, the
                                other one increases in perfect sync. A correlation of –1.0 indicates a
                                perfect negative relationship between the variables; as one variable
                                increases, the other one decreases in perfect sync. A correlation of 0 means
                                no linear relationship at all shows up between the variables. Most
                                correlations in the real world fall somewhere in between –1.0 and +1.0; the
                                closer to –1.0 or +1.0, the stronger the relationship is; the closer to 0,
                                the weaker the relationship is.
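
To make the –1.0 to +1.0 scale concrete, here is a minimal sketch of the standard (Pearson) correlation calculation in plain Python. The function name `pearson_r` and the three small data sets are assumptions made up for this example. The third data set follows a curve rather than a line, to show that a correlation of 0 rules out only a linear relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y rises in perfect sync with x.
r_up = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

# Made-up data: y falls in perfect sync with x.
r_down = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])

# Made-up data: y = x squared, a clear pattern but not a straight line.
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print("Perfect uphill line:  ", r_up)
print("Perfect downhill line:", r_down)
print("Symmetric curve:      ", r_curve)
```

The first two results sit at the ends of the scale, and the third comes out at 0 even though y is completely determined by x, which is why a scatterplot is worth checking alongside the number.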

                                Figure 1-1 plots the number of coffees sold at football games in Buffalo,
                                New York, against the air temperature (in degrees Fahrenheit) at each game.
                                This data set seems to follow a downhill straight line fairly well,
                                indicating a negative correlation. The correlation turns out to be –0.741;
                                the number of coffees sold has a fairly strong negative relationship with
                                the temperature at the football game. This makes sense because on days when
                                the temperature is low, people get cold and want more coffee. I discuss
                                correlation further, as it applies to model building, in Chapter 4.












