Page 57 - Intermediate Statistics for Dummies
Part I: Data Analysis and Model-Building Basics
straight line and tells you the direction of that line as well. In total, for any
two quantitative variables, x and y, the correlation measures the strength
and direction of their linear relationship. As one increases, what does the
other one do?
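To see what "strength and direction" look like in numbers, here is a minimal sketch that computes the usual Pearson correlation by hand. The function name pearson_r and the data lists are assumptions made up for illustration; they don't come from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for this sketch
x = [7, 8, 9, 10, 11]
y_rising = [18, 20, 22, 24, 26]   # goes up as x goes up
y_falling = [26, 24, 22, 20, 18]  # goes down as x goes up

r_up = pearson_r(x, y_rising)
r_down = pearson_r(x, y_falling)
print(r_up, r_down)
```

Both made-up data sets fall exactly on a straight line, so the strength is as high as it gets in each case; only the sign differs, and the sign is what tells you the direction.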
Because qualitative variables don’t have a numerical order to them, they
don’t increase or decrease in value. For example, just because male = 1 and
female = 2 doesn’t mean that a female is worth twice a male. (Although some
women may want to disagree.) These numbers represent categories, not
values. Therefore, you can’t use the word correlation to describe the relationship
between, say, gender and political affiliation. The appropriate term to
describe the relationships of qualitative variables is association. You can say
that political affiliation is associated with gender, and explain how. (For full
details on association, see Chapter 13. For more information on correlation,
see Chapter 4.)
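One simple way to look at association is to count how often each pair of categories shows up together. The sketch below does that with a handful of made-up records; the codes 1 and 2, the labels "P" and "Q", and the helper share are all hypothetical and stand in for whatever categories you actually have.

```python
from collections import Counter

# Hypothetical toy records: (gender code, affiliation label).
# The codes are category labels, not quantities.
records = [
    (1, "P"), (1, "P"), (1, "P"), (1, "Q"),
    (2, "Q"), (2, "Q"), (2, "Q"), (2, "P"),
]

pair_counts = Counter(records)                 # counts for each (gender, affiliation) cell
group_totals = Counter(g for g, _ in records)  # counts for each gender code

def share(gender, affiliation):
    """Proportion of one gender group that falls in a given affiliation."""
    return pair_counts[(gender, affiliation)] / group_totals[gender]

print(share(1, "P"), share(2, "P"))
```

If the two proportions differ, as they do in this invented data, the two variables appear to be associated. Whether a difference like that is more than chance is a separate question.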
Building models to make predictions
You can also build models to predict the value of a qualitative variable based
on other related information. In this case, building models is more than a lot
of little plastic pieces and some irritatingly sticky glue. When you build a
model, you look for variables that help explain, estimate, or predict some
response you’re interested in (the variables that do this are called explanatory
variables). You sort through the explanatory variables and figure out
which ones do the best job of predicting the response, and you put them
together into a type of equation like y = 2x + 4 where x = shoe size and y =
length of your calf. That equation is a model.
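Written as code, the example equation is just a small function that turns an x into a predicted y. This is only a sketch of the text's illustrative equation; the function name predict_calf_length is an assumption, and the equation itself is an example, not a fitted result.

```python
def predict_calf_length(shoe_size):
    """Apply the example model y = 2x + 4 from the text."""
    slope = 2       # change in predicted y for each one-unit change in x
    intercept = 4   # predicted y when x is 0
    return slope * shoe_size + intercept

# Predictions for a few shoe sizes chosen for illustration
for size in (7, 9, 11):
    print(size, predict_calf_length(size))
```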
For example, what if you want to know which factors or variables can help
you predict someone’s political affiliation? Is a woman without children more
likely to be a Republican or a Democrat? What about a middle-aged man who
proclaims Hinduism as his religion? In order for you to compare these complex
relationships, you must build a model to evaluate each group’s impact
on political affiliation (or some other qualitative variable). This kind of model
building is explored in more depth in Chapter 8, where I discuss the topic of
logistic regression.
Logistic regression builds models to predict the outcome of a qualitative variable,
such as political affiliation. If you want to make predictions about a
quantitative variable, such as income, you need to use the standard type of
regression (check out Chapters 4 and 5).
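To give a rough feel for how the two kinds of models differ, here is a minimal sketch of the logistic idea: a straight-line score is squeezed into a probability between 0 and 1 for one category of a qualitative variable. The intercept and coefficient are hypothetical numbers picked for illustration, not estimates from any data, and the names are assumptions. In real logistic regression, those numbers are estimated from your data.

```python
from math import exp

def logistic(score):
    """Squeeze any real-valued score into a probability between 0 and 1."""
    return 1 / (1 + exp(-score))

# Hypothetical values, chosen only to illustrate the shape of the model
INTERCEPT = -2.0
AGE_COEF = 0.05

def probability_of_category(age):
    """Probability of belonging to one category, given age (illustrative only)."""
    return logistic(INTERCEPT + AGE_COEF * age)

for age in (20, 40, 60):
    print(age, round(probability_of_category(age), 3))
```

A standard regression model would return the straight-line score itself as the prediction. The logistic version returns a probability, which is what you want when the response is a category.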