Page 107 - Intermediate Statistics for Dummies
Part II: Making Predictions by Using Regression
Second, choose only reasonable values of x for which you try to make estimates
about y. That is, look at the values of x for which your data was collected
and stay within those bounds when making predictions. In the textbook weight
example, the smallest average student weight is 48.5 pounds, and the largest
average student weight is 142 pounds. Choosing student weights between 48.5
and 142 to plug in for x in the equation is okay, but choosing values less than
48.5 or above 142 isn’t a good idea. You can’t guarantee that the same linear
relationship (or any linear relationship for that matter) continues outside the
given boundaries.
Think about it: If the relationship you found actually continued for any value
of x, no matter how large, then a 250-pound linebacker from OSU would have
to carry 3.69 + 0.113 * 250 = 31.94 pounds of books around in his backpack.
Of course this would be easy for him, but what about the rest of us?
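As a minimal sketch of this advice, the Python snippet below wraps the textbook weight equation in a function that refuses to extrapolate. The intercept 3.69, the slope 0.113, and the bounds 48.5 and 142 come from the example above; the function name predict_textbook_weight and the choice to raise an error are illustrative assumptions, not part of the original analysis.

```python
# Coefficients and data range taken from the textbook weight example.
INTERCEPT = 3.69             # estimated textbook weight (pounds) when x = 0
SLOPE = 0.113                # extra pounds of textbooks per pound of student weight
X_MIN, X_MAX = 48.5, 142.0   # smallest and largest average student weights observed


def predict_textbook_weight(student_weight):
    """Estimate textbook weight, but only inside the observed range of x."""
    if not X_MIN <= student_weight <= X_MAX:
        raise ValueError(
            f"{student_weight} is outside the observed range "
            f"[{X_MIN}, {X_MAX}]; the fitted line may not hold there."
        )
    return INTERCEPT + SLOPE * student_weight


# A value inside the observed range is fine.
print(round(predict_textbook_weight(100), 2))

# The 250-pound linebacker is an extrapolation, so the function refuses.
try:
    predict_textbook_weight(250)
except ValueError as err:
    print("Refused:", err)
```

Raising an error is only one option; a gentler version could return the estimate along with a warning that it lies outside the data.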
Knowing the limitations of a simple linear regression model
A simple linear regression model is just what it says it is: simple. I don’t
mean easy to work with, necessarily, but simple in the uncluttered sense. The
model tries to estimate the value of y by only using one variable, x. However,
the number of real-world situations that can be explained by using a simple,
one-variable linear regression is small. Oftentimes one variable just can’t do
all the predicting.
If one variable alone doesn’t result in a model that fits, add more variables.
Oftentimes it takes many variables to make a good estimate for y. In the
case of stock market prices, analysts are still looking for that ultimate
prediction model.
As another example, health insurance companies try to estimate how long
you will live by asking you a series of questions (each of which represents a
variable in the regression model). You can’t find one single variable that
estimates how long you’ll live; you must consider many factors: your health,
your weight, whether or not you smoke, genetic factors, how much exercise
you do each week, and the list goes on and on and on.
The point is, regression models don’t always use just one variable, x, to
estimate y. Some models use two, three, or even more variables to estimate y.
Those models aren’t called simple linear regression models; they’re called
multiple linear regression models, because they use multiple variables
to make an estimate. (You can explore multiple linear regression
models in Chapter 5.)
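To make the contrast concrete, here is a small sketch of a model with two x variables, fit by ordinary least squares through the normal equations. The data is made up and noise-free (generated from y = 1 + 2*x1 + 3*x2) purely so the fit is easy to check; the helper names solve and fit_multiple_regression are hypothetical, and real work would normally rely on a statistics package instead.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the input lists are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] that minimize the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up, noise-free data: y = 1 + 2*x1 + 3*x2
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

intercept, b1, b2 = fit_multiple_regression(rows, y)
print(round(intercept, 6), round(b1, 6), round(b2, 6))
```

Because the data contains no noise, the fitted coefficients should land very close to the 1, 2, and 3 used to generate it; with real data they would only be estimates.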