Page 152 - Intermediate Statistics for Dummies
Chapter 7: When Data Throws You a Curve: Using Nonlinear Regression
relationship between x and y. However, in this case the correlation is
misleading, because the scatterplot appears to be curved. As with any
regression analysis, it’s very important to take both the scatterplot and the
correlation into account when deciding how well the model under consideration
would fit the data. The contradiction in this example between the
scatterplot and the correlation is a red flag telling you that a straight-line
model isn’t the best idea.
The correlation coefficient measures only the strength and direction of the
linear relationship between x and y (see Chapter 4). However, you may run
into situations (like the one shown in Figure 7-1) where the correlation is
strong, yet the scatterplot shows that a curve would fit better. Don’t rely
on the scatterplot alone or on the correlation coefficient alone when you
decide whether to go ahead and fit a straight line to your data.
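To see how a strong correlation can hide a curve, here is a minimal sketch in Python using NumPy. The data are made up for illustration (they are not the data behind Figure 7-1), and the variable names are assumptions chosen for this example.

```python
import numpy as np

# Made-up data for illustration only (not the data behind Figure 7-1):
# y grows exponentially with x, so the true pattern is curved.
x = np.arange(1, 11)
y = np.exp(0.3 * x)

# The correlation coefficient measures only the linear part of the relationship.
r = np.corrcoef(x, y)[0, 1]

# Fit a straight line anyway and look at what it misses.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

print(f"r = {r:.3f}")
print("residuals:", np.round(residuals, 2))
```

With these numbers the correlation comes out strong (above 0.9), yet the residuals are positive at both ends and negative in the middle. That systematic pattern is the numerical version of the curve you would see in the scatterplot.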
The bottom line here is that fitting a line to data that appears to have a curved
pattern isn’t the way to go. What you need to do in this situation is explore
models that have curved patterns themselves. In the following sections, you
see two major types of nonlinear (or curved) models that are used to model
curved data: polynomials (beyond a straight line) and exponential models
(that start out small and quickly increase, or the other way around). Because
the pattern of the data in Figure 7-1 starts low and bends upward, the correct
model to fit this data is an exponential regression model. (This model would
also be appropriate for data that starts out high and bends down low.)
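One common way to fit an exponential model is to take the logarithm of y and fit a straight line to the result. The sketch below assumes a model of the form y = b0 * b1**x; that notation, the data, and the noise level are all assumptions made for this example, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y = 2 * 1.5**x with a little multiplicative noise.
x = np.arange(1, 11)
y = 2.0 * 1.5 ** x * np.exp(rng.normal(0.0, 0.05, size=x.size))

# Taking logs turns y = b0 * b1**x into log(y) = log(b0) + log(b1) * x,
# which is a straight line in x, so ordinary line fitting applies.
slope, intercept = np.polyfit(x, np.log(y), 1)

# Undo the log to get back to the original scale.
b0 = np.exp(intercept)
b1 = np.exp(slope)
print(f"fitted model: y = {b0:.2f} * {b1:.2f}**x")
```

A value of b1 greater than 1 describes data that starts low and bends upward; a value between 0 and 1 describes data that starts high and bends down.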
Handling Curves in the Road with Polynomials
One major family of nonlinear models is the polynomial family. You use these
models when a polynomial function (beyond a straight line) best describes
the curve in the data. (For example, the data may follow the shape of a
parabola, which is a second-degree polynomial.) You typically use polynomial
models when the data follow a pattern of curves going up and down a certain
number of times. For example, suppose a doctor examines the occurrence of
heart problems in patients as it relates to their blood pressure. She finds that
patients with very low or very high blood pressure have a higher occurrence
of problems, while patients whose blood pressure falls in the middle, in the
normal range, have fewer problems. This pattern of data has a
U-shape, and a parabola would fit this data well.
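The blood pressure example can be sketched in Python with NumPy. The numbers below are invented purely to produce a U-shape (they are not real medical data), and the variable names are assumptions for this illustration.

```python
import numpy as np

# Hypothetical numbers invented for illustration (not real medical data):
# blood pressure readings and heart-problem counts per 100 patients.
bp = np.array([80, 90, 100, 110, 120, 130, 140, 150, 160], dtype=float)
problems = np.array([21, 14, 9, 6, 5, 6, 10, 13, 22], dtype=float)

# A U-shape has almost no linear trend, so the correlation is near zero.
r = np.corrcoef(bp, problems)[0, 1]

# Fit a second-degree polynomial (a parabola): a*bp**2 + b*bp + c.
a, b, c = np.polyfit(bp, problems, 2)
fitted = a * bp**2 + b * bp + c

# R-squared: the share of the variation in y that the parabola explains.
ss_res = np.sum((problems - fitted) ** 2)
ss_tot = np.sum((problems - problems.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# The bottom of the U sits at the vertex of the parabola.
vertex = -b / (2 * a)

print(f"correlation r = {r:.3f}")
print(f"parabola R-squared = {r_squared:.3f}")
print(f"lowest point near bp = {vertex:.1f}")
```

Here the correlation is close to zero even though the relationship is strong, which is the flip side of the warning about relying on correlation alone. A positive coefficient on the squared term tells you the parabola opens upward, matching the U-shape.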
In this section, you see what a polynomial regression model is, how you can
search for a good-fitting polynomial for your data, and how you can assess
polynomial models.