Page 135 - Statistics II for Dummies

Chapter 7: Getting Ahead of the Learning Curve with Nonlinear Regression 119

The correlation coefficient measures only the strength and direction of the
linear relationship between x and y (see Chapter 4). However, you may run
into situations (like the one shown in Figure 7-2) where a correlation is strong,
yet the scatterplot shows a curve would fit better. Don’t rely on the
scatterplot alone or on the correlation coefficient alone when you decide
whether to go ahead and fit a straight line to your data; look at both.
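
To see how a strong correlation can hide a curve, here's a minimal Python sketch. The data are made up for illustration (they aren't the data from Figure 7-2), and using NumPy is an assumption on my part, not something the chapter prescribes. The sketch computes the correlation, fits a straight line, and then looks at the residuals.

```python
import numpy as np

# Hypothetical data: y grows exponentially with x (no noise, so results are repeatable).
x = np.arange(1, 11, dtype=float)
y = np.exp(0.3 * x)

# Correlation coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]

# Least-squares straight line (np.polyfit returns the highest power first).
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

print(f"r = {r:.2f}")
print("residuals:", np.round(residuals, 2))
```

The correlation comes out above 0.9, which looks strong. But the residuals are positive at both ends and negative in the middle, a telltale sign that the line is cutting through a curve.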

The bottom line here is that fitting a line to data that appear to have a curved
pattern isn’t the way to go. Instead, explore models that have curved
patterns themselves.

The following sections address two major types of nonlinear (or curved)
models that are used to model curved data: polynomials of degree two or
higher (curves like quadratics or cubics), and exponential models (which
start out small and quickly increase, or the other way around).
Because the pattern of the data in Figure 7-2 starts low and bends upward,
the appropriate model to fit this data is an exponential regression model.
(This model is also appropriate for data that start out high and bend down low.)
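
As a rough preview of what fitting an exponential model can look like, here's a minimal Python sketch. It uses one common approach, fitting a straight line to the natural log of y and then transforming back, which is an assumption here and not necessarily the exact procedure this chapter uses. The data, the true values 2.0 and 0.3, and the names `a` and `b1` are all hypothetical.

```python
import numpy as np

# Hypothetical data: y = 2 * e^(0.3x), with a little multiplicative noise.
rng = np.random.default_rng(0)
x = np.arange(1, 11, dtype=float)
y = 2.0 * np.exp(0.3 * x) * np.exp(rng.normal(0.0, 0.05, size=x.size))

# Fit a straight line to ln(y) versus x: ln(y) = b0 + b1 * x.
b1, b0 = np.polyfit(x, np.log(y), 1)

# Transform back to the original scale: y = a * e^(b1 * x), where a = e^(b0).
a = np.exp(b0)
y_hat = a * np.exp(b1 * x)

print(f"fitted model: y = {a:.2f} * exp({b1:.3f} * x)")
```

If the fitted values of `a` and `b1` land close to 2.0 and 0.3, the log-and-fit approach has recovered the curve that generated the data.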

Handling Curves in the Road with Polynomials

One major family of nonlinear models is the polynomial family. You use these
models when a polynomial function (beyond a straight line) best describes
the curve in the data. (For example, the data may follow the shape of a parabola,
which is a second-degree polynomial.) You typically use polynomial models when
the data follow a pattern of curves going up and down a certain number of times.
For example, suppose a doctor examines the occurrence of heart problems in
patients as it relates to their blood pressure. She finds that patients with very
low or very high blood pressure have a higher occurrence of problems, while
patients whose blood pressure falls in the middle, constituting the normal
range, have fewer problems. This pattern of data has a U-shape, and a
parabola would fit this data well.
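
The blood pressure scenario can be sketched in Python as follows. The numbers are invented purely for illustration (they aren't real medical data), and the variable names and the use of NumPy are assumptions. The sketch compares a straight-line fit with a second-degree polynomial fit on U-shaped data.

```python
import numpy as np

# Hypothetical data: blood pressure and number of heart problems per 100 patients.
bp = np.array([90, 100, 110, 120, 130, 140, 150, 160, 170], dtype=float)
problems = np.array([30, 21, 14, 10, 9, 11, 15, 22, 31], dtype=float)

# The correlation measures only the linear part of the relationship.
r = np.corrcoef(bp, problems)[0, 1]

# Fit a straight line (degree 1) and a parabola (degree 2).
line = np.polyfit(bp, problems, 1)
quad = np.polyfit(bp, problems, 2)

def sse(coeffs):
    """Sum of squared errors for a fitted polynomial."""
    return float(np.sum((problems - np.polyval(coeffs, bp)) ** 2))

# For a parabola a*x^2 + b*x + c, the lowest point sits at x = -b / (2a).
vertex = -quad[1] / (2 * quad[0])

print(f"r = {r:.2f}")
print(f"SSE line = {sse(line):.1f}, SSE parabola = {sse(quad):.1f}")
print(f"parabola bottoms out near bp = {vertex:.0f}")
```

With these made-up numbers the correlation is close to zero even though the relationship is strong; it just isn't linear. The parabola's error is a small fraction of the line's, and its low point falls in the middle of the blood pressure range.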

In this section, you see what a polynomial regression model is, how you can
search for a good-fitting polynomial for your data, and how you can assess
polynomial models.

Bringing back polynomials

You may recall from algebra that a polynomial is a sum of x terms raised
to a variety of powers, and each x is preceded by a constant called the
coefficient of that term. For example, the model $y = 2x + 3x^2 + 6x^3$ is a
