Page 135 - Statistics II for Dummies

Chapter 7: Getting Ahead of the Learning Curve with Nonlinear Regression  119


                                The correlation coefficient measures only the strength and direction of the
                                linear relationship between x and y (see Chapter 4). However, you may run
                                into situations (like the one shown in Figure 7-2) where the correlation is strong,
                                yet the scatterplot shows that a curve would fit better. Don’t rely on the
                                scatterplot or the correlation coefficient alone when deciding whether to
                                fit a straight line to your data; look at both.
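To see how a strong correlation can hide a curve, here is a minimal Python sketch. The data are made up for illustration (they follow y = x squared exactly and are not the data behind Figure 7-2), and the helper names `pearson_r` and `fit_line` are assumptions of this sketch, not functions from any package.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data that follow a curve exactly: y = x^2 for x = 1..10.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
intercept, slope = fit_line(xs, ys)

# Residual = observed y minus the y predicted by the straight line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"r = {r:.3f}")
print("residuals:", [round(e, 1) for e in residuals])
```

The correlation comes out above 0.97, yet the residuals are positive at both ends and negative in the middle. That systematic bend is the curve that the correlation coefficient by itself does not reveal.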

                                The bottom line here is that fitting a line to data that appear to have a curved
                                pattern isn’t the way to go. Instead, explore models that have curved
                                patterns themselves.
                                The following sections address two major types of nonlinear (or curved)
                                models that are used to model curved data: polynomials of degree two or
                                higher (curves such as quadratics or cubics) and exponential models
                                (which start out small and increase quickly, or start out large and
                                decrease quickly). Because the pattern of the data in Figure 7-2 starts
                                low and bends upward, the appropriate model for these data is an
                                exponential regression model. (This model is also appropriate for data
                                that start out high and bend downward.)
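As a rough illustration of what fitting an exponential model can look like, the sketch below fits y = a * b^x by fitting a straight line to the natural log of y. This log-transform approach is one common technique and is an assumption here; the passage above does not say how the fit is carried out. The data are made up (generated from y = 2 * 1.5^x with no noise), and `fit_line` is a hypothetical helper.

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data that start low and bend upward: y = 2 * 1.5^x.
xs = [0, 1, 2, 3, 4, 5]
ys = [2 * 1.5 ** x for x in xs]

# If y = a * b^x, then ln(y) = ln(a) + x * ln(b), which is a straight line in x.
log_ys = [math.log(y) for y in ys]
intercept, slope = fit_line(xs, log_ys)

# Undo the log to get back to the original scale.
a = math.exp(intercept)
b = math.exp(slope)

print(f"fitted model: y = {a:.3f} * {b:.3f}^x")
```

Because the made-up data contain no noise, the fit recovers a and b essentially exactly. A value of b below 1 would describe the other case: data that start high and bend downward.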



                      Handling Curves in the Road with Polynomials


                                One major family of nonlinear models is the polynomial family. You use these
                                models when a polynomial function (beyond a straight line) best describes
                                the curve in the data. (For example, the data may follow the shape of a parabola,
                                which is a second-degree polynomial.) You typically use polynomial models when
                                the data follow a pattern that bends up or down one or more times.
                                For example, suppose a doctor examines the occurrence of heart problems in
                                patients as it relates to their blood pressure. She finds that patients with very
                                low or very high blood pressure had a higher occurrence of problems, while
                                patients whose blood pressure fell in the middle, constituting the normal
                                range, had fewer problems. This pattern of data has a U-shape, and a
                                parabola would fit this data well.
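The U-shaped pattern in the doctor example can be sketched in Python as a least-squares parabola fit. Everything here is an assumption made for illustration: the blood pressure values and problem rates are invented, the pressures are centered at 120 only to keep the arithmetic well behaved, and `solve_linear_system` and `fit_quadratic` are hypothetical helpers, not part of any package.

```python
def solve_linear_system(a, b):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    matrix = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    return solve_linear_system(matrix, rhs)

# Invented data: systolic blood pressure and heart problems per 100 patients.
pressures = [80, 90, 100, 110, 120, 130, 140, 150, 160]
problem_rates = [30, 22, 15, 11, 10, 12, 16, 23, 31]

# Center the pressures so the sums of powers stay small.
center = 120
us = [p - center for p in pressures]

b0, b1, b2 = fit_quadratic(us, problem_rates)

# The lowest point of an upward-opening parabola sits at u = -b1 / (2 * b2).
lowest_pressure = center - b1 / (2 * b2)

print(f"b2 = {b2:.4f} (positive means the parabola opens upward)")
print(f"fewest problems near a pressure of {lowest_pressure:.1f}")
```

A positive coefficient on the squared term means the fitted parabola opens upward, matching the U-shape, and its lowest point lands in the middle of the pressure range.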

                                In this section, you see what a polynomial regression model is, how you can
                                search for a good-fitting polynomial for your data, and how you can assess
                                polynomial models.


                                Bringing back polynomials


                                You may recall from algebra that a polynomial is a sum of x terms raised
                                to a variety of powers, and each x is preceded by a constant called the
                                coefficient of that term. For example, the model $y = 2x + 3x^2 + 6x^3$ is a



