Page 150 - Statistics II for Dummies

134        Part II: Using Different Types of Regression to Make Predictions



                                Step three: Go exponential
                                After you have your Minitab output, you’re ready for step three. You transform
                                the model log(y) = –0.19 + 0.28 * x into a model for y by taking 10 to the power
                                of the left-hand side and 10 to the power of the right-hand side. Transforming
                                the log(y) equation for the secret-spreading data, you get y = 10^(–0.19 + 0.28x).
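
As a worked sketch of this step, assuming that log here means the base-10 logarithm (which the use of 10 as the base implies), the transformation can be written out line by line. The last line is an added restatement, not part of the original text, and its constants are rounded to two decimal places:

```latex
\begin{align*}
\log_{10}(y) &= -0.19 + 0.28x \\
10^{\log_{10}(y)} &= 10^{-0.19 + 0.28x} \\
y &= 10^{-0.19 + 0.28x} \\
  &= 10^{-0.19} \cdot \left(10^{0.28}\right)^{x} \approx 0.65 \cdot 1.91^{x}
\end{align*}
```

Read this way, the model says the predicted count is multiplied by roughly 1.91 for each additional day. Because the constants in the last line are rounded, use the unrounded form y = 10^(–0.19 + 0.28x) when you compute actual predictions.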

                                Step four: Make predictions
                                By using the exponential model from step three, you can move on to step
                                four: Make predictions for appropriate values of x (within the range where
                                data were collected). Continuing to use the secret-spreading data, suppose
                                you want to estimate the number of people knowing the secret on day five
                                (see Figure 7-2). Just plug x = 5 into the exponential model to get
                                y = 10^(–0.19 + 0.28 * 5) = 10^1.21 = 16.22. Looking back at Figure 7-2, you can see that
                                this estimate falls right in line with the data on the graph.
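
The day-five calculation can also be checked with a few lines of code. The following is a minimal Python sketch, not the book's Minitab workflow. The coefficients come from the fitted log(y) line in the text; the function names and the optional `x_min`/`x_max` bounds are hypothetical helpers added for illustration, because the actual range of the data is not stated in this section.

```python
# Coefficients of the fitted line for log(y), taken from the text:
# log10(y) = -0.19 + 0.28 * x
INTERCEPT = -0.19
SLOPE = 0.28


def predict_log_y(x):
    """Predicted value of log10(y) on day x (the straight-line model)."""
    return INTERCEPT + SLOPE * x


def predict_y(x, x_min=None, x_max=None):
    """Predicted y on day x, found by raising 10 to the fitted log10(y).

    x_min and x_max are hypothetical, optional bounds for the range of x
    over which data were collected; they are an assumption of this sketch.
    """
    below = x_min is not None and x < x_min
    above = x_max is not None and x > x_max
    if below or above:
        raise ValueError("x is outside the range where data were collected")
    return 10 ** predict_log_y(x)


day_five = predict_y(5)
print(round(predict_log_y(5), 2))  # 1.21
print(round(day_five, 2))          # 16.22
```

Passing bounds, for example `predict_y(5, x_min=1, x_max=10)` with made-up limits, makes the function refuse to extrapolate beyond the days you actually observed, which is the caution step four asks for.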

                                Step five: Assess the fit of your exponential model

                                Now that you’ve found the best-fitting exponential model, you have the worst
                                behind you. You’ve arrived at step five and are ready to further assess the
                                model fit (beyond the scatterplot of the original data) to make sure no major
                                problems arise.
                                In general, to assess the fit of an exponential model, you’re really looking at the
                                straight-line fit of log(y). Just use these three items (in any order) in the same
                                way as described in the earlier section “Assessing the fit of a polynomial model”:

                                  ✓ Check the scatterplot of the log(y) data to see how well it resembles
                                    a straight line. You assess the fit of the log(y) for the secret-spreading
                                    data first through the scatterplot shown in Figure 7-13. The scatterplot
                                    shows that the model appears to fit the data well, because the points are
                                    scattered in a tight pattern around a straight line.

                                    Note that the data were also transformed during this process. You started with
                                    x and y data, and now you have x and log(y) for your data. You see x, y,
                                    and log(y) for the secret-spreading data in Table 7-2.
                                  ✓ Examine the value of R² adjusted for the model of the best-fitting line
                                    for log(y), done by Minitab. The value of R² adjusted for this model is
                                    found in Figure 7-13 to be 91.6 percent. This value also indicates a good
                                    fit because it’s very close to 100 percent. In other words, about 91.6 percent
                                    of the variation in the log of the number of people knowing the secret is
                                    explained by how many days it has been since the secret-spreading started.
                                    (Makes sense.)
                                  ✓ Look at the residual plots from the fit of a line to the log(y) data. The
                                    residual plots from this analysis (see Figure 7-14) show no major departures
                                    from the conditions that the errors are independent and have a











