Page 150 - Statistics II for Dummies
134 Part II: Using Different Types of Regression to Make Predictions
Step three: Go exponential
After you have your Minitab output, you’re ready for step three. You transform
the model log(y) = –0.19 + 0.28 * x into a model for y by taking 10 to the power
of the left-hand side and 10 to the power of the right-hand side. Transforming
the log(y) equation for the secret-spreading data, you get y = 10^(–0.19 + 0.28x).
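If you want to see the transformation in action, here is a minimal Python sketch. It assumes the log in the fitted equation is base 10 (which is what taking 10 to the power of both sides implies) and uses the coefficients –0.19 and 0.28 from the text. The names B0, B1, log_model, and exp_model are made up for illustration; they aren't Minitab output.

```python
# Coefficients of the fitted line log10(y) = B0 + B1 * x, taken from the text.
B0 = -0.19
B1 = 0.28


def log_model(x):
    """Fitted straight line on the log10 scale."""
    return B0 + B1 * x


def exp_model(x):
    """Back-transformed model for y: 10 raised to the fitted line."""
    return 10 ** log_model(x)


# The same model can be written as y = a * b**x,
# with a = 10**B0 and b = 10**B1.
a = 10 ** B0  # value of the model at x = 0
b = 10 ** B1  # multiplier applied to y for each one-unit increase in x

print(round(a, 3), round(b, 3))  # 0.646 1.905
```

Written this way, the model says the estimated value of y is multiplied by roughly 1.9 each time x goes up by one.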
Step four: Make predictions
By using the exponential model from step three, you can move on to step
four: Make predictions for appropriate values of x (within the range where
data were collected). Continuing to use the secret-spreading data, suppose
you want to estimate the number of people knowing the secret on day five
(see Figure 7-2). Just plug x = 5 into the exponential model to get
y = 10^(–0.19 + 0.28 * 5) = 10^1.21 = 16.22. Looking back at Figure 7-2, you can see that
this estimate falls right in line with the data on the graph.
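The day-five calculation can be checked with a short Python sketch. It assumes a base-10 log and the coefficients from the text. The function name predict_people and the optional x_range argument are hypothetical; the range guard is only there to mirror the advice about staying within the range of the collected data, and the actual range isn't supplied here.

```python
def predict_people(day, b0=-0.19, b1=0.28, x_range=None):
    """Predict y from the exponential model y = 10**(b0 + b1 * day).

    x_range is an optional, hypothetical (min, max) tuple for the x values
    where data was collected; predictions outside it are refused.
    """
    if x_range is not None:
        low, high = x_range
        if not low <= day <= high:
            raise ValueError("day is outside the range of the collected data")
    return 10 ** (b0 + b1 * day)


estimate = predict_people(5)
print(round(estimate, 2))  # 16.22
```

Passing your own data range as x_range makes the function raise an error instead of quietly extrapolating.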
Step five: Assess the fit of your exponential model
Now that you’ve found the best-fitting exponential model, you have the worst
behind you. You’ve arrived at step five and are ready to further assess the
model fit (beyond the scatterplot of the original data) to make sure no major
problems arise.
In general, to assess the fit of an exponential model, you’re really looking at the
straight-line fit of log(y). Just use these three items (in any order) in the same
way as described in the earlier section “Assessing the fit of a polynomial model”:
✓ Check the scatterplot of the log(y) data to see how well it resembles
a straight line. You assess the fit of the log(y) for the secret-spreading
data first through the scatterplot shown in Figure 7-13. The scatterplot
shows that the model appears to fit the data well, because the points are
scattered in a tight pattern around a straight line.
The data were also transformed during this process. You started with
x and y data, and now you have x and log(y) for your data. You see x, y,
and log(y) for the secret-spreading data in Table 7-2.
✓ Examine the value of R² adjusted for the model of the best-fitting line
for log(y), done by Minitab. The value of R² adjusted for this model is
found in Figure 7-13 to be 91.6 percent. This value also indicates a good
fit because it’s very close to 100 percent. Therefore, about 91.6 percent of the
variation in the log of the number of people knowing the secret is explained by
how many days it has been since the secret-spreading started. (Makes sense.)
✓ Look at the residual plots from the fit of a line to the log(y) data. The
residual plots from this analysis (see Figure 7-14) show no major
departures from the conditions that the errors are independent and have a