Page 155 - Statistics and Data Analysis in Geology

Statistics and Data Analysis in Geology - Chapter 6

Once the regression sums of squares of the individual variables have been
calculated, they can be entered into an expanded ANOVA table such as that shown
in Table 6-3 and tested for significance. The F-test ratios are formed from the
mean squares due to partial regression with each of the individual variables in the
numerators, and the mean square due to deviation from the regression model as
the denominator. Each F-test has 1 and (n - m - 1) associated degrees of freedom.
The F-tests will not change if the calculations are based on standardized partial
regression coefficients.
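
The partial F-tests described above can be sketched in a few lines of Python. This is a minimal illustration, not the text's own worked calculation: it assumes the sum of squares due to each variable is taken as the increase in the residual sum of squares when that variable alone is dropped from the full model, and it uses synthetic data. The names fit_sse and partial_f_tests are hypothetical helpers introduced only for this example.

```python
import numpy as np

def fit_sse(X, y):
    """Least-squares fit with an intercept; return the residual sum of squares."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def partial_f_tests(X, y):
    """Return one F ratio per variable and the (1, n - m - 1) degrees of freedom."""
    n, m = X.shape
    sse_full = fit_sse(X, y)
    mse = sse_full / (n - m - 1)          # mean square due to deviation
    f = []
    for j in range(m):
        X_reduced = np.delete(X, j, axis=1)
        # Assumption: SS due to variable j = extra residual SS when j is omitted.
        ss_j = fit_sse(X_reduced, y) - sse_full
        f.append(ss_j / mse)              # numerator has 1 degree of freedom
    return np.array(f), (1, n - m - 1)

# Synthetic data (not the basin data discussed in the text).
rng = np.random.default_rng(0)
n, m = 50, 3
X = rng.normal(size=(n, m))
y = 2.0 + 1.5 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(size=n)

f_raw, dof = partial_f_tests(X, y)

# Repeat with standardized variables; the F ratios should be unchanged.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
ys = (y - y.mean()) / y.std(ddof=1)
f_std, _ = partial_f_tests(Xs, ys)

print("F (raw):         ", f_raw)
print("F (standardized):", f_std)
print("degrees of freedom:", dof)
```

Comparing f_raw with f_std illustrates the closing remark of the paragraph: rescaling the variables changes the coefficients but not the F ratios.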

Table 6-3. ANOVA for testing the significance of partial regression
of individual variables.

A complete ANOVA for testing the significance of the partial regression of
each geomorphic variable on basin magnitude is given in Table 6-4. Although
basin relief, basin area, and stream length have the three largest standardized partial
regression coefficients, the contribution to the total regression made by basin area
is not statistically significant. This is because the partial regression coefficient for
basin area has a high associated standard error.

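
The link between a high standard error and a nonsignificant test can be sketched as follows. The example assumes the usual least-squares result that the partial F ratio for a coefficient equals the square of the coefficient divided by its standard error. The data are synthetic, with two nearly collinear predictors to inflate their standard errors; they are not the basin measurements of Table 6-4.

```python
import numpy as np

# Synthetic data: x1 and x2 are nearly collinear, x3 is independent.
rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x2 + x3 + rng.normal(size=n)

m = X.shape[1]
A = np.column_stack([np.ones(n), X])      # design matrix with intercept

# Least-squares coefficients (b[0] is the intercept).
b = np.linalg.solve(A.T @ A, A.T @ y)
resid = y - A @ b
mse = resid @ resid / (n - m - 1)         # mean square due to deviation

# Standard errors from the diagonal of mse * (A'A)^-1.
se = np.sqrt(np.diag(mse * np.linalg.inv(A.T @ A)))

# Partial F ratio for each coefficient, with 1 and n - m - 1 degrees of freedom.
f = (b / se) ** 2

# Standardized partial regression coefficients for the three predictors.
b_std = b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

for name, bs, s, fj in zip(["x1", "x2", "x3"], b_std, se[1:], f[1:]):
    print(f"{name}: standardized b = {bs:6.3f}, std. error = {s:6.3f}, F = {fj:8.3f}")
```

In this sketch the collinear pair carries much larger standard errors than x3, so a sizeable standardized coefficient for x1 or x2 does not guarantee a large F ratio.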
Although the standardized partial regression coefficients provide a guide to
the most effective variables in the regression, they are not an infallible index to the
“best possible” regression equation. Suppose you examine the regression equation
and decide two variables are contributing a negligible amount to the regression and
can be discarded. When one of the variables is omitted and the regression is
recalculated, the goodness of fit and the regression equation, of course, change. Now
suppose you decide to discard the second variable; again the regression changes.
But the change might be quite different from the change that would occur if the
first discarded variable were still in the regression. This occurs because the
interaction effects of the two discarded variables with other variables cannot be assessed
without recomputing the regression. If we want to search through a large set of
variables and “weed out” those which are not helpful in the problem, we must do
more than simply examine the partial regression coefficients.
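
The dependence on the order of deletion can be sketched with synthetic data. This example assumes two strongly correlated predictors (x1 and x2) and one independent predictor (x3), and it measures goodness of fit by R-squared. The helper r_squared is hypothetical, introduced only for this illustration.

```python
import numpy as np

def r_squared(X, y):
    """Goodness of fit (R-squared) of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Synthetic data: x2 is strongly correlated with x1.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
y = x1 + x2 + x3 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

r2_full = r_squared(X, y)                  # x1, x2, x3
r2_no_x1 = r_squared(X[:, [1, 2]], y)      # x2, x3
r2_no_x2 = r_squared(X[:, [0, 2]], y)      # x1, x3
r2_no_x1_x2 = r_squared(X[:, [2]], y)      # x3 only

# Loss in fit from discarding x1, with and without x2 still in the model.
loss_x1_with_x2_present = r2_full - r2_no_x1
loss_x1_after_x2_dropped = r2_no_x2 - r2_no_x1_x2

print("Drop x1 while x2 is present:", round(loss_x1_with_x2_present, 4))
print("Drop x1 after x2 is dropped:", round(loss_x1_after_x2_dropped, 4))
```

While x2 remains in the model it stands in for most of what x1 contributes, so discarding x1 costs little. Once x2 has been discarded, removing x1 costs far more, which is why each candidate deletion has to be judged by recomputing the regression.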
