Page 86 - MATLAB Recipes for Earth Sciences
j_meters(i) = [];
j_age(i) = [];
% Compute regression line from the n-1 data points
p(i,:) = polyfit(j_meters,j_age,1);
% Plot the i-th regression line and hold plot for next loop
plot(meters,polyval(p(i,:),meters),'r'), hold on
% Store the regression result and errors in p_age and p_error
p_age(i) = polyval(p(i,:),meters(i));
p_error(i) = p_age(i) - age(i);
end
In the best case, the prediction error is Gaussian distributed with zero
mean.
mean(p_error)
ans =
0.0122
The standard deviation of the prediction error is an unbiased estimate of the
mean deviation of the true data points from the predicted straight line.
std(p_error)
ans =
12.4289
Cross validation gives valuable information on the goodness-of-fit of the
regression result. This method can also be used for quality control in other
fields, such as spatial and temporal prediction.
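The cross-validation loop above can be sketched as a self-contained script. This is a minimal sketch and not the book's listing: the synthetic data, the loop header over n points, the preallocation, and the names cv_mean and cv_std are assumptions made for illustration, and the plotting command is left out.

```matlab
% Hypothetical synthetic data (not the book's data set): depth in meters
% and age, with a deterministic perturbation standing in for noise.
meters = (1:20)';
age = 10 + 5*meters + 3*sin(7*meters);

n = length(meters);
p = zeros(n,2);          % regression coefficients of each n-1 fit
p_age = zeros(n,1);      % predicted age of the left-out point
p_error = zeros(n,1);    % prediction error of the left-out point

for i = 1 : n
    % Copy the data and eliminate the i-th data point
    j_meters = meters;
    j_age = age;
    j_meters(i) = [];
    j_age(i) = [];
    % Compute regression line from the n-1 data points
    p(i,:) = polyfit(j_meters,j_age,1);
    % Predict the left-out point and store the prediction error
    p_age(i) = polyval(p(i,:),meters(i));
    p_error(i) = p_age(i) - age(i);
end

cv_mean = mean(p_error);   % expected to be close to zero
cv_std = std(p_error);     % spread of the prediction errors
```

A mean prediction error that is small compared with its standard deviation suggests that the straight-line model does not systematically over- or underestimate the left-out points.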
4.9 Reduced Major Axis Regression
In some cases, neither variable is manipulated and both can therefore be
considered to be independent. In such cases, a number of methods are available
to compute a best-fit line that minimizes the distance from both x and y. As an
example, the method of reduced major axis (RMA) minimizes the triangular
area 0.5*(ΔxΔy) between the points and the regression line, where Δx and
Δy are the distances between predicted and true x and y values (Fig. 4.4).
This optimization appears to be complex. However, it can be shown that the
first regression coefficient b1 (the slope) is simply the ratio of the standard
deviations of x and y.
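The slope computation can be sketched in a few lines. This is a minimal sketch on hypothetical synthetic data, assuming the ratio is taken as the standard deviation of y over that of x. The sign taken from the correlation coefficient and the intercept through the centroid of the data are common conventions assumed here, not statements from the text above. The variable names are also assumptions.

```matlab
% Hypothetical synthetic data (not the book's data set)
x = (1:20)';
y = 10 + 5*x + 3*sin(7*x);

% RMA slope: ratio of the standard deviations, s_y/s_x. The sign of the
% correlation coefficient is attached so that negatively correlated
% data would yield a negative slope (assumed convention).
r = corrcoef(x,y);
b1 = sign(r(1,2)) * std(y)/std(x);

% Intercept: the RMA line is assumed to pass through the centroid
% (mean(x), mean(y)) of the data
b0 = mean(y) - b1*mean(x);

% Classical regression of y on x for comparison; its slope equals
% r*s_y/s_x and is therefore never steeper than the RMA slope
p = polyfit(x,y,1);
```

Comparing b1 with p(1) shows how much the classical regression, which minimizes only the deviations in y, flattens the line relative to the RMA solution when the correlation is imperfect.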