Page 76 - MATLAB Recipes for Earth Sciences


Bootstrapping therefore represents a powerful and simple tool for accepting or rejecting our first estimate of the correlation coefficient. Applying the above procedure to the synthetic sediment data yields a clear unimodal Gaussian distribution of the correlation coefficients.

               corrcoef(meters,age)
               ans =
                   1.0000    0.9342
                   0.9342    1.0000

               rhos1000 = bootstrp(1000,'corrcoef',meters,age);
               hist(rhos1000(:,2),30)

Most values of rhos1000 fall within the interval between 0.88 and 0.98. Since the resampled correlation coefficients are obviously Gaussian distributed, we can use their mean as a good estimate of the true correlation coefficient.

               mean(rhos1000(:,2))
               ans =
                   0.9315
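The interval quoted above can also be checked directly by taking percentiles of the bootstrap replicates; a brief sketch, assuming the Statistics Toolbox function prctile is available:

```matlab
% 95% percentile bootstrap confidence interval
% for the correlation coefficient
prctile(rhos1000(:,2),[2.5 97.5])
```

The two returned values bracket the central 95% of the resampled correlation coefficients.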


This value is not very different from our first result of r = 0.9342, but now we can be more confident in its validity. In our example, however, the bootstrap distribution of the correlations from the age-depth data is quite skewed, since correlation coefficients have a hard upper limit of one. Nevertheless, the bootstrap method is a valuable tool for assessing the reliability of Pearson's correlation coefficient for bivariate data sets.
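The skewness visible in the histogram can also be quantified numerically; a minimal sketch, assuming the Statistics Toolbox function skewness is on the path:

```matlab
% A negative value indicates the left-skewed tail caused by
% the hard upper limit of one on correlation coefficients
skewness(rhos1000(:,2))
```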



            4.3 Classical Linear Regression Analysis and Prediction


Linear regression provides another way of describing the dependence between the two variables x and y. Whereas Pearson's correlation coefficient provides only a rough measure of a linear trend, linear models obtained by regression analysis allow us to predict y values for any given value of x within the data range. Statistical testing of the significance of the linear model provides some insight into the quality of the prediction.
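As a preview of how such a linear model is used for prediction, here is a minimal sketch with the age-depth variables from the previous section; polyfit fits a first-degree polynomial and polyval evaluates it (the depth value of 5 meters is an arbitrary illustration):

```matlab
% fit a straight line age = p(1)*meters + p(2)
p = polyfit(meters,age,1);

% predict the age at an arbitrary depth within the data range
agepred = polyval(p,5);
```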
Classical regression assumes that y responds to x and that the entire dispersion in the data set is in the y-values (Fig. 4.4). In this case, x is the independent, regressor, or predictor variable. The values of x are defined by the experimenter and are often regarded as free of errors. An example is the location x of a sample in a sediment core. The dependent variable y contains