Page 69 - MATLAB Recipes for Earth Sciences
4 Bivariate Statistics
4.1 Introduction
Bivariate analysis aims to understand the relationship between two variables
x and y. Examples are the length and the width of a fossil, the sodium and
potassium content of volcanic glass or the organic matter content along a
sediment core. When the two variables are measured on the same object, x is
usually identified as the independent variable, whereas y is the dependent
variable. If both variables were generated in an experiment, the variable
manipulated by the experimentalist is described as the independent variable.
In some cases, neither variable is manipulated and both are therefore independent.
The methods of bivariate statistics help to describe the strength of the
relationship between the two variables, either by a single parameter such as
Pearson's correlation coefficient for linear relationships or by an equation
obtained by regression analysis (Fig. 4.1). The equation describing the relationship
between x and y can be used to predict the y-response from arbitrary
x values within the range of original data values used for regression. This is of
particular importance if one of the two parameters is difficult to measure. In
this case, the relationship between the two variables is first determined by
regression analysis on a small training set of data. Then the regression equation
is used to calculate this parameter from the first variable.
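The calibration workflow described above can be sketched in MATLAB as follows. This is only a minimal sketch under stated assumptions: the training values are made-up numbers, a straight-line model is assumed, and the variable names (x_train, y_train, x_new, y_pred) are hypothetical. It uses polyfit to estimate the regression line from the training set and polyval to predict y only for x values that lie within the range of the training data.

```matlab
% Hypothetical training set: x is easy to measure, y is difficult to measure.
x_train = [1 2 3 4 5 6 7 8];
y_train = [3.1 4.9 7.2 8.8 11.1 13.0 14.8 17.1];

% Fit a first-order polynomial (straight line) y = p(1)*x + p(2).
p = polyfit(x_train, y_train, 1);

% New x values for which y is to be predicted (assumed values).
x_new = [2.5 4.5 7.5 12];

% Keep only values within the range of the training data, since the
% regression equation is not supported by data outside that range.
in_range = x_new >= min(x_train) & x_new <= max(x_train);
x_valid = x_new(in_range);

% Predict the y-response from the regression equation.
y_pred = polyval(p, x_valid);
```

In this sketch the value x = 12 is discarded rather than extrapolated, which reflects the restriction to the range of the original data values.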
This chapter first introduces Pearson's correlation coefficient (Chapter 4.2),
then explains the widely used methods of linear and curvilinear regression
analysis (Chapters 4.3, 4.10 and 4.11). Moreover, a selection of methods is
explained that are used to assess the uncertainties in regression analysis
(Chapters 4.5 to 4.8). All methods are illustrated by means of synthetic examples
since they provide excellent means for assessing the final outcome.
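To illustrate why synthetic examples are useful for assessing an outcome, the following minimal sketch builds data from a known linear relationship, adds a small fixed perturbation, and checks how closely the correlation coefficient and the fitted line recover the known values. The slope of 2, the intercept of 1 and the perturbation vector are arbitrary assumptions chosen for this sketch, not values taken from the book.

```matlab
% Synthetic data from a known linear relationship y = 2*x + 1.
x = 1:10;
true_slope = 2;
true_intercept = 1;

% Small fixed perturbation standing in for measurement noise
% (a fixed vector keeps the example reproducible).
noise = [0.3 -0.2 0.1 -0.4 0.2 0.4 -0.1 -0.3 0.2 -0.2];
y = true_slope * x + true_intercept + noise;

% Pearson's correlation coefficient is the off-diagonal element
% of the 2-by-2 matrix returned by corrcoef.
R = corrcoef(x, y);
r = R(1, 2);

% Linear regression to recover slope p(1) and intercept p(2).
p = polyfit(x, y, 1);

% Compare the estimates with the known values.
slope_error = abs(p(1) - true_slope);
intercept_error = abs(p(2) - true_intercept);
```

Because the true parameters are known in advance, any difference between them and the estimates can be attributed to the perturbation and the method, which is not possible with real measurements.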
4.2 Pearson's Correlation Coefficient
Correlation coefficients are often used at the exploration stage of bivariate