Page 70 - MATLAB Recipes for Earth Sciences

62                                                  4 Bivariate Statistics

            [Figure 4.1: "Bivariate Scatter". x-axis: Depth in sediment (meters), 0 to 20. y-axis: Age of
            sediment (kyrs), 0 to 120. Annotations: i-th data point (x_i, y_i); regression line
            age = 6.6 + 5.1 depth, with slope = 5.1 and y-intercept = 6.6; correlation coefficient r = 0.96.]
            Fig. 4.1 Display of a bivariate data set. The twenty data points represent the age of a sediment
            (in kiloyears before present) at a certain depth (in meters) below the sediment-water interface.
            The joint distribution of the two variables suggests a linear relationship between age and depth,
            i.e., the increase of the sediment age with depth is constant. Pearson's correlation coefficient
            (explained in the text) of r=0.96 supports the strong linear dependency of the two variables.
            Linear regression yields the equation age=6.6+5.1 depth. This equation indicates an increase
            of the sediment age of 5.1 kyrs per meter sediment depth (the slope of the regression line).
            The inverse of the slope is the sedimentation rate of ca. 0.2 meters/kyrs. Furthermore, the
            equation defines the age of the sediment surface as 6.6 kyrs (the intercept of the regression
            line with the y-axis). The deviation of the surface age from zero can be attributed either to
            the statistical uncertainty of the regression or to a natural process such as erosion or bioturbation.
            Whereas the assessment of the statistical uncertainty will be discussed in this chapter, the
            latter requires a careful evaluation of the various processes at the sediment-water interface.
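
The arithmetic in the caption can be retraced with a few lines of MATLAB. The following is only a minimal sketch: the two coefficients are the rounded values quoted in the caption, whereas the depths at which the line is evaluated and the variable names (p, sedrate, depth, age) are illustrative assumptions, not part of the original data set.

```matlab
% Coefficients quoted in the caption of Fig. 4.1, highest power first
% (the order expected by polyval): age = 5.1*depth + 6.6
p = [5.1 6.6];

% The sedimentation rate is the inverse of the slope (meters per kyr)
sedrate = 1/p(1);

% Evaluate the regression line at a few depths
% (assumed example depths, chosen only for illustration)
depth = [0 10 20];
age = polyval(p, depth);

fprintf('Sedimentation rate: %.2f m/kyr\n', sedrate);
fprintf('Age at %2d m: %.1f kyrs\n', [depth; age]);
```

The age at zero depth reproduces the y-intercept of 6.6 kyrs, i.e., the nonzero surface age discussed in the caption. Because the coefficients are rounded, the results are only as precise as the two quoted digits.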



            statistics. They are only a very rough estimate of a rectilinear trend in the
            bivariate data set. Unfortunately the literature is full of examples where the
            importance of correlation coefficients is overestimated and outliers in the
            data set lead to an extremely biased estimator of the population correlation
            coefficient.
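
The sensitivity to outliers mentioned above can be illustrated with a small synthetic example. This is a sketch under stated assumptions: the data values are invented for illustration and do not come from the book, and the variable names are hypothetical. It uses the built-in function corrcoef, which returns a 2-by-2 correlation matrix whose off-diagonal element is the sample correlation coefficient.

```matlab
% Nine points on a 3-by-3 grid: x and y are uncorrelated by construction
% (synthetic values, assumed only for this illustration)
x = [1 2 3 1 2 3 1 2 3];
y = [1 1 1 2 2 2 3 3 3];

R = corrcoef(x, y);      % 2-by-2 correlation matrix
r_clean = R(1,2);        % off-diagonal element is the sample r

% Append a single point far away from the cloud
xo = [x 20];
yo = [y 20];
R = corrcoef(xo, yo);
r_outlier = R(1,2);

fprintf('r without outlier: %.2f\n', r_clean);
fprintf('r with outlier:    %.2f\n', r_outlier);
```

For these values the single added point raises r from essentially zero to about 0.98, although the remaining nine points show no linear trend at all.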


               The most popular correlation coefficient is Pearson's linear product-moment
            correlation coefficient ρ (Fig. 4.2). We estimate the population's correlation
            coefficient ρ from the sample data, i.e., we compute the sample
            correlation coefficient r, which is defined as