Page 65 - MATLAB Recipes for Earth Sciences

56                                                 3 Univariate Statistics

            and get

               Freal =
                   3.4967
               Ftable =
                   1.5400

            The F calculated from the data is now larger than the critical F. We can
            therefore reject the null hypothesis. The variances are different at the
            5% significance level (95% confidence).
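
            The decision rule applied here can be written as a short sketch. The
            values of Freal and Ftable are copied from the output above; the variable
            name rejectH0 is an assumption made for this illustration only.

```matlab
% Values copied from the output shown above
Freal = 3.4967;
Ftable = 1.5400;

% Reject the null hypothesis of equal variances if the calculated F
% exceeds the critical F (rejectH0 is a hypothetical variable name)
rejectH0 = Freal > Ftable;

if rejectH0
    disp('Null hypothesis rejected: the variances are different')
else
    disp('Null hypothesis cannot be rejected')
end
```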



            3.8 The χ²-Test

            The χ²-test introduced by Karl Pearson (1900) involves the comparison of
            distributions, permitting a test that two distributions were derived from the
            same population. This test is independent of the distribution that is being
            used. It can therefore be applied to test the hypothesis that the observations
            were drawn from a specific theoretical distribution. Let us assume that we
            have a data set that consists of 100 chemical measurements from a sandstone
            unit. We could use the χ²-test to test the hypothesis that these measurements
            can be described by a Gaussian distribution with a typical or best central
            value and a random dispersion around this value. The n data are grouped in K
            classes, where n should be above 30. The observed frequencies O_k within the
            classes should not be lower than four and must never be zero. The appropriate
            test statistic is then

               \chi^2 = \sum_{k=1}^{K} \frac{(O_k - E_k)^2}{E_k}

            where E_k are the frequencies expected from the theoretical distribution. The
            null hypothesis is that the two distributions are identical, and the
            alternative hypothesis is that they are different. The null hypothesis cannot
            be rejected if the measured χ² is lower than the critical χ², which depends on
            the number of degrees of freedom Φ=K-Z, where K is the number of classes and Z
            is the number of parameters describing the theoretical distribution plus the
            number of variables (for instance, Z=2+1 for mean and variance in the case of
            a Gaussian distribution of a data set containing one variable, Z=1+1 for a
            Poisson distribution of one variable) (Fig. 3.12).
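
            The procedure described above can be sketched in MATLAB as follows. This
            is a minimal illustration only: the class counts are hypothetical values
            invented for the sketch (not the organic carbon data), n_syn is an assumed
            name for the expected frequencies, a 5% significance level is assumed, and
            chi2inv requires the Statistics Toolbox.

```matlab
% Hypothetical observed frequencies in K = 8 classes (n = 100),
% invented for illustration; every class contains at least four counts
n_exp = [5 9 14 22 21 15 9 5];

% Hypothetical expected frequencies from a fitted Gaussian distribution
% (n_syn is an assumed variable name)
n_syn = [4 9 16 21 21 16 9 4];

% Degrees of freedom: K classes minus Z, where Z = 2+1 for the mean and
% variance of a Gaussian distribution plus one variable
K = length(n_exp);
Z = 2 + 1;
Phi = K - Z;

% Measured chi-squared: sum over classes of (observed-expected)^2/expected
chi2 = sum((n_exp - n_syn).^2 ./ n_syn);

% Critical chi-squared at an assumed 5% significance level
chi2crit = chi2inv(0.95, Phi);

% The null hypothesis cannot be rejected if chi2 is below chi2crit
if chi2 < chi2crit
    disp('Null hypothesis cannot be rejected')
else
    disp('Null hypothesis rejected: the distributions are different')
end
```

            With these illustrative counts the measured χ² is far below the critical
            value for Φ=5, so the hypothesis of a Gaussian distribution would not be
            rejected.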
               As an example, we test the hypothesis that our organic carbon measurements
            contained in organicmatter.txt have a Gaussian distribution. We first load
            the data into the workspace and compute the frequency distribution n_exp of
            the data.