Page 48 - Statistics and Data Analysis in Geology

Elementary Statistics


                                 Table 2-1.  Chromium content of an Upper
                                     Pennsylvanian shale from  Kansas.
                                        Replicate  Cr (ppm)
                                            1         205
                                            2         255
                                            3         195
                                            4         220
                                            5         235
                                         TOTAL=      1110
                                         MEAN  =  1110/5=222



             than sample medians, hence they are more efficient in estimating the population
             parameter.
                 In geochemical analyses, it is common practice to make multiple determina-
             tions, or replicates, of  a single sample. The most nearly correct analytical value is
             taken to be the mean of  the determinations. Table  2-1  lists five values for chro-
             mium, in parts per million (ppm), obtained by spectrographic analysis of  replicate
             splits of  a Pennsylvanian shale specimen from southeastern Kansas.  The table
             shows the steps in calculating the mean, whose equation is simply


                         $$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$                    (2.12)
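
                 As an illustration only, the arithmetic of Table 2-1 can be sketched in a few lines of Python. The names cr_ppm and sample_mean are assumptions made for this sketch and do not come from the text.

```python
# Chromium determinations (ppm) for the five replicates in Table 2-1.
cr_ppm = [205, 255, 195, 220, 235]


def sample_mean(values):
    """Arithmetic mean: the sum of the observations divided by their number n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / n


total = sum(cr_ppm)
mean = sample_mean(cr_ppm)
print(f"TOTAL = {total}, MEAN = {total}/{len(cr_ppm)} = {mean:g}")
# prints: TOTAL = 1110, MEAN = 1110/5 = 222
```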

                 Another characteristic of  a distribution curve is the spread or dispersion about
             the mean.  Various measures of  this property have been suggested, but only two
             are used to any extent. One is the variance, and the other is the square root of  the
             variance, called the standard deviation. Variance may be regarded as the average
              squared deviation of  all possible observations from the population mean, and is
              defined by the equation
                         $$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$$                    (2.13)
             The variance of a population, $\sigma^2$, is given by this equation. The variance of a sample
             is denoted by the symbol $s^2$. If the observations $x_1, x_2, \ldots, x_n$ are a random sample
              from a normal distribution, $s^2$ is an efficient estimate of $\sigma^2$.
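
                  A minimal sketch of the average-squared-deviation calculation follows. It assumes, purely for illustration, that the five chromium values of Table 2-1 make up the entire population, so that their mean can stand in for the population mean; in practice they are a sample. The function name population_variance is hypothetical.

```python
import math

# Chromium determinations (ppm) from Table 2-1, treated here as a whole
# population for illustration only (an assumption, not a claim of the text).
cr_ppm = [205, 255, 195, 220, 235]


def population_variance(values):
    """Average squared deviation of the observations from their mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


variance = population_variance(cr_ppm)  # squared deviations sum to 2280
std_dev = math.sqrt(variance)           # standard deviation = square root of variance
print(f"variance = {variance:g} ppm^2")           # variance = 456 ppm^2
print(f"standard deviation = {std_dev:.2f} ppm")  # standard deviation = 21.35 ppm
```

                  Note that the variance carries squared units (ppm squared), whereas the standard deviation is in the same units as the observations.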
                  The reason for using the average of  squared deviations may not be obvious.
              It may seem, perhaps, more logical to define variability as simply the average of
              deviations from the mean, but a few simple trials will demonstrate that this value
             will always equal zero. That is,


                         $$\frac{\sum_{i=1}^{n} (x_i - \mu)}{n} = 0$$                    (2.14)
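
                  One such "simple trial" can be sketched with the chromium values of Table 2-1, taking their own mean as the reference value. This is an illustrative check, not a proof; the variable names are assumptions of the sketch.

```python
# Chromium determinations (ppm) from Table 2-1.
cr_ppm = [205, 255, 195, 220, 235]

mean = sum(cr_ppm) / len(cr_ppm)

# Signed deviations from the mean: positive and negative values cancel.
deviations = [x - mean for x in cr_ppm]
average_deviation = sum(deviations) / len(deviations)

print(deviations)         # [-17.0, 33.0, -27.0, -2.0, 13.0]
print(average_deviation)  # 0.0
# With data that are not exactly representable in floating point, the
# result may be a very small nonzero number caused by rounding.
```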

                  Another choice might be  the average absolute deviation from the mean, or
              mean deviation, MD:
                         $$\mathrm{MD} = \frac{\sum_{i=1}^{n} |x_i - \bar{X}|}{n}$$                    (2.15)
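
              As a hedged illustration, the mean deviation of the chromium values in Table 2-1 can be computed as follows. The function name mean_deviation is an assumption of this sketch.

```python
# Chromium determinations (ppm) from Table 2-1.
cr_ppm = [205, 255, 195, 220, 235]


def mean_deviation(values):
    """Average absolute deviation of the observations from their mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() discards the sign, so deviations no longer cancel.
    return sum(abs(x - mean) for x in values) / n


md = mean_deviation(cr_ppm)  # absolute deviations 17, 33, 27, 2, 13 sum to 92
print(f"MD = {md:g} ppm")    # MD = 18.4 ppm
```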
              The vertical bars denote that the absolute value (i.e., without sign) of the enclosed
              quantity is taken.  However, the mean deviation is less efficient than the sample
