Page 172 - Statistics and Data Analysis in Geology

Analysis of Multivariate Data

             combined to form a pooled estimate of the population variance-covariance matrix.
             The pooled estimate is created by

                  S_p = \frac{\sum_{i=1}^{k} (n_i - 1) S_i}{\sum_{i=1}^{k} n_i - k}                  (6.35)

             where n_i is the number of observations in the ith group and the summation over
             n_i gives the total number of all observations in all k samples. This equation is
             algebraically equivalent to Equation (6.32) when k = 2.
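
             The pooling step can be sketched in a few lines of Python. The sketch assumes the
             usual form of the pooled matrix, in which each sample matrix is weighted by its
             degrees of freedom (n_i - 1) and the sum is divided by the total number of
             observations minus k. The function name, the list-of-lists matrix layout, and the
             example values are illustrative assumptions, not taken from the text.

```python
def pooled_covariance(cov_matrices, sample_sizes):
    """Pool k sample variance-covariance matrices into one estimate.

    cov_matrices: list of k square (m x m) matrices as lists of lists.
    sample_sizes: list of the k sample sizes n_i.
    Assumption: each matrix is weighted by its degrees of freedom (n_i - 1).
    """
    k = len(cov_matrices)
    m = len(cov_matrices[0])
    total_df = sum(sample_sizes) - k  # sum of n_i, minus k
    pooled = [[0.0] * m for _ in range(m)]
    for S, n in zip(cov_matrices, sample_sizes):
        for r in range(m):
            for c in range(m):
                pooled[r][c] += (n - 1) * S[r][c] / total_df
    return pooled


# Hypothetical example: two samples measured on two variables.
S1 = [[4.0, 1.0], [1.0, 2.0]]
S2 = [[6.0, 3.0], [3.0, 4.0]]
Sp = pooled_covariance([S1, S2], [11, 21])
```

             Because the larger sample carries more degrees of freedom, the pooled matrix lies
             closer to S2 than to S1 in this example.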
                 From the pooled estimate of the population variance-covariance matrix, a test
             statistic, M, can be computed:

                  M = \left( \sum_{i=1}^{k} n_i - k \right) \ln |S_p| - \sum_{i=1}^{k} (n_i - 1) \ln |S_i|                  (6.36)

             The test is based on the difference between the logarithm of the determinant of
             the pooled variance-covariance matrix and the average of the logarithms of the
             determinants of the sample variance-covariance matrices. If all the sample matrices
             are the same, this difference will be very small. As the variances and covariances of
             the samples deviate more and more from one another, the test statistic will increase.
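
             A minimal Python sketch of this behavior follows. It assumes the standard form of
             the statistic, M = (sum of n_i - k) ln|S_p| - sum of (n_i - 1) ln|S_i|, with the
             pooled matrix weighted by degrees of freedom. The helper names log_det and box_m
             and the example matrices are hypothetical, and the log-determinant routine is a
             plain Gaussian elimination chosen only to keep the example free of external
             libraries.

```python
import math


def log_det(matrix):
    """Natural log of |det| of a square matrix by Gaussian elimination
    with partial pivoting. For a valid (positive-definite) covariance
    matrix the determinant is positive, so this is ln(det)."""
    a = [row[:] for row in matrix]  # work on a copy
    m = len(a)
    total = 0.0
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
    return total


def box_m(cov_matrices, sample_sizes):
    """Test statistic M for equality of k variance-covariance matrices
    (assumed standard form; see the lead-in)."""
    k = len(cov_matrices)
    m = len(cov_matrices[0])
    total_df = sum(sample_sizes) - k
    pairs = list(zip(cov_matrices, sample_sizes))
    # Pooled matrix: sample matrices weighted by degrees of freedom.
    pooled = [[sum((n - 1) * S[r][c] for S, n in pairs) / total_df
               for c in range(m)] for r in range(m)]
    return total_df * log_det(pooled) - sum((n - 1) * log_det(S) for S, n in pairs)


# Hypothetical matrices: identical samples versus differing samples.
S1 = [[4.0, 1.0], [1.0, 2.0]]
S2 = [[6.0, 3.0], [3.0, 4.0]]
M_same = box_m([S1, S1], [11, 21])  # essentially zero
M_diff = box_m([S1, S2], [11, 21])  # positive
```

             When the sample matrices are identical, the pooled matrix equals each of them and
             the two terms cancel; any disagreement among the samples makes M positive.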
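             Tables of critical values of M are not widely available, so the transformation

                  C^{-1} = 1 - \frac{2m^2 + 3m - 1}{6(m + 1)(k - 1)} \left( \sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{\sum_{i=1}^{k} n_i - k} \right)                  (6.37)

             can be used to convert M to an approximate χ² statistic:

                  \chi^2 \approx M C^{-1}                  (6.38)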
             The approximate χ² value has degrees of freedom equal to ν = (1/2)(k - 1)m(m + 1),
             where m is the number of variables. If all the samples contain the same number of
             observations, n, Equation (6.37) can be simplified to

                  C^{-1} = 1 - \frac{(2m^2 + 3m - 1)(k + 1)}{6(m + 1) k (n - 1)}                  (6.39)

             The χ² approximation is good if the number of samples, k, and the number of
             variables, m, do not exceed about 5 and each variance-covariance estimate is based
             on at least 20 observations.
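
             The conversion can be sketched in Python as follows. The sketch assumes the
             standard Box scale factor, C^{-1} = 1 - [(2m² + 3m - 1) / (6(m + 1)(k - 1))]
             × [sum of 1/(n_i - 1) - 1/(sum of n_i - k)], and the usual degrees of freedom
             for this test, (1/2)(k - 1)m(m + 1). The function names and the value of M in
             the example are hypothetical.

```python
def box_m_to_chi2(M, sample_sizes, m):
    """Convert M to an approximate chi-square statistic.

    M: test statistic for equality of covariance matrices.
    sample_sizes: list of the k sample sizes n_i.
    m: number of variables.
    Returns (chi2, degrees_of_freedom, c_inv).
    """
    k = len(sample_sizes)
    total_df = sum(sample_sizes) - k
    spread = sum(1.0 / (n - 1) for n in sample_sizes) - 1.0 / total_df
    c_inv = 1.0 - (2 * m**2 + 3 * m - 1) / (6.0 * (m + 1) * (k - 1)) * spread
    dof = (k - 1) * m * (m + 1) // 2  # m(m + 1) is always even
    return M * c_inv, dof, c_inv


def c_inv_equal_n(k, m, n):
    """Simplified scale factor when every sample has n observations."""
    return 1.0 - (2 * m**2 + 3 * m - 1) * (k + 1) / (6.0 * (m + 1) * k * (n - 1))


# Hypothetical case: k = 3 samples of 21 observations on m = 3 variables.
chi2, dof, c_inv = box_m_to_chi2(4.0, [21, 21, 21], 3)
shortcut = c_inv_equal_n(3, 3, 21)  # agrees with c_inv for equal n
```

             The resulting chi2 value would be compared with a tabulated χ² critical value for
             dof degrees of freedom at the chosen significance level.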
                 To illustrate the process of hypothesis testing using multivariate statistics, we
             will work through the following problem. Note that the number of observations is
             just sufficient for some of the approximations to be strictly valid; we will consider
             them to be adequate for the purposes of this demonstration.
                 In a local area in eastern Kansas, all potable water is obtained from wells. Some
             of these wells draw from the alluvial fill in stream valleys, while others tap a
             limestone aquifer that also is the source of numerous springs in the region.
             Residents prefer to obtain water from the alluvium, as they feel it is of better
             quality. However, the water resources of the alluvium are limited, and it would be
             desirable for some users to obtain their supplies from the limestone aquifer.
                 In an attempt to demonstrate that the two sources are equivalent in quality, a
             state agency sampled wells that tapped each source. The water samples were
             analyzed for chemical compounds that affect the quality of water. Some of the data
