Page 169 - Statistics and Data Analysis in Geology

Statistics and Data Analysis in  Geology - Chapter 6

             American statistician who formulated this generalization of  Student’s t. When all
             operations are complete, we find that the test statistic can be expressed as
                                       $$T^2 = n(\bar{X} - \mu)' S^{-1} (\bar{X} - \mu) \tag{6.30}$$

             That is, the arbitrary vector A is equal to the vector of differences between the
             means, $(\bar{X} - \mu)$. We must find the inverse of the variance-covariance matrix,
             premultiply this inverse by a row vector of differences, $(\bar{X} - \mu)'$, and then
             postmultiply by a column vector of these same differences. The test statistic is a
             multivariate extension of the t-statistic, Hotelling’s $T^2$. Critical values of $T^2$
             can be determined by the relation
                                       $$F = \frac{n - m}{m(n - 1)}\, T^2 \tag{6.31}$$
             where n is the number of observations and m is the number of variables, allowing us
              to use conventional F-tables rather than special tables of the $T^2$ distribution. More
              complete discussions of  this and related tests are given in texts on multivariate
              statistics such as Overall and Klett (1983), Harris (1985), Krzanowski (1988), and
              Morrison (1990).
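The conversion in Equation (6.31) is simple enough to sketch in a few lines of Python. The function name `t2_to_f` and the input value used in the usage line are illustrative assumptions, not part of the original text; the degrees of freedom returned, m and (n - m), are those the chapter attaches to the equivalent F-statistic.

```python
def t2_to_f(t2, n, m):
    """Convert Hotelling's T^2 to the equivalent F-statistic (Equation 6.31).

    t2 : value of the T^2 test statistic
    n  : number of observations
    m  : number of variables
    Returns the F value and its (numerator, denominator) degrees of freedom.
    """
    if n <= m:
        # (n - m) must be positive for the F-statistic to be defined
        raise ValueError("need more observations than variables (n > m)")
    f = (n - m) / (m * (n - 1)) * t2
    return f, (m, n - m)


# Hypothetical input: T^2 = 24.0 from 7 observations on 4 variables
f_value, dof = t2_to_f(24.0, n=7, m=4)
print(f_value, dof)
```

The returned F value can then be compared with a tabulated critical value for the same degrees of freedom and the chosen significance level.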
                 Although the expression of  this test in a form such as Equation (6.30) is easy,
              computation of  a test value for an actual data set may be very laborious. For ex-
              ample, suppose we  have measured the content of  four elements in seven lunar
              samples. We wish to test the hypothesis that these samples have been drawn from
              a population having the same mean as terrestrial basalts. Assume we take our
              values for the population means from the Handbook of Physical Constants (Clark,
              1966, p. 4). Hotelling’s $T^2$ seems appropriate to test the hypothesis that the vector
              of lunar sample means is no different from the vector of basalt means given in this
              reference.
                 We must first compute the vector of four sample means and the 4 × 4 matrix of
              variances and covariances. The vector of differences between sample and population
              means, $(\bar{X} - \mu)$, must also be computed. Next, we must find the inverse of the
              variance-covariance matrix, or $S^{-1}$. We then must perform two matrix multiplications,
              $(\bar{X} - \mu)' S^{-1} (\bar{X} - \mu)$, and multiply by n to produce $T^2$. From this
              description, you can appreciate that the computational effort grows rapidly as
              the number of variables increases.
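The sequence of steps just described can be sketched in plain Python, with no external libraries. This is a minimal illustration under stated assumptions, not the procedure used to build Table 6-6: the function names `solve_linear` and `hotelling_t2` are hypothetical, the toy data are invented for the example, and the explicit inverse is replaced by solving the linear system S y = (X̄ − μ), which yields the same quadratic form.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    # Augmented matrix [a | b]; copies keep the caller's data unchanged
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("variance-covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (aug[i][m] - s) / aug[i][i]
    return x


def hotelling_t2(samples, mu):
    """One-sample Hotelling's T^2 against a known mean vector mu.

    samples : list of n observations, each a list of m measurements
    mu      : list of m hypothesized population means
    """
    n, m = len(samples), len(mu)
    # Step 1: vector of sample means
    means = [sum(row[j] for row in samples) / n for j in range(m)]
    # Step 2: m x m variance-covariance matrix (divisor n - 1)
    s_matrix = [
        [sum((row[j] - means[j]) * (row[k] - means[k]) for row in samples) / (n - 1)
         for k in range(m)]
        for j in range(m)
    ]
    # Step 3: vector of differences between sample and population means
    diff = [means[j] - mu[j] for j in range(m)]
    # Step 4: y = S^{-1} diff, obtained without forming the inverse explicitly
    y = solve_linear(s_matrix, diff)
    # Step 5: T^2 = n * diff' S^{-1} diff
    return n * sum(d * v for d, v in zip(diff, y))


# Toy data (invented for illustration): 4 observations on 2 variables
toy = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
print(hotelling_t2(toy, [0.0, 0.0]))
```

Solving the system rather than inverting S is a design choice made here for brevity and numerical stability; the nested loops in the covariance step and in the elimination show why the work increases quickly with the number of variables.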
                  The data for the seven lunar samples are listed in Table 6-6, with the “population”
              means from Clark. Intermediate values in the computation of $T^2$ are also
              given, with the final test value of $T^2$ and the equivalent F-statistic, which has m
              and (n - m) degrees of freedom. The test statistic of F = 73.11 far exceeds the
              critical value of $F_{4,3,0.01} = 28.71$, so we conclude that the mean composition of the
              sample of lunar basalts is significantly different from the mean composition of the
              population of terrestrial basalts.
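As a check on the arithmetic, the relation between T² and F can be written out for this example, with n = 7 samples and m = 4 elements. The value of T² shown is back-calculated here from the reported F = 73.11; it is not copied from Table 6-6 and is subject to rounding.

```latex
F = \frac{n - m}{m(n - 1)}\,T^2
  = \frac{7 - 4}{4(7 - 1)}\,T^2
  = \frac{T^2}{8},
\qquad\text{so}\qquad
T^2 = 8F \approx 8 \times 73.11 \approx 584.9 .
```

The degrees of freedom are m = 4 and n − m = 3, which is why the critical value is read from the F-table at 4 and 3 degrees of freedom.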
                  We have dwelled on the $T^2$ test against a known mean not because this specific
              test has greater utility in geology than other multivariate tests, but to illustrate the
              close relationship between conventional statistics and multivariate statistics. Mul-
              tivariate equivalents can be formulated directly from most univariate tests with the
              proper expansion of  the basic assumptions. However, the transition from ordinary
              algebra to matrix algebra often obscures the underlying similarity between the two
              applications. Although we usually regard multivariate methods as an extension of
              univariate statistics, univariate, or ordinary, statistical analysis should be consid-
              ered as a special subset of  the general area of  multivariate analysis.
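The special-subset relationship can be made concrete by setting m = 1 in the two equations of this section. The lowercase symbols (sample mean, sample variance, and the univariate t) are notation introduced here for the illustration.

```latex
% With a single variable, S reduces to the scalar sample variance s^2
T^2 = n(\bar{x} - \mu)\,\frac{1}{s^2}\,(\bar{x} - \mu)
    = \left(\frac{\bar{x} - \mu}{s/\sqrt{n}}\right)^{2}
    = t^2 ,
\qquad
F = \frac{n - 1}{1\,(n - 1)}\,T^2 = t^2 .
```

With one variable, then, Hotelling's statistic is the square of the ordinary one-sample t-statistic, and the equivalent F has 1 and (n − 1) degrees of freedom.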
