Page 174 - Statistics and Data Analysis in Geology
Analysis of Multivariate Data
The transformation factor, $C^{-1}$, must also be calculated to allow use of the $\chi^2$
approximation:

$$C^{-1} = 1 - \frac{2 \cdot 5^2 + 3 \cdot 5 - 1}{6(5 + 1)(2 - 1)}\,(\cdots) = 0.8637$$

The $\chi^2$ statistic is approximately $0.1804 \times 0.8637 = 0.1558$, with degrees of freedom
equal to $\nu = \tfrac{1}{2}(2 - 1)(5)(5 + 1) = 15$.
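The arithmetic in this step can be sketched in a few lines of Python. The function and argument names are illustrative rather than taken from the text, and the sketch assumes that 0.1804 is the uncorrected test statistic, that the 2 in the degrees-of-freedom formula is the number of groups, and that the 5 is the number of variables.

```python
def chi2_approximation(raw_stat, c_inv, n_groups, n_vars):
    """Scale the raw statistic by C^-1 and return (chi-square, degrees of freedom)."""
    chi2 = raw_stat * c_inv
    # nu = 1/2 (k - 1) m (m + 1); m (m + 1) is always even, so // is exact.
    dof = (n_groups - 1) * n_vars * (n_vars + 1) // 2
    return chi2, dof


# Values quoted in the text; the roles assigned to them are assumptions of this sketch.
chi2, dof = chi2_approximation(raw_stat=0.1804, c_inv=0.8637, n_groups=2, n_vars=5)
print(f"chi-square = {chi2:.4f}, degrees of freedom = {dof}")
```

The resulting statistic would then be compared with a tabulated critical value of the chi-square distribution for the computed degrees of freedom.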
The critical value of $\chi^2$ for $\nu = 15$ with a 5% level of significance is 25.00.
The computed statistic is less than this value and does not fall into the critical
region, so we may conclude that there is nothing in our samples which suggests
that the variance-covariance structures of the parent populations are different. We
may pool the two sample variance-covariance matrices and test the equality of the
multivariate means using the $T^2$ test of Equation (6.33):

$$T^2 = \frac{20 \cdot 20}{20 + 20}\,(1.4847) = 14.847$$

The value 1.4847 is the product of the matrix multiplications $D' S_p^{-1} D$ specified in
Equation (6.33). The $T^2$ statistic may be converted to an F-statistic by Equation
(6.34):
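Assuming Equation (6.34) has the usual form of the Hotelling $T^2$-to-$F$ conversion, with $n_1 = n_2 = 20$ observations and $m = 5$ variables (symbols chosen here for illustration), the calculation can be sketched as:

$$F = \frac{n_1 + n_2 - m - 1}{(n_1 + n_2 - 2)\,m}\,T^2 = \frac{34}{38 \cdot 5}\,(14.847) \approx 2.66$$
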
Degrees of freedom are $\nu_1 = 5$ and $\nu_2 = (20 + 20 - 5 - 1) = 34$. The critical
value for F with 5 and 34 degrees of freedom at the 5% ($\alpha = 0.05$) level of
significance is 2.49. Our computed test statistic just exceeds this critical value,
so we conclude that our samples do, indeed, indicate a difference in the means of
the two populations. In other words, there is a statistically significant difference
in composition of water from the two aquifers. This simple test will not pinpoint
the chemical variables responsible for this difference, but it does substantiate the
natives’ contention that they can tell a difference in the water!
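The test of means in this example can be checked numerically. The following is a minimal Python sketch, not the book's own procedure: the function names are hypothetical, $T^2$ is assumed to be the quadratic form 1.4847 scaled by n1 n2 / (n1 + n2), which reproduces the quoted 14.847, and the F conversion is assumed to be the standard one for Hotelling's $T^2$ with m variables.

```python
def hotelling_t2(n1, n2, quad_form):
    """T^2 from the pooled quadratic form D' Sp^-1 D (scaling assumed, see lead-in)."""
    return n1 * n2 / (n1 + n2) * quad_form


def t2_to_f(t2, n1, n2, m):
    """Convert T^2 to F and return (F, df1, df2); standard form, assumed here."""
    df1 = m
    df2 = n1 + n2 - m - 1
    f_stat = df2 / ((n1 + n2 - 2) * m) * t2
    return f_stat, df1, df2


n1, n2, m = 20, 20, 5      # sample sizes and number of variables from the example
t2 = hotelling_t2(n1, n2, 1.4847)
f_stat, df1, df2 = t2_to_f(t2, n1, n2, m)
f_critical = 2.49          # tabulated 5% critical value quoted in the text
print(f"T2 = {t2:.3f}, F = {f_stat:.3f} on ({df1}, {df2}) d.f.")
print("reject equality of means:", f_stat > f_critical)
```

Substituting other sample sizes or a different quadratic form shows how sensitive the decision is when the statistic lies this close to the critical value.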
Multivariate techniques equivalent to the analysis-of-variance procedures
discussed in Chapter 2 are available. In general, these involve a comparison of
two $m \times m$ matrices that are the multivariate equivalents of the among-group and
within-group sums of squares tested in ordinary analysis of variance. The test
statistic consists of the largest eigenvalue of the matrix resulting from the comparison.
We will not consider these tests here because their formulation is complicated
and their applications to geologic problems have been, so far, minimal. This is
not a reflection on their potential utility, however. Interested readers are referred
to Chapter 5 of Griffith and Amrhein (1997), which presents worked examples of
MANOVAs applied to problems in geography. Koch and Link (1980) include a brief
illustration of the application of multivariate analysis of variance to geochemical
data. Statistical details are discussed by Morrison (1990).
Cluster Analysis
Cluster analysis is the name given to a bewildering assortment of techniques designed
to perform classification by assigning observations to groups so each group
is more or less homogeneous and distinct from other groups. This is the special
forte of taxonomists, who attempt to deduce the lineage of living creatures from