Page 172 - Statistics and Data Analysis in Geology
Analysis of Multivariate Data
combined to form a pooled estimate of the population variance-covariance matrix.
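The pooled estimate is created by
$$ \mathbf{S}_p = \frac{\sum_{i=1}^{k} (n_i - 1)\,\mathbf{S}_i}{\sum_{i=1}^{k} n_i - k} \qquad (6.35) $$
where $\mathbf{S}_i$ is the variance-covariance matrix of the $i$th sample, $n_i$ is the number of observations in the $i$th group, and the summation over
$n_i$ gives the total number of all observations in all k samples. This equation is
algebraically equivalent to Equation (6.32) when k = 2.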
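The pooling step can be sketched in a few lines of Python. This is only an illustration, assuming the usual form of the pooled estimate in which each sample matrix is weighted by n_i - 1 and the sum is divided by the total number of observations minus k. The function name `pooled_covariance` and the two small matrices are hypothetical and are not taken from the text.

```python
def pooled_covariance(cov_matrices, sizes):
    """Pool k sample variance-covariance matrices (m x m nested lists).

    Assumes each matrix is weighted by (n_i - 1) and the weighted sum is
    divided by (sum of n_i) - k.
    """
    k = len(cov_matrices)
    m = len(cov_matrices[0])
    dof = sum(sizes) - k  # total observations minus number of groups
    pooled = [[0.0] * m for _ in range(m)]
    for S, n in zip(cov_matrices, sizes):
        for r in range(m):
            for c in range(m):
                pooled[r][c] += (n - 1) * S[r][c]
    return [[value / dof for value in row] for row in pooled]


# Hypothetical 2 x 2 sample matrices for two groups of 11 and 21 observations.
S1 = [[4.0, 1.0], [1.0, 2.0]]
S2 = [[6.0, 2.0], [2.0, 3.0]]
Sp = pooled_covariance([S1, S2], [11, 21])
print(Sp)
```

Because the larger group carries twice the weight of the smaller one, each element of the pooled matrix lies closer to the corresponding element of `S2` than to that of `S1`.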
From the pooled estimate of the population variance-covariance matrix, a test
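statistic, M, can be computed:
$$ M = \left(\sum_{i=1}^{k} n_i - k\right) \ln\left|\mathbf{S}_p\right| - \sum_{i=1}^{k} (n_i - 1) \ln\left|\mathbf{S}_i\right| \qquad (6.36) $$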
The test is based on the difference between the logarithm of the determinant of
the pooled variance-covariance matrix and the average of the logarithms of the
determinants of the sample variance-covariance matrices. If all the sample matrices
are the same, this difference will be very small. As the variances and covariances of
the samples deviate more and more from one another, the test statistic will increase.
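As a rough numerical check of this behavior, the sketch below computes M for identical and for differing sample matrices. It assumes the standard form of the statistic, in which the log-determinant of the pooled matrix is weighted by the total degrees of freedom and each sample log-determinant by n_i - 1. The helper names `det` and `box_m` and the example matrices are hypothetical.

```python
import math


def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]  # work on a copy
    n = len(a)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            d = -d  # a row swap flips the sign
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return d


def box_m(cov_matrices, sizes):
    """Test statistic M, assuming (n_i - 1) weights as described above."""
    k = len(cov_matrices)
    m = len(cov_matrices[0])
    dof = sum(sizes) - k
    pooled = [[sum((n - 1) * S[r][c] for S, n in zip(cov_matrices, sizes)) / dof
               for c in range(m)] for r in range(m)]
    return dof * math.log(det(pooled)) - sum(
        (n - 1) * math.log(det(S)) for S, n in zip(cov_matrices, sizes))


# Hypothetical sample matrices; not data from the text.
S1 = [[4.0, 1.0], [1.0, 2.0]]
S2 = [[6.0, 2.0], [2.0, 3.0]]
M_same = box_m([S1, S1], [11, 21])  # identical matrices
M_diff = box_m([S1, S2], [11, 21])  # differing matrices
print(M_same, M_diff)
```

With identical sample matrices the statistic is zero apart from rounding error, and it becomes positive once the matrices differ.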
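Tables of critical values of M are not widely available, so the transformation
$$ C^{-1} = 1 - \frac{2m^2 + 3m - 1}{6(m + 1)(k - 1)} \left( \sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{\sum_{i=1}^{k} n_i - k} \right) \qquad (6.37) $$
can be used to convert M to an approximate $\chi^2$ statistic:
$$ \chi^2 \approx M C^{-1} \qquad (6.38) $$
The approximate $\chi^2$ value has degrees of freedom equal to $\nu = (1/2)(k - 1)m(m + 1)$, where m is the number of variables. If all
the samples contain the same number of observations, n, Equation (6.37) can be
simplified to
$$ C^{-1} = 1 - \frac{(2m^2 + 3m - 1)(k + 1)}{6(m + 1)\,k\,(n - 1)} \qquad (6.39) $$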
The $\chi^2$ approximation is good if the number of samples, k, and the number of variables, m, do
not exceed about 5 and each variance-covariance estimate is based on at least 20
observations.
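The conversion from M to the approximate chi-square statistic involves only arithmetic, as the following sketch suggests. It assumes the standard scaling factor for this test and degrees of freedom of (1/2)(k - 1)m(m + 1). The function names and the sample sizes are hypothetical, and the resulting value would still have to be compared with a tabled critical value.

```python
def chi_square_from_m(M, sizes, m):
    """Return (C^-1, approximate chi-square, degrees of freedom).

    sizes holds the k sample sizes n_i; m is the number of variables.
    Assumes the standard scaling factor for unequal sample sizes.
    """
    k = len(sizes)
    total_dof = sum(sizes) - k
    bracket = sum(1.0 / (n - 1) for n in sizes) - 1.0 / total_dof
    c_inv = 1.0 - (2 * m ** 2 + 3 * m - 1) / (6.0 * (m + 1) * (k - 1)) * bracket
    df = (k - 1) * m * (m + 1) // 2  # m(m + 1) is always even
    return c_inv, M * c_inv, df


def c_inv_equal_n(n, k, m):
    """Simplified scaling factor when every sample has n observations."""
    return 1.0 - (2 * m ** 2 + 3 * m - 1) * (k + 1) / (6.0 * (m + 1) * k * (n - 1))


# Hypothetical case: three samples of 20 observations on three variables.
c_inv, chi2, df = chi_square_from_m(1.0, [20, 20, 20], 3)
print(c_inv, c_inv_equal_n(20, 3, 3), df)
```

For equal sample sizes the general and simplified factors agree, which is a convenient check on either formula.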
To illustrate the process of hypothesis testing using multivariate statistics, we
will work through the following problem. Note that the number of observations is
just sufficient for some of the approximations to be strictly valid; we will consider
them to be adequate for the purposes of this demonstration.
In a local area in eastern Kansas, all potable water is obtained from wells. Some
of these wells draw from the alluvial fill in stream valleys, while others tap a limestone
aquifer that also is the source of numerous springs in the region. Residents
prefer to obtain water from the alluvium, as they feel it is of better quality. However,
the water resources of the alluvium are limited, and it would be desirable for
some users to obtain their supplies from the limestone aquifer.
In an attempt to demonstrate that the two sources are equivalent in quality, a
state agency sampled wells that tapped each source. The water samples were analyzed
for chemical compounds that affect the quality of water. Some of the data