Page 148 - Statistics and Data Analysis in Geology
In previous chapters we have considered the analysis of data consisting of only
a single variable measured on each specimen or observational unit. In Chapters 4
and 5 we also considered the influence of the temporal or geographic coordinates of
the sample points. We will now examine techniques for the analysis of multivariate
data, in which each observational unit is characterized by several variables.
Multivariate methods allow us to consider changes in several properties simultaneously.
Examples of data appropriate for multivariate analysis abound in geology. They
include chemical analyses, where the variables may be percentage compositions
or parts per million of trace elements; measures on streams, such as discharge,
suspended sediment load, depth, dissolved solids, pH, and oxygen content; and
paleontologic variables, perhaps a large number of measurements made on
specimens of an organism. Dozens of other examples quickly spring to mind. Some are
simple extensions of problems we have considered previously; others are entirely
new classes of problems.
Multivariate methods are extremely powerful, for they allow the researcher to
manipulate more variables than can otherwise be assimilated. They are complicated,
however, both in their theoretical structure and in their operational methodology.
For some of the procedures, statistical theory and tests have been worked
out only under the most restrictive assumptions. The nature and behavior
of the tests under more relaxed, general assumptions (such as those necessary for
most real-world problems) are poorly understood. In fact, some of the procedures
we will consider have no theoretical statistical basis at all, and tests of significance
have yet to be devised. Nevertheless, these methods seem to hold the most promise
for fruitful returns in geological investigations. Most of the problems in geology
involve complex and interacting forces that are impossible to isolate and study
individually. Often a meaningful decision as to the relative worth of one of a
number of possible variables cannot be made. The best course of action frequently is