Page 17 - Statistics and Data Analysis in Geology
Statistics and Data Analysis in Geology - Chapter 1
the time of this third edition, there are many easy-to-use interactive programs to
perform almost any desired statistical calculation; these programs have graphi-
cal interfaces and run on personal computers. In addition, there are inexpensive,
specialized programs for geostatistics, for analysis of compositional data, and for
other “nonstandard” procedures of interest to Earth scientists. Some of these are
distributed free or at nominal cost as “shareware.” Computation is no longer among
the major problems facing researchers today; they must be concerned, rather, with
interpretation and the appropriateness of their approach. As a consequence, this
third edition contains many more worked examples and also includes an extensive
library of problem sets accessible over the Internet.
The discussion in the following chapters begins with the basic topics of prob-
ability and elementary statistics, including the special steps necessary to analyze
compositional data, or variables such as chemical analyses and grain-size categories
that sum to a constant. The next topic is matrix algebra. Then we will consider the
analysis of various types of geologic data that have been classified arbitrarily into
three categories: (1) data in which the sequence of observations is important, (2)
data in which the two-dimensional relationships between observations are impor-
tant, and (3) multivariate data in which order and location of the observations are
not considered.
The first category contains all classes of problems in which data have been
collected along a continuum, either of time or distance. It includes time series,
calculation of semivariograms, analysis of stratigraphic sections, and the interpre-
tation of chart recordings such as well logs. The second category includes problems
in which spatial coordinates or geographic locations of samples are important, i.e.,
studies of shape and orientation, contour mapping, trend-surface analysis, geo-
statistics including kriging, and similar endeavors. The final category is concerned
with clustering, classification, and the examination of interrelations among vari-
ables in which sample locations on a map or traverse are not considered. Paleon-
tological, mineralogical, and geochemical data often are of this type.
The topics proceed from simple to complex. However, each successive topic is
built upon its predecessors, so aspects of multiple regression, covered in Chapter 6,
have been discussed in trend analysis (Chapter 5), which has in turn been preceded
by curvilinear regression (Chapter 4). The basic mathematical procedure involved
has been described under the solution of simultaneous equations (Chapter 3), and
the statistical basis of regression has first been discussed in Chapter 2. Other tech-
niques are similarly developed.
The first topic in the book is elementary statistics. The final topic is canonical
correlation. These two subjects are separated by a wide gulf that would require
several years to bridge following a typical course of study. Obviously, we can-
not cover this span in a single book without omitting a tremendous amount of
material. What has been sacrificed are all but the rudiments of statistical theory
associated with each of the techniques, the details of all mathematical operations
except those that are absolutely essential, and all the embellishments and refine-
ments that typically are added to the basic procedures. What has been retained are
the fundamental algorithms involved in each analysis, discussions of the relations
between quantitative techniques and example applications to geologic problems,
and references to sources for additional details.