Page 90 - Statistics and Data Analysis in Geology

Statistics and Data Analysis in  Geology - Chapter 4

             Can predictions or estimations be made from the data?  Can variables be related
             or their effectiveness measured?  Although such questions may not be explicitly
             posed in each of  the following discussions, you should examine the nature of  the
             methods  and think about their applicability and the type of  problems they may
             help solve.  The sample problems are only suggestions from the many that could
             be used.
                 Geologists are concerned not only with the analysis of  data in sequences, but
             also with the comparison of  two or more sequences. An obvious example is strati-
             graphic correlation, either of  measured sections or petrophysical well logs. A ge-
             ologist's motive for numerical correlation may be a simple desire for speed, as in
             the production of geologic cross-sections from digitized logs stored in data banks.
             Alternatively, he may be faced with a correlation problem where the recognition of
             equivalency is beyond his ability. Subtle degrees of similarity, too slight for unaided
             detection, may provide the clues that will allow him to make a decision where none
             is otherwise possible.  Numerical methods allow the geologist to consider many
             variables simultaneously, a powerful extension of his pattern-recognition facilities.
             Finally, because of  the absolute invariance in operation of  a computer program,
             mathematical correlation provides a challenge to the human interpreter. If a geol-
             ogist's correlation disagrees with that established by computer,  it is the geologist's
             responsibility to determine the reason for the discrepancy. The forced scrutiny may
             reveal complexities or biases not apparent during the initial examination. This is
             not to say that the geologist should unthinkingly bend his interpretation to con-
             form with that of  the computer. However, because modern programs for automatic
             correlation are increasingly able to mimic (and extend) the mental processes of  a
             human interpreter, their output must be considered seriously.
                 Most techniques for comparing two or more sequences can be grouped into two
             broad categories. In the first of  these, the data sequences are assumed to match at
             one position only, and we wish to determine the degree of  similarity between the
             two sequences. An example is the comparison of  an X-ray diffraction chart with
             a set of  standards in an attempt to identify an unknown mineral.  The chart and
             standards can be compared only in one position, where intensities at certain angles
             are compared to intensities of  the standards at the same angles. Nothing is gained,
             for example, by comparing X-ray intensity at 20° 2θ with the intensity at 30° 2θ on
             another chart. Although the correspondence may be high, it is meaningless.
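
                 As a minimal sketch of this kind of fixed-position comparison, the Python
             below scores an unknown pattern against each standard, angle by angle, and picks
             the most similar one. The Pearson correlation coefficient is used here only as
             one plausible similarity measure; the text does not prescribe it. The mineral
             names, the intensity values, and the helper names pearson and best_match are all
             hypothetical and chosen for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences, compared position by position.

    Raises ZeroDivisionError if either sequence is constant (zero variance).
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_match(unknown, standards):
    """Return the name of the standard most strongly correlated with the unknown."""
    return max(standards, key=lambda name: pearson(unknown, standards[name]))


# Hypothetical intensities, all measured at the same five diffraction angles.
standards = {
    "mineral_A": [5, 80, 10, 40, 5],
    "mineral_B": [60, 10, 30, 5, 70],
}
unknown = [8, 75, 12, 35, 6]

print(best_match(unknown, standards))
```

             Each intensity is paired only with the standard's intensity at the same angle;
             no shifted alignment of the two patterns is ever tried.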
                 The fact that  data such as these are in the form of  sequences is irrelevant,
             because each data point is considered to be a separate and distinct variable.  The
             intensity of diffracted radiation at 20° 2θ is one variable, and the intensity at 30° 2θ
             is another.  We  will consider methods for the comparisons of  such sequences in
             greater detail in Chapter 6, when we  discuss multivariate measures of  similarity
             and problems of  classification and discrimination. In this class of problems, an ob-
              servation's location in a sequence merely serves to identify it as a specific variable,
              and its location has no other significance.
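
                  One way to see that position acts only as a label in this class of problems
              is to reorder both sequences in the same way and recompute their similarity. The
              sketch below assumes the Pearson correlation as the similarity measure and uses
              hypothetical intensity values; the names pearson, chart, and standard are
              illustrative only.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical intensities at the same five angles on two charts.
chart = [8, 75, 12, 35, 6]
standard = [5, 80, 10, 40, 5]

# Apply one and the same reordering of the variables to both sequences.
order = list(range(len(chart)))
random.Random(0).shuffle(order)
chart_shuffled = [chart[i] for i in order]
standard_shuffled = [standard[i] for i in order]

r_original = pearson(chart, standard)
r_shuffled = pearson(chart_shuffled, standard_shuffled)
print(r_original, r_shuffled)
```

              The two coefficients agree to within rounding error, because the measure depends
              only on which values are paired and not on the order in which the pairs appear.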
                  In contrast, some of  the techniques we will discuss in this chapter regard data
              sequences as samples from a continuous string of  possible observations.  There
              is no a priori reason why one position of comparison should be better than any
              other. These methods of  cross comparison superficially resemble the mental pro-
              cess of geologic correlation, but have the limitation that they assume the distance
              or time scales of  the two sequences being compared are the same. In historic time
              series and sequences such as Holocene ice cores, this assumption is valid. In other
