Page 63 - Petrology of Sedimentary Rocks
discontinuity gives 12.5. $\chi^2 = (12.5)^2/31 = 156/31 = 5.0$. In the table, for $\chi^2 = 5.0$ and
d.f. = 1, P = .03; hence it is 97% certain that the new rubber is inferior.
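The table lookup can be checked numerically. For one degree of freedom the chi-square tail probability has a closed form, P = erfc(sqrt(chi-square / 2)), because chi-square with 1 d.f. is the square of a standard normal deviate. The sketch below applies that identity to the two figures quoted above (12.5 and 31); the function name is illustrative only, and the identity does not hold for more than one degree of freedom.

```python
import math


def chi_square_p_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    # With d.f. = 1, chi-square is a squared standard normal deviate,
    # so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


# Figures quoted in the text: corrected difference 12.5, divisor 31.
chi2 = 12.5 ** 2 / 31
p = chi_square_p_1df(chi2)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```

The computed probability falls slightly below the tabulated .03, which is what one expects from reading a coarse printed table.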
There is one serious caution about using the $\chi^2$ test. If the expected
frequency in any cell is less than 5, this cell must be combined with another to bring the
total expected frequency for the combined cells over 5. In the sample above, let’s say I
also counted 4 orange tourmalines and 3 yellow ones; in order that no expected cell
frequency be under 5, I would have to lump these with other rare types in the cell
labeled “others.”
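The pooling rule can be sketched as a small helper. This is a minimal illustration, not a prescribed procedure: the function name, the cell labels, and every frequency except the 4 orange and 3 yellow grains are hypothetical, and the counts are treated as expected frequencies for the sake of the example.

```python
def pool_small_cells(expected, minimum=5.0, pooled_label="others"):
    """Merge cells with expected frequency below `minimum` into one pooled cell."""
    kept = {}
    pooled_total = 0.0
    for label, freq in expected.items():
        # Rare cells, and any existing "others" cell, go into the pool.
        if freq < minimum or label == pooled_label:
            pooled_total += freq
        else:
            kept[label] = freq
    # If the pooled cell is still too small, absorb the smallest remaining cells.
    while kept and 0 < pooled_total < minimum:
        smallest = min(kept, key=kept.get)
        pooled_total += kept.pop(smallest)
    if pooled_total > 0:
        kept[pooled_label] = pooled_total
    return kept


# Hypothetical tourmaline color frequencies; only orange (4) and yellow (3)
# come from the text.
expected = {"brown": 40, "green": 25, "orange": 4, "yellow": 3, "others": 6}
pooled = pool_small_cells(expected)
print(pooled)
```

Pooling reduces the number of cells, so the degrees of freedom must be counted from the pooled table, not the original one.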
Other Techniques. Any statistical text will list many other valuable tests and
techniques. Some of these, of more interest to geologists, are simply mentioned here;
for details, go to the texts, e.g. Miller and Kahn, Snedecor, etc.
Much geologic data can be presented in the form of scatter plots, wherein we wish
to see how one property is related to another. Examples are plots of roundness versus
distance; mean grain size versus sorting; feldspar percentage versus stratigraphic
position; for a collection of dinosaur bones, length of thigh bone versus thickness of the
bones; zircon/tourmaline ratio versus grain size; percent carbonate mud versus round-
ness of shell fragments, etc. To analyze such associations, two main procedures are
applied: (1) the perfection of the association is tested, and (2) the equation of the
relationship is determined.
If the two properties are very closely related, they give a long narrow “train” of
points on a scatter diagram. If the two properties are not associated, a random
“buckshot” pattern emerges. The correlation coefficient, r, measures the perfection of
correlation. For perfect correlation r = 1.00, which means that, knowing one property,
we can predict the other property exactly, and that both increase together. An r of
-1.00 means perfect negative correlation, a correlation just as exact except that as one
property increases the other decreases. If the two properties are not correlated, r will
be near .00; weak correlation would be +.25 or -.15, etc. Coefficients beyond ±.50 are
considered “good” for most geological work. The normal correlation coefficient is valid
only for straight-line trends. Other methods must be used for hyperbolic, parabolic,
sinusoidal, etc., trends.
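As a minimal sketch of the calculation, the ordinary (Pearson product-moment) correlation coefficient is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name and all data below are hypothetical. The third data set follows a parabola exactly, yet gives r = 0, which illustrates why r is valid only for straight-line trends.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either property is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data sets.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])            # exact rising line
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])            # exact falling line
r_parabola = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])    # exact parabola
print(r_positive, r_negative, r_parabola)
```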
If only a small number of data points are available, it is possible for “good-looking”
correlations to arise purely by chance. Thus one should always refer to tables which
show whether the given value of r shows a significant correlation; this depends of
course on the number of samples and the value of r, and thus is similar in principle to the t
test.
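The connection to the t test can be made explicit. A standard result, not stated in the text, is that r from n pairs of observations converts to t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom; printed tables of critical r values are equivalent to this conversion. The sketch below uses a hypothetical function name and hypothetical sample sizes.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a correlation coefficient from n pairs to a t value (n - 2 d.f.)."""
    if n < 3:
        raise ValueError("at least 3 pairs are needed")
    if abs(r) >= 1.0:
        raise ValueError("t is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# The same r = .60 judged with two hypothetical sample sizes.
t_small = t_from_r(0.60, 5)    # 3 degrees of freedom
t_large = t_from_r(0.60, 30)   # 28 degrees of freedom
print(f"n = 5:  t = {t_small:.2f}")
print(f"n = 30: t = {t_large:.2f}")
```

At the two-tailed 5% level the tabulated critical t is about 3.18 for 3 d.f. and about 2.05 for 28 d.f., so r = .60 is not significant with 5 samples but is significant with 30.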
Squaring the correlation coefficient, r, gives the coefficient of causation, $r^2$ (more
commonly called the coefficient of determination); this
tells one how much of the variation in one property is explained by the variation in the
other property. For example, if we find that in a series of pebbles, the roundness shows
a correlation coefficient of r = +.60 with grain size, then $r^2$ = .36, and we can say that
36% of the variation in roundness is explained by changes in grain size (thus 64% of the
roundness variation would be due to other causes: differences in lithology, distances of
travel, “chance”, etc.). Further analyses may be carried out, such as partial or multiple
correlations, analysis of variance, etc.; see standard texts.
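The "variation explained" reading of the squared coefficient can be verified directly. In the sketch below, a straight line is fitted by ordinary least squares, and the fraction of the total sum of squares in y that the line removes is compared with the squared correlation coefficient; for a straight-line fit the two agree. The function name and the grain-size and roundness values are hypothetical, chosen only for illustration.

```python
def r_squared_two_ways(x, y):
    """Return (squared correlation, fraction of y-variation explained by the line)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    # Least-squares straight line through the points.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the line is fitted.
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    explained = 1.0 - ss_resid / syy
    return r_squared, explained


# Hypothetical pebble data: grain size (mm) and roundness.
grain_size = [2, 4, 6, 8, 10, 12]
roundness = [0.30, 0.45, 0.35, 0.55, 0.40, 0.60]
r_squared, explained = r_squared_two_ways(grain_size, roundness)
print(f"r squared = {r_squared:.3f}, explained fraction = {explained:.3f}")

# The arithmetic of the example in the text: r = +.60.
text_r_squared = round(0.60 ** 2, 2)            # explained share
text_unexplained = round(1 - text_r_squared, 2)  # share left to other causes
```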
A trend line may be fitted to a scatter diagram, and an equation may be fitted to
this line so that, given a value of one property, the other property may be predicted.
Trend lines can be drawn in by eye, but this process is usually sneered at; a
mathematical way of doing it is the “least squares” method. It is important to realize