Page 56 - Petrology of Sedimentary Rocks
tails; 40 heads or 60 heads would be less likely, and so on down to occurrences of 10
heads and 90 tails, which would be very few; a throw of 1 head and 99 tails would be
exceedingly rare. By tedious computations, one could figure the chances of throwing
any combination of heads and tails, and this is the useful feature of the normal
probability curve: the probabilities fall off at a definite, predictable rate which is fixed
by a mathematical equation. Furthermore, the curve is symmetrical about the mean--a
throw of 34 heads (50-16) is exactly as likely as a throw of 66 heads (50 + 16); and a
throw of 18 heads is just as likely as 82 heads.
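The "tedious computations" mentioned above are easy to sketch in a few lines. The block below is a minimal illustration, assuming a fair coin and 100 tosses; it uses the exact binomial count (which the normal curve approximates), and the function name `prob_heads` is hypothetical.

```python
from math import comb

def prob_heads(k, n=100):
    """Probability of exactly k heads in n tosses of a fair coin (binomial count)."""
    # comb(n, k) counts the ways to choose which k tosses land heads;
    # each particular sequence of n tosses has probability 1 / 2**n.
    return comb(n, k) / 2 ** n

# Probabilities fall off away from 50 heads and are symmetrical about it.
for k in (50, 40, 60, 34, 66, 18, 82, 10, 1):
    print(f"{k:3d} heads: {prob_heads(k):.3e}")
```

Comparing `prob_heads(34)` with `prob_heads(66)`, or `prob_heads(18)` with `prob_heads(82)`, shows the symmetry about the mean described in the text.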
Many types of data follow closely this curve which is defined by coin-tossing
experiments and rigidly fixed by an equation. Baseball batting averages; weights of
bolts turned out by a factory; life spans of electric light bulbs; mean daily temperatures
for any month if records are kept over some years; heights of people; densities of
granite samples; widths of brachiopod valves; slope angles of geomorphic features and
many others often follow this normal probability curve, provided enough data are
collected. For example, if one chose ten people and weighed them, his curve would be
rather irregular; by the time he weighed 1000 people it would be much smoother, and if
he weighed 10,000 the distribution would hew very closely to the normal curve.
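The effect of sample size can be illustrated with simulated data. The sketch below is only an illustration under stated assumptions: the weights are drawn from a hypothetical normal population (mean 70 kg, standard deviation 12 kg, both invented for the example), and the helper `max_cdf_gap` is a hypothetical name. It measures the largest gap between the cumulative curve of the sample and that of the normal curve.

```python
import random
from statistics import NormalDist

# Hypothetical population of body weights: mean 70 kg, standard deviation 12 kg.
POPULATION = NormalDist(mu=70.0, sigma=12.0)

def max_cdf_gap(n, seed=0):
    """Largest gap between the sample's cumulative curve and the normal curve."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = sorted(rng.gauss(POPULATION.mean, POPULATION.stdev) for _ in range(n))
    gap = 0.0
    for i, x in enumerate(sample):
        theoretical = POPULATION.cdf(x)
        # The sample's cumulative fraction jumps from i/n to (i+1)/n at x;
        # check the distance to the normal curve on both sides of the jump.
        gap = max(gap, abs(theoretical - i / n), abs((i + 1) / n - theoretical))
    return gap

for n in (10, 1000, 10000):
    print(n, round(max_cdf_gap(n), 4))
```

With only ten people the gap cannot be smaller than 0.05, whatever the sample; with 10,000 people it is typically far smaller, which is the sense in which the distribution "hews closely" to the normal curve.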
Many distributions do not follow the theoretical normal distribution, however.
One of the most common ways that a distribution departs from normality is in its lack
of symmetry. The graph of a normal distribution is a perfectly symmetrical bell-shaped
curve, with equal frequencies on both sides of the most common value (i.e. in the
100-coin toss, 25 heads are just as common as 75 heads). Many kinds of data are
asymmetrical, though. Consider the prices of houses in an average American city. The
most common price might be somewhere around $10,000. In this city there might be
many $25,000 homes; in order to have a symmetrical frequency distribution, this would
demand that there be many homes that cost minus $5000! This curve, then, would be
highly asymmetrical with the lowest value being perhaps $3000, the peak frequency at
about $10,000, and a long “tail” in the high values going out to perhaps $100,000 or even
more. The distribution of the length of time of long-distance telephone calls is also a
distribution of this type, since most calls last between two and three minutes, very few
are less than one minute, but there are some long-distance calls lasting as long as 15
minutes or even an hour. The frequency distribution of percentage of insoluble
materials in limestone samples is also a highly asymmetrical or skewed distribution,
with most limestones in one formation having, for example, between 5 and 10 percent
insoluble, but some samples having as much as 50 or 75% insoluble, with 0% as the
obvious minimum percentage.
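Asymmetry of this kind can be expressed as a single number. The sketch below uses the ordinary moment coefficient of skewness, which is an assumption made for illustration and not necessarily the measure used elsewhere in this book; the insoluble-residue percentages are invented.

```python
def skewness(values):
    """Moment coefficient of skewness: third central moment / (variance ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical percent insoluble residue in limestone samples from one formation:
# most values lie between 5 and 10, with a long tail toward high values.
insoluble = [5, 6, 6, 7, 7, 7, 8, 8, 9, 10, 12, 20, 50, 75]
symmetrical = [1, 2, 3, 4, 5]

print("skewed sample:     ", round(skewness(insoluble), 2))
print("symmetrical sample:", round(skewness(symmetrical), 2))
```

A positive value indicates a long tail toward high values, as in the house-price and limestone examples; a symmetrical set of values gives zero.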
A further way in which distributions depart from normality is that they may have
two or more peak frequencies (termed modes). If one took a large college building and
obtained the frequency distribution of ages of all people in the building at a given time,
he would find a curve that had two peaks (bimodal) instead of one. The highest peak
would be between 19 and 20 (the average age of students who would make up most of
the population), and another peak might occur at around 40 (the average age of the
professors), with a minimum at perhaps 30 (too young for most professors, and too old
for most students). This distribution would be distinctly non-normal; technically, it
would be said to have deficient kurtosis. In geology we would obtain a similarly non-
normal, bimodal distribution if we measured the sizes of crystals in a porphyritic
granite, or if we measured the percentage of quartz in a sand-shale sequence (sand beds
might be almost pure quartz, while the shale beds might have 20% or less).
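Modes can be picked out of a frequency table by looking for classes that are more populous than both of their neighbors. The sketch below is a minimal illustration; the age classes and counts are invented to resemble the college-building example, and `find_modes` is a hypothetical helper.

```python
def find_modes(counts):
    """Return indices of classes whose count exceeds both neighboring classes."""
    modes = []
    for i, c in enumerate(counts):
        # Classes at either end of the table have only one neighbor.
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i < len(counts) - 1 else -1
        if c > left and c > right:
            modes.append(i)
    return modes

# Hypothetical ages of people in a college building, in 5-year classes.
midpoints = [20, 25, 30, 35, 40, 45, 50, 55]   # class midpoints, in years
counts = [180, 20, 4, 10, 24, 14, 6, 2]        # number of people in each class

modes = [midpoints[i] for i in find_modes(counts)]
print("modes at ages:", modes)
```

On real data, small irregular classes can produce spurious peaks, so in practice one would use reasonably wide classes or a minimum count before calling a peak a mode.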
In analyzing frequency distributions, the most common measures used are the
arithmetic mean (or average) and the standard deviation (or degree of scatter about
the mean). Skewness and kurtosis are also used for special purposes (see pages 45-46).
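As a minimal sketch of these two measures, the block below computes the arithmetic mean and the standard deviation for a small set of granite densities. The density values are invented for illustration, and the sample form of the standard deviation (dividing by n - 1) is an assumption.

```python
from statistics import mean, stdev

# Hypothetical densities of granite samples, in g/cm^3.
densities = [2.61, 2.63, 2.65, 2.67, 2.69]

average = mean(densities)   # arithmetic mean
scatter = stdev(densities)  # sample standard deviation (divides by n - 1)

print(f"mean = {average:.3f} g/cm^3")
print(f"standard deviation = {scatter:.4f} g/cm^3")
```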