
A FEW STATISTICAL MEASURES FOR USE IN SEDIMENTARY PETROLOGY



       Introduction. A good knowledge of statistics is becoming essential for anyone who
wishes to work in any of the sciences, because the whole of scientific work, from laying
out the experiment to interpretation of the data, is based on statistics. Trying to use
numerical data without a knowledge of statistics is like trying to drive without a brake:
you never know where you will end up, and the odds are you will end up in the wrong
place and draw the wrong conclusion.

       In sedimentary petrography, statistics are used in laying out the sampling
program; in determining the best experimental technique for analysis; in collecting the
analytical data; and in drawing correct geological conclusions, such as: What is the
content of feldspar in X formation? Within what limits am I certain this value is
correct? What is the spread of values to be expected? Does X formation have more
feldspar than Y formation, and how confident am I of this? Does its heavy mineral
content differ significantly from that of formation Y? What is the relation between
grain size and zircon content, expressed mathematically?
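
       To make these questions concrete, here is a minimal sketch in Python (not part of
the original text) showing how several of them might be answered numerically. The
formation names and feldspar percentages are invented, and the SciPy library is assumed
to be available:

           import numpy as np
           from scipy import stats

           # Invented feldspar percentages from thin sections of two formations.
           x = np.array([22.1, 25.3, 19.8, 24.0, 21.5, 23.7, 20.9, 22.8])  # formation X
           y = np.array([17.2, 19.5, 16.8, 18.9, 20.1, 17.7, 18.4, 19.0])  # formation Y

           # What is the feldspar content of formation X?  The sample mean.
           mean_x = x.mean()

           # Within what limits am I certain this value is correct?  A 95%
           # confidence interval for the mean, based on Student's t distribution.
           sem_x = stats.sem(x)    # standard error of the mean
           lo, hi = stats.t.interval(0.95, df=len(x) - 1, loc=mean_x, scale=sem_x)

           # Does formation X have more feldspar than Y, and how confident am I?
           # A two-sample t test measures the significance of the difference.
           t_stat, p_value = stats.ttest_ind(x, y)

           print(f"mean = {mean_x:.1f}%  95% CI = ({lo:.1f}, {hi:.1f})  p = {p_value:.4f}")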

       This outline is not intended to make you an expert in statistics. It merely shows
examples of the use of statistics in petrography, in the hope that it will stimulate you
to take several courses or to read up on your own. It is super-simplified and condensed,
and therefore omits a lot of material that should really be covered. For further
information refer to any standard textbook, especially those written for geologists:
Miller and Kahn, 1962; Krumbein and Graybill, 1965; Griffiths, 1967; Koch and Link,
1972; Davis, 1973.


      The Normal Probability Curve. In order to understand some of the assumptions and
underlying principles, it is essential to study the statisticians’ most fundamental
concept, that of the normal probability curve. This is the basis for the study of
experimental data of all kinds.
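
      For reference, the equation of the curve is supplied here (it is not stated in the
text at this point): a normal distribution with mean \mu and standard deviation \sigma
has the density

           y = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / 2\sigma^2}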


      As a first step in the analysis of data from any field of science, one usually
constructs a frequency distribution. For example, if one is studying the batting
averages of baseball players, he would select convenient class intervals to divide the
entire range of data into about 10 to 20 classes and proceed to find how many batters
had averages between .200 and .210, how many between .210 and .220, and so on; here
the class interval would be .010. Or if an anthropologist were studying the lengths of
human thigh bones, he would first ascertain the spread between the largest and smallest
bone (say, for example, 12” to 31”), and divide this into a convenient number of classes.
Here a convenient class interval would be 1”, and he would proceed to find how many
thighbones were between 12” and 13”, how many between 13” and 14”, and so on. When
data of this type are plotted in the form of a histogram or frequency curve, it is
usually found that most of the items are clustered around the central part of the
distribution with a rapid “tailing off” in the extremes. For example, far more baseball
players hit between .260 and .280 than hit between .320 and .340, or between .180 and
.200. Even fewer hit between .100 and .120, or between .380 and .400. A great many
types of data follow this distribution, and the type of frequency curve resulting is called
the “probability curve,” or the “normal curve,” or often a “Gaussian curve” after Gauss,
who was a pioneer in the field. The curve is defined as the kind of distribution resulting
if one had 100 well-balanced coins and tossed them all repeatedly to count the number
of heads appearing. Naturally, the most frequent occurrence would be 50 heads and 50
tails.
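
      As a concrete illustration (not in the original text), the following Python sketch
carries out the class-interval tally described above on a set of invented batting
averages, using the .010 class interval from the example:

           from collections import Counter

           # Invented batting averages, standing in for the raw data.
           averages = [0.268, 0.251, 0.312, 0.243, 0.287, 0.265, 0.199,
                       0.274, 0.258, 0.301, 0.236, 0.282, 0.269, 0.224]

           interval = 0.010   # class interval, as in the .200-.210, .210-.220 example

           # Assign each average to the lower bound of its class, then tally
           # the classes to build the frequency distribution.
           classes = Counter(int(a / interval) * interval for a in averages)

           # Print a crude text histogram, one row of asterisks per class.
           for lower in sorted(classes):
               print(f"{lower:.3f}-{lower + interval:.3f}: {'*' * classes[lower]}")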




