Page 34 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R



              Therefore, the probability density function, f(x), must be such that:
              $\int_D f(x)\,dx = 1$, where D is the domain of the random variable.


              Similarly to the discrete case, the distribution function, F(x), is now defined as:

              $F(u) = P(X \le u) = \int_{-\infty}^{u} f(x)\,dx$.                  1.4
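
              The two conditions above can be checked numerically. The sketch below is
           only an illustration under stated assumptions: the density f(x) = 2x on
           D = [0, 1] is a hypothetical example (it does not come from the text), and
           the helper names `integrate` and `F` are ours. Because f is zero below 0,
           the lower limit −∞ of the integral can be replaced by 0.

```python
def f(x):
    """Hypothetical density (an assumption): f(x) = 2x on D = [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, a, b, n=10_000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h


def F(u, lower=0.0, upper=1.0):
    """Distribution function F(u) = P(X <= u) for a density supported on [lower, upper]."""
    if u <= lower:
        return 0.0
    return integrate(f, lower, min(u, upper))


total_area = integrate(f, 0.0, 1.0)
print(total_area)   # close to 1: the density integrates to one over D
print(F(0.5))       # close to 0.25, since F(u) = u**2 for this density
```

              For this density the exact distribution function is F(u) = u², so the
           numerical values can be compared directly with the closed form.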

              Sometimes the notations f_X(x) and F_X(x) are used, explicitly indicating the
           random variable to which the density and distribution functions refer.
              The  reader may wish to consult  Appendix  A in  order  to learn more about
           continuous density and  distribution functions.  Appendix B presents several
           important continuous distributions, including the most  popular, the  Gauss (or
           normal) distribution, with density function defined as:
              $n_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.                                       1.5

              This function uses two parameters,  µ and  σ, corresponding to the mean and
           standard deviation,  respectively. In  Appendices  A and B the reader finds a
           description of the most important aspects of the normal distribution, including the
           reason for its broad applicability.
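
              As an illustration, the normal density of formula 1.5 can be written
           directly as a function. This is a minimal sketch: the function name
           `normal_density` and the values µ = 100, σ = 5 are assumptions chosen for
           the example, not values taken from the text. The result is compared with
           `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma (formula 1.5)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


# Hypothetical parameter values, chosen only for illustration.
mu, sigma = 100.0, 5.0

for x in (90.0, 95.0, 100.0, 105.0, 110.0):
    ours = normal_density(x, mu, sigma)
    reference = NormalDist(mu, sigma).pdf(x)
    print(f"x = {x:6.1f}   density = {ours:.6f}   reference = {reference:.6f}")
```

              The density is symmetric about µ and reaches its maximum there; a larger
           σ gives a lower, wider curve.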



           1.5  Beyond a Reasonable Doubt...

           We often see movies where the jury of a court has to reach a verdict on whether
           the accused is “guilty” or “not guilty”. The verdict must be consensual and
           established beyond any reasonable doubt. Like the trial jury, the statistician
           also has to reach objectively based conclusions, “beyond any reasonable doubt”…
              Consider, for instance, the dataset of Example 1.3 and the statement “the 100 Ω
           electrical resistances, manufactured by the machine, have a (true) mean value in
           the interval [95, 105]”. If one could measure all the resistances manufactured by
           the  machine during its whole lifetime, one could compute the  population mean
           (true mean) and assign a True or False value to that statement, i.e., a conclusion
           with entire certainty would then be established. However, one usually has only
           a sample of the population available; therefore, the best one can produce is a
           conclusion of the type “… have a mean value in the interval [95, 105] with
           probability δ”; i.e., one has to deal not with total certainty but with a degree of
           certainty:

              P(mean ∈[95, 105]) = δ  = 1 – α .
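
              One way to get a feeling for such a degree of certainty is by simulation.
           The sketch below is not the procedure developed in this book; it rests on
           several assumptions made only for illustration: a normal population with
           true mean 100 and standard deviation 5, a known standard deviation, samples
           of n = 25, and an interval of the form sample mean ± z·σ/√n. The function
           name `coverage` and all parameter values are hypothetical. It estimates how
           often an interval built from a sample contains the true mean.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(true_mean=100.0, sigma=5.0, n=25, delta=0.95, trials=5000, seed=1):
    """Fraction of simulated samples whose interval contains the true mean.

    All default values are assumptions for illustration only.
    """
    rng = random.Random(seed)
    alpha = 1.0 - delta
    # Standard normal quantile leaving alpha/2 in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / math.sqrt(n)   # sigma is assumed known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = fmean(sample)
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / trials


estimated = coverage()
print(f"estimated coverage: {estimated:.3f}")   # should lie near delta = 0.95
```

              Under these assumptions the estimated fraction settles near δ as the
           number of simulated samples grows, and about a fraction α of the intervals
           miss the true mean.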

              We call δ (or 1–α) the confidence level (α is the error or significance level)
           and will often present it as a percentage (e.g. δ = 95%). We will learn how to
           establish confidence intervals based on sample statistics (sample mean in the above