Page 34 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
Therefore, the probability density function, f(x), must be such that:
$\int_D f(x)\,dx = 1$, where D is the domain of the random variable.
Similarly to the discrete case, the distribution function, F(x), is now defined as:
$$F(u) = P(X \le u) = \int_{-\infty}^{u} f(x)\,dx. \tag{1.4}$$
Sometimes the notations $f_X(x)$ and $F_X(x)$ are used, explicitly indicating the
random variable to which the density and distribution functions refer.
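The two integral conditions above can be checked numerically. The following is a minimal sketch, not taken from the book: it assumes an exponential density with rate 0.5 on D = [0, ∞) purely for illustration, and the helper names `density`, `integrate` and `distribution` are hypothetical.

```python
import math

def density(x, rate=0.5):
    """Exponential density on D = [0, inf); the rate is an illustrative assumption."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def integrate(f, a, b, n=20_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def distribution(u, lower=0.0):
    """F(u) = P(X <= u): integral of the density from the lower end of D up to u."""
    return integrate(density, lower, u) if u > lower else 0.0

# The upper limit is truncated at 60; the neglected tail mass is about exp(-30).
total_area = integrate(density, 0.0, 60.0)

# For this density the closed form is F(u) = 1 - exp(-rate * u).
f_at_2 = distribution(2.0)
exact_at_2 = 1.0 - math.exp(-0.5 * 2.0)

print(f"total area = {total_area:.6f}")
print(f"F(2) numeric = {f_at_2:.6f}, exact = {exact_at_2:.6f}")
```

The total area should be close to 1, and the numerical F(2) should agree with the closed form to several decimal places.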
The reader may wish to consult Appendix A in order to learn more about
continuous density and distribution functions. Appendix B presents several
important continuous distributions, including the most popular, the Gauss (or
normal) distribution, with density function defined as:
$$n_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}. \tag{1.5}$$
This function uses two parameters, µ and σ, corresponding to the mean and
standard deviation, respectively. In Appendices A and B the reader finds a
description of the most important aspects of the normal distribution, including the
reason for its broad applicability.
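As a quick illustration of the normal density defined above, the sketch below evaluates it directly, checks that it integrates to approximately 1, and compares one value with Python's standard-library `statistics.NormalDist`. The values µ = 100 and σ = 5 are assumptions chosen only for the example, and `normal_density` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def normal_density(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

mu, sigma = 100.0, 5.0  # illustrative values, not taken from the text

# Midpoint-rule area over mu +/- 10 sigma; the tails beyond that are negligible.
n = 20_000
a, b = mu - 10 * sigma, mu + 10 * sigma
h = (b - a) / n
area = h * sum(normal_density(a + (i + 0.5) * h, mu, sigma) for i in range(n))

# Cross-check one value against the standard library's implementation.
ours = normal_density(103.0, mu, sigma)
reference = NormalDist(mu, sigma).pdf(103.0)

print(f"area under the density = {area:.8f}")
print(f"density at 103: ours = {ours:.10f}, NormalDist = {reference:.10f}")
```

The density is symmetric about µ and peaks there, so `normal_density(mu - d, mu, sigma)` equals `normal_density(mu + d, mu, sigma)` for any offset d.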
1.5 Beyond a Reasonable Doubt...
We often see movies where the jury in a court has to reach a verdict as to whether
the accused is found “guilty” or “not guilty”. The verdict must be consensual and
established beyond any reasonable doubt. Like the trial jury, the statistician also
has to reach objectively based conclusions, “beyond any reasonable doubt”…
Consider, for instance, the dataset of Example 1.3 and the statement “the 100 Ω
electrical resistances, manufactured by the machine, have a (true) mean value in
the interval [95, 105]”. If one could measure all the resistances manufactured by
the machine during its whole lifetime, one could compute the population mean
(true mean) and assign a True or False value to that statement, i.e., a conclusion
with entire certainty would then be established. However, one usually has only
a sample of the population available; therefore, the best one can produce is a
conclusion of the type “… have a mean value in the interval [95, 105] with
probability δ”; i.e., one has to deal not with total certainty but with a degree of
certainty:
P(mean ∈[95, 105]) = δ = 1 – α .
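One way to get a feel for a degree of certainty δ = 1 − α is by simulation. The sketch below is only an illustration under stated assumptions, not the procedure developed in the book: it assumes a hypothetical normal population with true mean 100 and standard deviation 5, samples of size 50, and a normal-approximation interval of the form sample mean ± z·s/√n. The function names `interval_for_mean` and `coverage` are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

def interval_for_mean(sample, delta=0.95):
    """Normal-approximation interval: sample mean +/- z * s / sqrt(n)."""
    alpha = 1.0 - delta
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for delta = 0.95
    half_width = z * stdev(sample) / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width

def coverage(true_mean=100.0, true_sd=5.0, n=50, trials=2000, delta=0.95, seed=1):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Assumed population: normal with the given (hypothetical) mean and sd.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = interval_for_mean(sample, delta)
        hits += low <= true_mean <= high
    return hits / trials

observed = coverage()
print(f"fraction of intervals containing the true mean: {observed:.3f}")
```

Under these assumptions the observed fraction should come out near 0.95, slightly below it because the normal quantile is used with an estimated standard deviation.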
We call δ (or 1–α ) the confidence level (α is the error or significance level)
and will often present it as a percentage (e.g. δ = 95%). We will learn how to
establish confidence intervals based on sample statistics (sample mean in the above