Page 104 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

3.1 Point Estimation and Interval Estimation


   When estimating a data parameter, the point estimate is usually insufficient. In fact, in all cases where the point estimator is characterised by a probability density function, the probability that the point estimate actually equals the true value of the parameter is zero. Using the spring scales analogy, we see that no matter how accurate and precise the scales are, the probability of obtaining the exact weight (with an arbitrarily large number of digits) is zero. We need, therefore, to attach to the point estimate some measure of its possible error. For that purpose, we attempt to determine an interval, called the confidence interval, containing the true parameter value θ with a given probability 1 − α, the so-called confidence level:

   $P\big(t_{1,n}(\mathbf{x}) < \theta < t_{2,n}(\mathbf{x})\big) = 1 - \alpha$ ,              3.1

where α is the confidence risk.
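As a sketch of definition 3.1, the following snippet computes the confidence limits $t_{1,n}(\mathbf{x})$ and $t_{2,n}(\mathbf{x})$ for the mean of a normal sample with known σ. The use of Python, the helper name `mean_confidence_interval` and the sample values are assumptions of this note; the book's own examples use SPSS, STATISTICA, MATLAB and R.

```python
# Hedged sketch: two-sided confidence interval for a normal mean with
# known sigma (illustrative Python; the book works with SPSS,
# STATISTICA, MATLAB and R instead).
import math
from statistics import NormalDist

def mean_confidence_interval(sample, sigma, alpha=0.05):
    """Return (t1, t2) such that P(t1 < mu < t2) = 1 - alpha."""
    n = len(sample)
    xbar = sum(sample) / n
    # z-percentile for 1 - alpha/2 (e.g. 1.96 for alpha = 0.05)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Illustrative data (an assumption): five weighings with sigma = 0.2
t1, t2 = mean_confidence_interval([9.8, 10.1, 10.0, 9.9, 10.2], sigma=0.2)
```

Note that the interval endpoints depend on the sample only through its mean, which is why they are written as functions of $\mathbf{x}$ in 3.1.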
   The endpoints of the interval (also known as confidence limits) depend on the available sample and are determined taking into account the sampling distribution:

   $F_T(x) \equiv F_{t_n(\mathbf{X})}(x)$ .

   We have assumed that the interval endpoints are finite, the so-called two-sided (or two-tail) interval estimation. Sometimes we will also use one-sided (or one-tail) interval estimation by setting $t_{1,n}(\mathbf{x}) = -\infty$ or $t_{2,n}(\mathbf{x}) = +\infty$ .
   Let us now apply these ideas to the spring scales example. Imagine that, as happens with unbiased point estimators, there is no systematic error and, furthermore, that the measurement errors follow a known normal distribution; the measurement error is therefore a one-dimensional random variable distributed as $N_{0,\sigma}$, with known σ. In other words, the distribution function of the random weight variable, W, is $F_W(w) \equiv F(w) = N_{\omega,\sigma}(w)$. We are now able to determine the two-sided 95% confidence interval of ω, given a measurement w, by first noticing, from the normal distribution tables, that the 97.5% percentile (i.e., 100 − α/2, with α in percentage) corresponds to 1.96σ.
              Thus:

   $N_{0,\sigma}(w) = 0.975 \;\Rightarrow\; w_{0.975} = 1.96\sigma$ .              3.2
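The 1.96σ value in 3.2 can be verified numerically; a minimal sketch, with Python's standard library standing in for the normal distribution tables (the chosen σ is an illustrative assumption):

```python
# Hedged numerical check of the 97.5% normal percentile used in 3.2
# (Python stand-in for the normal distribution tables).
from statistics import NormalDist

# 97.5% percentile of the standard normal distribution
z = NormalDist(mu=0.0, sigma=1.0).inv_cdf(0.975)
print(round(z, 2))  # → 1.96

# For a measurement error distributed as N(0, sigma), the same
# percentile scales with sigma: w_0.975 = 1.96 * sigma.
sigma = 0.5  # illustrative value (an assumption)
w975 = NormalDist(mu=0.0, sigma=sigma).inv_cdf(0.975)
```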

   Given the symmetry of the normal distribution, we have:

   $P(w < \omega + 1.96\sigma) = 0.975 \;\Rightarrow\; P(\omega - 1.96\sigma < w < \omega + 1.96\sigma) = 0.95$ ,

           leading to the following 95% confidence interval:

   $\omega - 1.96\sigma < w < \omega + 1.96\sigma$ .                                 3.3

   Hence, we expect that, in a long run of measurements, 95% of them will fall inside the interval ω ± 1.96σ, as shown in Figure 3.2a.
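This long-run behaviour can be checked by simulation; a minimal sketch, where the true weight ω, the scale error σ and the number of measurements are all illustrative assumptions of this note:

```python
# Hedged simulation of the long-run claim: roughly 95% of measurements
# w ~ N(omega, sigma) fall inside omega +/- 1.96*sigma.
import random

random.seed(0)             # reproducible run
omega, sigma = 10.0, 0.5   # illustrative "true" weight and error spread
n = 100_000                # number of simulated measurements
inside = sum(
    abs(random.gauss(omega, sigma) - omega) < 1.96 * sigma
    for _ in range(n)
)
coverage = inside / n
print(coverage)  # close to 0.95
```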
              Note that the inequalities 3.3 can also be written as: