Page 340 - Fundamentals of Probability and Statistics for Engineers

Model Verification                                              323

           doing so, however, a complication arises in that theoretical probabilities $p_i$
           defined by Equation (10.2) are, being functions of the distribution parameters,
           functions of the sample. The statistic D now takes the form

$$D = \sum_{i=1}^{k} \frac{n}{\hat{P}_i}\left(\frac{N_i}{n} - \hat{P}_i\right)^2 = \sum_{i=1}^{k} \frac{N_i^2}{n\hat{P}_i} - n, \tag{10.10}$$
           where $\hat{P}_i$ is an estimator for $p_i$ and is thus a statistic. We see that D is now
           a much more complicated function of $X_1, X_2, \ldots, X_n$. The important question
           to be answered is: what is the new distribution of D?
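
The equality of the two forms of D in Equation (10.10) can be checked by expanding the square. The sketch below assumes, as in the setup above, that the k intervals cover the whole range space, so that the counts $N_i$ sum to n and the estimated probabilities $\hat{P}_i$ sum to 1:

```latex
\begin{align*}
\sum_{i=1}^{k} \frac{n}{\hat{P}_i}\left(\frac{N_i}{n} - \hat{P}_i\right)^2
  &= \sum_{i=1}^{k} \frac{n}{\hat{P}_i}
     \left(\frac{N_i^2}{n^2} - \frac{2 N_i \hat{P}_i}{n} + \hat{P}_i^2\right) \\
  &= \sum_{i=1}^{k} \frac{N_i^2}{n\hat{P}_i} - 2\sum_{i=1}^{k} N_i + n\sum_{i=1}^{k} \hat{P}_i \\
  &= \sum_{i=1}^{k} \frac{N_i^2}{n\hat{P}_i} - 2n + n
   = \sum_{i=1}^{k} \frac{N_i^2}{n\hat{P}_i} - n .
\end{align*}
```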
             The problem of determining the limiting distribution of D in this situation
           was first considered by Fisher (1922, 1924), who showed that, as $n \to \infty$, the
           distribution of D needs to be modified, and the modification obviously depends
           on the method of parameter estimation used. Fortunately, for a class of
           important methods of estimation, such as the maximum likelihood method,
           the modification required is a simple one, namely, statistic D still approaches a
           chi-squared distribution as $n \to \infty$ but now with $(k - r - 1)$ degrees of freedom,
           where r is the number of parameters in the hypothesized distribution to be
           estimated. In other words, it is only necessary to reduce the number of degrees
           of freedom in the limiting distribution defined by Equation (10.5) by one for
           each parameter estimated from the sample.
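
As a small illustration of this rule, the degrees-of-freedom count can be written as a helper. This is only a sketch: the function name `limiting_degrees_of_freedom` and the values k = 6, r = 1 are hypothetical and chosen for illustration.

```python
def limiting_degrees_of_freedom(k: int, r: int) -> int:
    """Degrees of freedom of the limiting chi-squared distribution of D.

    k: number of intervals A_1, ..., A_k
    r: number of distribution parameters estimated from the sample
    """
    df = k - r - 1  # k - 1 for known parameters, minus one per estimated parameter
    if df <= 0:
        raise ValueError("need k - r - 1 > 0")
    return df


# Hypothetical illustration: 6 intervals, 1 estimated parameter.
print(limiting_degrees_of_freedom(6, 1))  # prints 4
```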
             We can now state a step-by-step procedure for the case in which r parameters
           in the distribution are to be estimated from the data.
           .  Step 1: divide range space X into k mutually exclusive and numerically
            convenient intervals $A_i$, $i = 1, \ldots, k$. Let $n_i$ be the number of sample values
            falling into $A_i$. As a rule, if the number of sample values in any $A_i$ is less than 5,
            combine interval $A_i$ with either $A_{i-1}$ or $A_{i+1}$.

           .  Step 2: estimate the r parameters by the method of maximum likelihood from
            the data.
           .  Step 3: compute theoretical probabilities $P(A_i) = p_i$, $i = 1, \ldots, k$, by means of
            the hypothesized distribution with estimated parameter values.
           .  Step  4: construct d as given by Equation (10.7).
           .  Step 5: choose a value of $\alpha$ and determine from Table A.5 for the $\chi^2$
            distribution of $(k - r - 1)$ degrees of freedom the value of $\chi^2_{k-r-1,\alpha}$. It is
            assumed, of course, that $k - r - 1 > 0$.
           .  Step 6: reject hypothesis H if $d > \chi^2_{k-r-1,\alpha}$. Otherwise, accept H.
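
The six steps above can be sketched in Python for one concrete case, a hypothesized Poisson distribution with r = 1 estimated parameter. Several things here are assumptions rather than part of the text: the function names `chi2_statistic` and `poisson_gof`, the way intervals are passed in as `upper_edges`, and the form of d, which is taken to be the sample-value version of Equation (10.10) because Equation (10.7) is not reproduced on this page. The critical value is supplied by the caller from a chi-squared table rather than computed. The sample is made up for illustration and is not the data of Table 10.4.

```python
import bisect
import math


def chi2_statistic(counts, probs):
    """d = sum(n_i^2 / (n * p_i)) - n, the assumed sample value of D."""
    n = sum(counts)
    return sum(c * c / (n * p) for c, p in zip(counts, probs)) - n


def poisson_gof(sample, upper_edges, chi2_critical):
    """Steps 1-6 for a hypothesized Poisson distribution (r = 1).

    upper_edges: sorted integers u_1 < ... < u_{k-1}, so that
        A_1 = {0..u_1}, A_i = {u_{i-1}+1..u_i}, A_k = {x > u_{k-1}}.
    chi2_critical: table value for k - r - 1 degrees of freedom at the
        chosen alpha, looked up by the caller.
    """
    n = len(sample)
    k = len(upper_edges) + 1
    r = 1
    df = k - r - 1
    if df <= 0:
        raise ValueError("need k - r - 1 > 0")

    # Step 1: count the sample values falling into each interval.
    counts = [0] * k
    for x in sample:
        counts[bisect.bisect_left(upper_edges, x)] += 1
    if min(counts) < 5:
        # This sketch leaves the merging of intervals to the caller.
        raise ValueError("an interval holds fewer than 5 values; merge intervals")

    # Step 2: maximum likelihood estimate of the Poisson mean (the sample mean).
    lam = sum(sample) / n

    # Step 3: interval probabilities under the fitted distribution.
    def pmf(x):
        return math.exp(-lam) * lam ** x / math.factorial(x)

    probs, lo = [], 0
    for u in upper_edges:
        probs.append(sum(pmf(x) for x in range(lo, u + 1)))
        lo = u + 1
    probs.append(1.0 - sum(probs))  # open-ended last interval

    # Step 4: the test statistic.
    d = chi2_statistic(counts, probs)

    # Steps 5-6: compare d with the table value and decide.
    return {"lambda_hat": lam, "counts": counts, "probs": probs,
            "d": d, "df": df, "reject": d > chi2_critical}


# Hypothetical counts per minute (made up; NOT the data of Table 10.4).
sample = [0] * 5 + [1] * 10 + [2] * 12 + [3] * 8 + [4] * 5
# 7.815 is the tabulated chi-squared value for 3 degrees of freedom, alpha = 0.05.
result = poisson_gof(sample, upper_edges=[0, 1, 2, 3], chi2_critical=7.815)
print(result["df"], round(result["d"], 3), result["reject"])
```

Passing the critical value in as an argument mirrors Step 5, where the value is read from a table, and keeps the sketch free of third-party dependencies.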

             Example 10.3. Problem: vehicle arrivals at a toll gate on the New York State
           Thruway were recorded. The vehicle counts at one-minute intervals were taken
           for 106 minutes and are given in Table 10.4. On the basis of these observations,
           determine whether a Poisson distribution is appropriate for X, the number of
           arrivals per minute, at the 5% significance level.






