Page 252 - Applied Statistics And Probability For Engineers







                                   6-7 PROBABILITY PLOTS


                                   probability distribution as verifying assumptions. In other cases, the form of the distribution
                                   can give insight into the underlying physical mechanism generating the data. For example, in
                                   reliability engineering, verifying that time-to-failure data come from an exponential distri-
                                   bution identifies the failure mechanism in the sense that the failure rate is constant with
                                   respect to time.
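
To see why an exponential time-to-failure distribution corresponds to a constant failure rate, a short derivation may help. It is a sketch that assumes the usual definition of the failure (hazard) rate, h(t) = f(t)/[1 - F(t)], and writes the exponential parameter as \lambda; neither the symbol h nor this definition appears in the passage above.

```latex
\[
f(t) = \lambda e^{-\lambda t}, \qquad F(t) = 1 - e^{-\lambda t}, \qquad t \ge 0
\]
\[
h(t) = \frac{f(t)}{1 - F(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda
\]
```

Under these assumptions the failure rate does not depend on t, which is the sense in which it is constant with respect to time.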
                                       Some of the visual displays we have used earlier, such as the histogram, can provide
                                   insight about the form of the underlying distribution. However, histograms are usually not
                                   reliable indicators of the distribution form unless the sample size is very large.
                                   Probability plotting is a graphical method for determining whether sample data conform
                                   to a hypothesized distribution based on a subjective visual examination of the data. The gen-
                                   eral procedure is very simple and can be performed quickly. It is also more reliable than the
                                   histogram for small to moderate size samples. Probability plotting typically uses special
                                   graph paper, known as  probability paper, that has been designed for the hypothesized
                                   distribution. Probability paper is widely available for the normal, lognormal, Weibull, and
                                   various chi-square and gamma distributions. We focus primarily on normal probability plots
                                   because many statistical techniques are appropriate only when the population is (at least ap-
                                   proximately) normal.
                                       To construct a probability plot, the observations in the sample are first ranked from
                                   smallest to largest. That is, the sample $x_1, x_2, \ldots, x_n$ is arranged as $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, where
                                   $x_{(1)}$ is the smallest observation, $x_{(2)}$ is the second smallest observation, and so forth, with $x_{(n)}$
                                   the largest. The ordered observations $x_{(j)}$ are then plotted against their observed cumulative
                                   frequency $(j - 0.5)/n$ on the appropriate probability paper. If the hypothesized distribution
                                   adequately describes the data, the plotted points will fall approximately along a straight line;
                                   if the plotted points deviate significantly from a straight line, the hypothesized model is not
                                   appropriate. Usually, the determination of whether or not the data plot as a straight line is
                                   subjective. The procedure is illustrated in the following example.
                 EXAMPLE 6-7       Ten observations on the effective service life in minutes of batteries used in a portable
                                   personal computer are as follows: 176, 191, 214, 220, 205, 192, 201, 190, 183, 185. We
                                   hypothesize that battery life is adequately modeled by a normal distribution. To use probabil-
                                   ity plotting to investigate this hypothesis, first arrange the observations in ascending order and
                                   calculate their cumulative frequencies $(j - 0.5)/10$ as shown in Table 6-6.
                                       The pairs of values $x_{(j)}$ and $(j - 0.5)/10$ are now plotted on normal probability paper.
                                   This plot is shown in Fig. 6-19. Most normal probability paper plots $100(j - 0.5)/n$ on the left
                                   vertical scale and $100[1 - (j - 0.5)/n]$ on the right vertical scale, with the variable value plotted
                                   on the horizontal scale. A straight line, chosen subjectively, has been drawn through the plotted
                                   points. In drawing the straight line, you should be influenced more by the points near the
                                   middle of the plot than by the extreme points. A good rule of thumb is to draw the line approxi-
                                   mately between the 25th and 75th percentile points. This is how the line in Fig. 6-19 was deter-
                                   mined. In assessing the “closeness” of the points to the straight line, imagine a “fat pencil” lying
                                   along the line. If all the points are covered by this imaginary pencil, a normal distribution ade-
                                   quately describes the data. Since the points in Fig. 6-19 would pass the “fat pencil” test, we con-
                                   clude that the normal distribution is an appropriate model.
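
The ranking and cumulative-frequency steps of this example can be sketched in Python. This is a minimal illustration computed from the ten observations quoted above, not a reproduction of the book's Table 6-6; the helper name `cumulative_frequencies` is hypothetical and introduced here only for the sketch.

```python
def cumulative_frequencies(sample):
    """Return (j, x_(j), (j - 0.5)/n) for each ordered observation."""
    ordered = sorted(sample)  # x_(1) <= x_(2) <= ... <= x_(n)
    n = len(ordered)
    return [(j, x, (j - 0.5) / n) for j, x in enumerate(ordered, start=1)]


# Battery service lives in minutes, as listed in Example 6-7
battery_life = [176, 191, 214, 220, 205, 192, 201, 190, 183, 185]

rows = cumulative_frequencies(battery_life)
for j, x, freq in rows:
    print(f"{j:2d}  {x:4d}  {freq:.2f}")
```

With n = 10, the cumulative frequencies run from 0.05 for the smallest observation (176) to 0.95 for the largest (220), in steps of 0.10.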

                                       A normal probability plot can also be constructed on ordinary graph paper by plotting
                                   the standardized normal scores $z_j$ against $x_{(j)}$, where the standardized normal scores satisfy

                                       \[ \frac{j - 0.5}{n} = P(Z \le z_j) = \Phi(z_j) \]
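
The standardized normal scores can be obtained by inverting the standard normal cumulative distribution function at each cumulative frequency. The sketch below does this for the battery data of Example 6-7 using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `normal_scores` is hypothetical, and the code is an illustration rather than the book's own procedure.

```python
from statistics import NormalDist


def normal_scores(sample):
    """Pair each ordered observation x_(j) with z_j, where Phi(z_j) = (j - 0.5)/n."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (x, std_normal.inv_cdf((j - 0.5) / n))
        for j, x in enumerate(ordered, start=1)
    ]


# Battery service lives in minutes, as listed in Example 6-7
battery_life = [176, 191, 214, 220, 205, 192, 201, 190, 183, 185]

scores = normal_scores(battery_life)
for x, z in scores:
    print(f"{x:4d}  {z:6.2f}")
```

Plotting each z score against its observation on ordinary axes gives the same visual check as probability paper. For these data the smallest observation, 176, pairs with a score of about -1.64, the 5th percentile of the standard normal distribution.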