Page 235 - Applied statistics and probability for engineers



                     6-3  Frequency Distributions and Histograms

                                         A frequency distribution is a more compact summary of data than a stem-and-leaf diagram. To
                                         construct a frequency distribution, we must divide the range of the data into intervals, which are
                                         usually called class intervals, cells, or bins. If possible, the bins should be of equal width in order
                                         to enhance the visual information in the frequency distribution. Some judgment must be used in
                                         selecting the number of bins so that a reasonable display can be developed. The number of bins
                                         depends on the number of observations and the amount of scatter or dispersion in the data. A
                                         frequency distribution that uses either too few or too many bins will not be informative. We usually
                                         find that between 5 and 20 bins is satisfactory in most cases and that the number of bins should
                                         increase with n. Several sets of rules can be used to determine the number of bins in a histogram.
                                         However, choosing the number of bins approximately equal to the square root of the number of
                                         observations often works well in practice.
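
The square-root guideline above can be sketched in a few lines of Python. This is only an illustration: the function name `sqrt_rule_bins` is hypothetical, and clamping the result to the 5-to-20 range is an assumption that combines the two rules of thumb stated in the text.

```python
import math


def sqrt_rule_bins(n, lowest=5, highest=20):
    """Suggest a bin count near sqrt(n).

    Clamping to [lowest, highest] is an assumption of this sketch,
    based on the 5-to-20 guideline in the text.
    """
    k = round(math.sqrt(n))
    return max(lowest, min(highest, k))


# The compressive strength data set has 80 observations.
suggested = sqrt_rule_bins(80)
print(suggested)
```

The result is a starting point only; the analyst still adjusts the count so that the bin boundaries fall on convenient values.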
                                            A frequency distribution for the compressive strength data in Table 6-2 is shown in
                                         Table 6-4. Because the data set contains 80 observations, and because √80 ≈ 9, we suspect
                                         that about eight to nine bins will provide a satisfactory frequency distribution. The largest and
                                         smallest data values are 245 and 76, respectively, so the bins must cover a range of at least
                                         245 − 76 = 169 units on the psi scale. If we want the lower limit for the first bin to begin
                                         slightly below the smallest data value and the upper limit for the last bin to be slightly above
                                         the largest data value, we might start the frequency distribution at 70 and end it at 250. This
                                         is an interval or range of 180 psi units. Nine bins, each of width 20 psi, give a reasonable
                                         frequency distribution, so the frequency distribution in Table 6-4 is based on nine bins.
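
The arithmetic in this paragraph can be checked with a short sketch. The helper names `bin_edges` and `bin_index` are hypothetical. The half-open convention (lower limit included, upper limit excluded) is an assumption taken from the class labels in Table 6-4.

```python
def bin_edges(start, stop, width):
    """Return equally spaced bin boundaries from start to stop inclusive."""
    count, remainder = divmod(stop - start, width)
    if remainder != 0:
        raise ValueError("width must divide the covered range evenly")
    return [start + i * width for i in range(count + 1)]


def bin_index(x, edges):
    """Index of the half-open bin [edges[i], edges[i + 1]) that contains x."""
    if not edges[0] <= x < edges[-1]:
        raise ValueError("value lies outside the covered range")
    width = edges[1] - edges[0]
    return int((x - edges[0]) // width)


smallest, largest = 76, 245        # extreme data values quoted in the text
data_range = largest - smallest    # 169 psi
edges = bin_edges(70, 250, 20)     # 70, 90, ..., 250
number_of_bins = len(edges) - 1    # 9

print(data_range, number_of_bins, edges)
```

Starting at 70 and ending at 250 leaves a little room on both sides of the data, so the smallest value lands in the first bin and the largest in the last.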
                      Choosing the Number of Bins in a Frequency Distribution or Histogram is Important

                                            The second row of Table 6-4 contains a relative frequency distribution. The relative
                                         frequencies are found by dividing the observed frequency in each bin by the total number of
                                         observations. The last row of Table 6-4 expresses the relative frequencies on a cumulative
                                         basis. Frequency distributions are often easier to interpret than tables of data. For example,
                                         from Table 6-4, it is very easy to see that most of the specimens have compressive strengths
                                         between 130 and 190 psi and that 97.5 percent of the specimens fall below 230 psi.
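
As a minimal sketch, the relative and cumulative relative frequencies can be recomputed from the bin counts listed in Table 6-4. The variable names are illustrative. Accumulating the integer counts before dividing is a design choice of this sketch that avoids accumulating floating-point rounding error.

```python
from itertools import accumulate

# Observed frequencies for the nine bins of Table 6-4 (70 to 250 psi, width 20).
frequencies = [2, 3, 6, 14, 22, 17, 10, 4, 2]
n = sum(frequencies)  # total number of observations

relative = [f / n for f in frequencies]
cumulative = [c / n for c in accumulate(frequencies)]

# Share of specimens with strength in [130, 190): bins 4 through 6.
share_130_to_190 = sum(frequencies[3:6]) / n

# Share of specimens below 230 psi: cumulative value of the eighth bin.
share_below_230 = cumulative[7]

for rel, cum in zip(relative, cumulative):
    print(f"{rel:.4f}  {cum:.4f}")
```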
                                            The histogram is a visual display of the frequency distribution. The steps for constructing
                                         a histogram follow.

                      Constructing a Histogram (Equal Bin Widths)

                                             (1)  Label the bin (class interval) boundaries on a horizontal scale.
                                             (2)  Mark and label the vertical scale with the frequencies or the relative frequencies.
                                             (3)  Above each bin, draw a rectangle whose height is equal to the frequency (or relative
                                                  frequency) corresponding to that bin.
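
The three steps can be imitated with a plain-text sketch that needs no plotting library. It is only an illustration: the function name `text_histogram` is hypothetical, and the bars are drawn horizontally, so the display is rotated a quarter turn relative to the histogram described in the steps. The frequencies are those listed in Table 6-4.

```python
def text_histogram(edges, frequencies, symbol="#"):
    """Return one text line per bin; the bar length equals the bin frequency."""
    lines = []
    for lower, upper, freq in zip(edges, edges[1:], frequencies):
        label = f"{lower:>3} <= x < {upper:<3}"    # step 1: bin boundaries
        bar = symbol * freq                        # step 3: bar sized by frequency
        lines.append(f"{label} | {bar} {freq}")    # step 2: frequency shown as a number
    return lines


edges = list(range(70, 251, 20))                   # 70, 90, ..., 250
frequencies = [2, 3, 6, 14, 22, 17, 10, 4, 2]      # counts from Table 6-4

lines = text_histogram(edges, frequencies)
for line in lines:
    print(line)
```

Passing relative frequencies scaled to whole numbers instead of raw counts would give bars of the same shape, because all bins have equal width.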


                                            Figure 6-7 is the histogram for the compressive strength data. The histogram, like the
                                         stem-and-leaf diagram, provides a visual impression of the shape of the distribution of the
                                         measurements and information about the central tendency and scatter or dispersion in the data.

                      Table 6-4   Frequency Distribution for the Compressive Strength Data in Table 6-2
                      Class                           70 ≤ x < 90    90 ≤ x < 110   110 ≤ x < 130  130 ≤ x < 150  150 ≤ x < 170  170 ≤ x < 190  190 ≤ x < 210  210 ≤ x < 230  230 ≤ x < 250
                      Frequency                       2              3              6              14             22             17             10             4              2
                      Relative frequency              0.0250         0.0375         0.0750         0.1750         0.2750         0.2125         0.1250         0.0500         0.0250
                      Cumulative relative frequency   0.0250         0.0625         0.1375         0.3125         0.5875         0.8000         0.9250         0.9750         1.0000