Page 325 - Statistics for Environmental Engineers











                       [Histogram: Frequency (0 to 200) versus Simulated Sample Variance, $s^2$ (0 to 150)]


                       FIGURE 37.8  Distribution of 1000 simulated sample variances, each calculated using three observations drawn at random
                       from a normal distribution with $\sigma^2 = 25$. The average of the 1000 simulated values is 25.3, with 30 variances above 100
                       and 190 variances of five or less.

                        Figure 37.8 is the histogram of 1000 sample variances, each calculated using three observations drawn
                       from a normal distribution with $\sigma^2 = 25$. The average of the simulated sample variances was 25.3, with
                       30 values above 100 and 190 values of five or less. This is the range of variation in $s_i^2$ for sample size n = 3.
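The simulation behind Figure 37.8 can be repeated in a few lines. The following is a minimal sketch, not the authors' code: the function name, the seed, and the use of Python's standard library are assumptions made for illustration, and the counts will differ from run to run and from the values quoted above.

```python
import random
import statistics


def simulate_sample_variances(n_sets=1000, n_obs=3, sigma2=25.0, seed=1):
    """Return n_sets sample variances, each computed from n_obs
    independent draws from a normal distribution with variance sigma2."""
    rng = random.Random(seed)  # arbitrary seed, used only for repeatability
    sigma = sigma2 ** 0.5
    return [
        # statistics.variance uses the n - 1 divisor (sample variance)
        statistics.variance([rng.gauss(0.0, sigma) for _ in range(n_obs)])
        for _ in range(n_sets)
    ]


variances = simulate_sample_variances()
mean_var = statistics.mean(variances)
n_above_100 = sum(v > 100 for v in variances)
n_five_or_less = sum(v <= 5 for v in variances)

print(f"average of simulated variances: {mean_var:.1f}")
print(f"variances above 100: {n_above_100}")
print(f"variances of five or less: {n_five_or_less}")
```

With n = 3 the sample variance has two degrees of freedom, so $s^2$ is distributed as $\sigma^2 \chi^2_2 / 2$, which is an exponential distribution with mean 25. On that basis roughly 18% of the simulated variances are expected to be five or less and roughly 2% to exceed 100, which is of the same order as the counts reported for Figure 37.8.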
                        A formal comparison of the equality of two sample variances uses the F statistic. Comparing two sample
                       variances, each estimated with three degrees of freedom, would use the upper 5% value of $F_{3,3} = 9.28$. If
                       the ratio of the larger to the smaller of the two variances is less than this F value, the two variances would be
                       considered equal. For $F_{3,3} = 9.28$, this would include variances from 25/9.28 = 2.7 to 25(9.28) = 232.
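The arithmetic of this comparison is sketched below. The critical value 9.28 is taken from the text rather than computed, and the helper names are hypothetical, introduced only for illustration.

```python
F_CRIT_3_3 = 9.28  # upper 5% value of F with (3, 3) degrees of freedom, as quoted in the text


def variance_ratio(s2_a, s2_b):
    """Ratio of the larger to the smaller of two sample variances.
    Both variances must be positive."""
    return max(s2_a, s2_b) / min(s2_a, s2_b)


def consistent_with_equal_variance(s2_a, s2_b, f_crit=F_CRIT_3_3):
    """True if the variance ratio is below the critical F value."""
    return variance_ratio(s2_a, s2_b) < f_crit


sigma2 = 25.0
lower = sigma2 / F_CRIT_3_3
upper = sigma2 * F_CRIT_3_3
print(f"variances from {lower:.1f} to {upper:.0f} are not distinguishable from {sigma2:.0f}")

# Each of these is compared against the reference variance of 25.
print(consistent_with_equal_variance(3.0, sigma2))    # ratio 25/3 = 8.33
print(consistent_with_equal_variance(200.0, sigma2))  # ratio 200/25 = 8.0
```

Both example ratios fall below 9.28, so neither variance would be judged different from 25 by this criterion.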
                        This shows that the variance of repeat observations in a calibration experiment will be quite variable
                       due to random experimental error. If triplicate observations in a calibration experiment did have true
                       constant variance $\sigma^2 = 25$, replicates at one concentration level could have $s^2 = 3$, and at another level
                       (not necessarily a higher concentration) the variance could be $s^2 = 200$. Therefore, our interest is not in
                       "unchanging" variance, but rather in the pattern of change over the range of x or y. If change from one
                       level of y to another is random, the variances are probably just reflecting random sampling error. If the
                       variance increases in proportion to one of the variables, weighted least squares should be used.
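As an illustration of the last point, here is a minimal sketch of a weighted least squares fit of a straight line, with each weight taken as the reciprocal of the assumed variance. The data are hypothetical, and the choice of weights 1/x (variance proportional to x) is an assumption made for the example, not a recommendation from the text.

```python
def weighted_line_fit(x, y, w):
    """Fit y = b0 + b1*x by minimizing sum(w_i * (y_i - b0 - b1*x_i)**2).
    Returns (b0, b1)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical calibration data: standard concentration x, instrument response y.
x = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
y = [2.1, 3.9, 10.4, 19.5, 41.0, 98.0]

# Assumption: variance proportional to x, so weights are 1/x.
weights = [1.0 / xi for xi in x]

b0, b1 = weighted_line_fit(x, y, weights)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

Setting all weights to 1 reproduces the ordinary (unweighted) least squares line, which makes it easy to compare the two fits on the same data.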
                        Making the slopes in Figure 37.7 integer values was justified by saying that the variance is estimated
                       with low precision when there are only three replicates. Box (personal communication) has shown that
                       the percent error in the variance is % error = $100/\sqrt{2\nu}$, where $\nu$ is the degrees of freedom. From this,
                       about 200 observations of y would be needed to estimate the variance with an error of 5%.
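The figure of about 200 observations can be checked directly. The sketch below reads the formula as 100 divided by the square root of 2ν, which is the reading consistent with the 5% result quoted above; the function names are hypothetical.

```python
import math


def percent_error(nu):
    """Approximate percent error for nu degrees of freedom: 100 / sqrt(2 * nu)."""
    return 100.0 / math.sqrt(2.0 * nu)


def dof_needed(target_percent):
    """Degrees of freedom at which the percent error equals target_percent."""
    return (100.0 / target_percent) ** 2 / 2.0


# Triplicate observations leave nu = n - 1 = 2 degrees of freedom.
print(f"nu = 2: {percent_error(2):.0f}% error")
print(f"5% error requires nu = {dof_needed(5.0):.0f}")
```

By this formula, triplicates give an error of about 50%, which is consistent with the wide scatter of the simulated variances in Figure 37.8.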




                       Comments

                       Nonconstant variance may occur in a variety of situations. It is common in calibration data because they
                       cover a wide range of concentration, and also because certain measurement errors tend to be multiplicative
                       instead of additive.
                        Using unweighted least squares when there is nonconstant variance will distort all calculated t statistics,
                       confidence intervals, and prediction intervals. It will lead to wrong decisions about the form of the
                       calibration model and which parameters should be included in the model, and give biased estimates of
                       analyte concentrations.
                        The appropriate weights can be determined from the data if replicate measurements have been made
                       at some settings of x. These should be true replicates and not merely multiple measurements on the same
                       standard solution.
                        If there is no replication, one may falsely assume that the variance is constant when it is not. If you
                       suspect nonconstant variance, based on prior experience or knowledge about an instrument, apply reasonable
                       weights. Any reasonable weighting is likely to be better than none.
                       © 2002 By CRC Press LLC