Page 110 - Modern Analytical Chemistry

                                                                               Chapter 4 Evaluating Analytical Data  93

                     SOLUTION
                     This is an example of a paired data set since the acquisition of samples over an
                     extended period introduces a substantial time-dependent change in the
                     concentration of monensin. The comparison of the two methods must be done
                     with the paired t-test, using the following null and two-tailed alternative
                     hypotheses
                     $$ H_0\colon \bar{d} = 0 \qquad H_A\colon \bar{d} \neq 0 $$
                     Defining the difference between the methods as

                     $$ d = X_{\text{elect}} - X_{\text{micro}} $$
                     we can calculate the difference for each sample
                     Sample  1     2    3    4    5    6    7     8    9    10   11
                       d     2.8  1.4  –3.0  6.0  –6.6  –0.5  9.7  12.7  –1.6  4.0  –0.2
                     The mean and standard deviation for the differences are 2.25 and 5.63,
                     respectively. The test statistic is

                     $$ t_{\text{exp}} = \frac{\bar{d}\sqrt{n}}{s_d} = \frac{(2.25)\sqrt{11}}{5.63} = 1.33 $$
                     which is smaller than the critical value of 2.23 for t(0.05, 10). Thus, the null
                     hypothesis is retained, and there is no evidence that the two methods yield
                     different results at the stated significance level.
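
The arithmetic in this solution can be checked with a short script. The following is a minimal sketch in Python using only the standard library; the variable names are illustrative, and the critical value is simply copied from the text rather than computed.

```python
import math
import statistics

# Differences d = X_elect - X_micro for samples 1-11, taken from the table above.
d = [2.8, 1.4, -3.0, 6.0, -6.6, -0.5, 9.7, 12.7, -1.6, 4.0, -0.2]

n = len(d)
d_bar = statistics.mean(d)    # mean of the differences
s_d = statistics.stdev(d)     # sample standard deviation (n - 1 in the denominator)

# Test statistic for the paired t-test.
t_exp = d_bar * math.sqrt(n) / s_d

# Critical value t(0.05, 10) as quoted in the text (not computed here).
T_CRIT = 2.23

# Two-tailed test: compare the magnitude of t_exp with the critical value.
reject_null = abs(t_exp) > T_CRIT

print(f"mean = {d_bar:.2f}, s_d = {s_d:.2f}, t_exp = {t_exp:.2f}")
print("reject H0" if reject_null else "retain H0")
```

Carrying unrounded intermediate values gives a test statistic of about 1.32; the 1.33 in the text follows from using the rounded mean and standard deviation. Either way the value is below 2.23, so the conclusion is unchanged.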


                     A paired t-test can only be applied when the individual differences, $d_i$, belong
                 to the same population. This will only be true if the determinate and indeterminate
                 errors affecting the results are independent of the concentration of analyte in the
                 samples. If this is not the case, a single sample with a larger error could result in a
                 value of $d_i$ that is substantially larger than that for the remaining samples. Including
                 this sample in the calculation of $\bar{d}$ and $s_d$ leads to a biased estimate of the true
                 mean and standard deviation. For samples that span a limited range of analyte
                 concentrations, such as that in Example 4.21, this is rarely a problem. When paired data
                 span a wide range of concentrations, however, the magnitude of the determinate and
                 indeterminate sources of error may not be independent of the analyte’s concentration.
                 In such cases the paired t-test may give misleading results since the paired data
                 with the largest absolute determinate and indeterminate errors will dominate $\bar{d}$. In
                 this situation a comparison is best made using a linear regression, details of which
                 are discussed in the next chapter.
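
To see how one high-concentration sample can dominate the mean difference, consider the sketch below. The concentrations and the 2% proportional bias are hypothetical values invented purely for illustration; they are not data from the text.

```python
# Hypothetical illustration only: these concentrations and the 2% proportional
# bias are assumptions chosen to show the effect, not measured data.
concentrations = [1.0, 2.0, 5.0, 10.0, 1000.0]
RELATIVE_BIAS = 0.02

# d_i for each sample when one method reads 2% high at every concentration,
# so the error grows with the analyte's concentration.
differences = [RELATIVE_BIAS * c for c in concentrations]

# Mean difference over all samples.
d_bar = sum(differences) / len(differences)

# Fraction of the summed differences contributed by the single largest one.
largest = max(differences, key=abs)
share_of_sum = abs(largest) / sum(abs(x) for x in differences)

# Mean difference when the largest difference is left out.
d_bar_without = (sum(differences) - largest) / (len(differences) - 1)

print(f"mean d with all samples:      {d_bar:.3f}")
print(f"mean d without largest d_i:   {d_bar_without:.3f}")
print(f"share of sum from largest d_i: {share_of_sum:.1%}")
```

In this made-up case the sample at the highest concentration accounts for nearly all of the summed differences, so the mean difference reflects that one sample rather than the set as a whole.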

                 4F.5 Outliers
                 On occasion, a data set appears to be skewed by the presence of one or more data
                 points that are not consistent with the remaining data points. Such values are called
                 outliers. The most commonly used significance test for identifying outliers is Dixon’s
                 Q-test. The null hypothesis is that the apparent outlier is taken from the same
                 population as the remaining data. The alternative hypothesis is that the outlier comes
                 from a different population, and, therefore, should be excluded from consideration.

                 outlier: Data point whose value is much larger or smaller than the remaining data.
                 Dixon’s Q-test: Statistical test for deciding if an outlier can be removed from a set of data.

                     The Q-test compares the difference between the suspected outlier and its nearest
                 numerical neighbor to the range of the entire data set. Data are ranked from
                 smallest to largest so that the suspected outlier is either the first or the last data