Page 180 - Statistics II for Dummies

                       Part III: Analyzing Variance with ANOVA
                                  Breaking down the variance into sums of squares


                                  Step one of the F-test is splitting up the variability in the y variable into
                                  portions that define where the variability is coming from. Each portion of
                                  variability is called a sum of squares. The term analysis of variance is a great
                                  description for exactly how you conduct a test of k population means. With
                                  the overall goal of testing whether k population (or treatment) means are
                                  equal, you take a random sample from each of the k populations.

                                  You first put all the data together into one big group and measure how much
                                  total variability there is; this variability is called the sums of squares total, or
                                  SSTO. If the data are really diverse, SSTO is large. If the data are very similar,
                                  SSTO is small.

                                  You can split the total variability in the combined data set (SSTO) into two
                                  parts:

                                   ✓ SST: The variability between the groups, known as the sums of squares
                                      for treatment
                                   ✓ SSE: The variability within the groups, known as the sums of squares
                                      for error

                                  Splitting up the variability in your data results in one of the most important
                                  equalities in ANOVA:

                                      SSTO = SST + SSE
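
                                  The equality can be checked numerically. The sketch below is only an
                                  illustration: the three samples are made-up data, the variable names are
                                  hypothetical, and the three sums of squares are computed from the
                                  formulas given in this section.

```python
# Hypothetical data: one small sample per treatment (values made up for illustration).
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [1.0, 2.0, 3.0],
]

# Pool every value into one big group and find the overall mean.
all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# Total variability: squared distance of every value from the overall mean.
ssto = sum((x - grand_mean) ** 2 for x in all_values)

# Between-group variability: squared distance of each sample mean from the
# overall mean, counted once per value in that sample.
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-group variability: squared distance of each value from its own sample mean.
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ssto, sst, sse)  # 60.0 54.0 6.0
```

                                  For these samples most of the total variability (54 out of 60) sits between
                                  the groups, and only a small part (6) sits within them.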
                                  The formula for SSTO is the numerator of the formula for $s^2$, the variance
                                  of a single data set, so $\text{SSTO} = \sum_{i}\sum_{j}(x_{ij} - \bar{x})^2$,
                                  where $x_{ij}$ is the $j$th value in the sample from the $i$th population and
                                  $\bar{x}$ is the overall sample mean (the mean of the entire data set). So, in
                                  terms of ANOVA, SSTO is the total squared distance between the data values
                                  and their overall mean.

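
                                  Because SSTO is the numerator of the sample variance, it should equal the
                                  variance of the combined data multiplied by the total number of values minus
                                  one. The sketch below checks that on two made-up data sets (the function name
                                  ssto and both lists are hypothetical), one with similar values and one with
                                  diverse values.

```python
import statistics

def ssto(values):
    """Total squared distance between the values and their overall mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

# Hypothetical combined data sets with the same mean (10) but different spread.
similar = [10.0, 10.5, 9.5, 10.0, 10.0]
diverse = [2.0, 18.0, 7.0, 13.0, 10.0]

for data in (similar, diverse):
    # statistics.variance is the sample variance s^2 (denominator n - 1),
    # so multiplying by (n - 1) recovers its numerator.
    from_variance = statistics.variance(data) * (len(data) - 1)
    print(ssto(data), from_variance)
```

                                  The diverse data set produces a much larger SSTO than the similar one, even
                                  though both have the same mean.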
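                                  The formula for SST is $\text{SST} = \sum_{i} n_i(\bar{x}_i - \bar{x})^2$,
                                  where $n_i$ is the size of the sample coming from the $i$th population,
                                  $\bar{x}_i$ is the mean of that sample, and $\bar{x}$ is the overall sample
                                  mean. SST represents the total squared distance between the means from each
                                  sample and the overall mean, with each sample's squared distance counted
                                  $n_i$ times.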
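
                                  A minimal sketch of the SST calculation follows, assuming two made-up samples
                                  of unequal size so that the role of the sample size is visible. The function
                                  name sst and the data are hypothetical.

```python
def sst(groups):
    """Squared distance of each sample mean from the overall mean, times sample size."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    total = 0.0
    for g in groups:
        n_i = len(g)              # size of the sample from the i-th population
        mean_i = sum(g) / n_i     # mean of that sample
        total += n_i * (mean_i - grand_mean) ** 2
    return total

# Hypothetical samples: sizes 2 and 4, sample means 3 and 9, overall mean 7.
groups = [[2.0, 4.0], [6.0, 8.0, 10.0, 12.0]]
print(sst(groups))  # 2 * (3 - 7)^2 + 4 * (9 - 7)^2 = 48.0

# When every sample mean equals the overall mean, SST is zero.
print(sst([[1.0, 3.0], [2.0, 2.0]]))  # 0.0
```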

                                  The formula for SSE is $\text{SSE} = \sum_{i}\sum_{j}(x_{ij} - \bar{x}_i)^2$,
                                  where $x_{ij}$ is the $j$th value in the sample from the $i$th population and
                                  $\bar{x}_i$ is the mean of the sample coming from the $i$th population. This
                                  formula represents the total squared distance between the values in each
                                  sample and their corresponding sample means.
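
                                  As a rough illustration, the sketch below computes SSE for two made-up samples
                                  and compares it with the sum of each sample's variance numerator,
                                  $(n_i - 1)s_i^2$, which should give the same number. The function name sse and
                                  the data are hypothetical.

```python
import statistics

def sse(groups):
    """Squared distance of every value from the mean of its own sample."""
    total = 0.0
    for g in groups:
        mean_i = sum(g) / len(g)
        total += sum((x - mean_i) ** 2 for x in g)
    return total

# Hypothetical samples with means 3 and 9.
groups = [[2.0, 4.0], [6.0, 8.0, 10.0, 12.0]]
print(sse(groups))  # (1 + 1) + (9 + 1 + 1 + 9) = 22.0

# Same quantity built from each sample's own variance numerator.
pooled = sum((len(g) - 1) * statistics.variance(g) for g in groups)

# Shifting an entire sample changes its mean but not its spread, so SSE stays the same.
shifted = [[102.0, 104.0], [6.0, 8.0, 10.0, 12.0]]
```

                                  Because SSE only looks inside each sample, moving a whole sample up or down
                                  leaves it unchanged; such a shift shows up in SST instead.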
                                  Using algebra, you can confirm (with some serious elbow grease) that SSTO =
                                  SST + SSE.
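
                                  The algebra can be sketched as follows. The notation is an assumption added
                                  for this sketch: $k$ samples, with sample $i$ holding $n_i$ values $x_{ij}$,
                                  sample mean $\bar{x}_i$, and overall mean $\bar{x}$. The idea is to add and
                                  subtract the sample mean inside each squared term and then expand.

```latex
% Assumed notation: k samples; sample i has n_i values x_{ij},
% sample mean \bar{x}_i, and \bar{x} is the overall mean.
\begin{align*}
\text{SSTO}
  &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij}-\bar{x})^2 \\
  &= \sum_{i=1}^{k}\sum_{j=1}^{n_i}
     \bigl[(x_{ij}-\bar{x}_i) + (\bar{x}_i-\bar{x})\bigr]^2 \\
  &= \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij}-\bar{x}_i)^2}_{\text{SSE}}
   + \underbrace{\sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})^2}_{\text{SST}}
   + 2\sum_{i=1}^{k}(\bar{x}_i-\bar{x})\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i) \\
  &= \text{SSE} + \text{SST}.
\end{align*}
```

                                  The cross term drops out because, within each sample, the deviations from the
                                  sample mean add up to zero: $\sum_{j}(x_{ij}-\bar{x}_i) = 0$.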




