Page 163 - Statistics and Data Analysis in Geology

Statistics and Data Analysis in  Geology - Chapter 6









              [Plot omitted; horizontal axis: raw discriminant scores, labeled from -335 to -365]

              Figure 6-3. Projection of beach and offshore sands onto discriminant function line shown
                   in Figure 6-2. R_A is projection of bivariate mean of beach sands, R_B is projection of
                   bivariate mean of offshore sands, and R_0 is discriminant index.



              group B side of R_0 and a few members of group B are located on the group A
              side. These are observations that have been misclassified by the discriminant func-
              tion.  The misclassification ratio, or percent of  observations that the discriminant
              function places into the wrong group, is sometimes taken as an indication of  the
              function's discriminatory power. However, the misclassification ratio is biased and
              can be misleading because it is calculated by reusing the observations that were
              used to estimate the coefficients of  the discriminant function in the first place. It
              seems likely that the function may be less successful in correctly classifying new
              observations.  Reyment and Savazzi (1999) discuss alternative ways of  evaluating
              the goodness of  a discriminant function.
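
The bias described above can be illustrated with a minimal sketch. It compares the misclassification count obtained by reusing the fitting observations with a leave-one-out count, in which each observation is classified by a function fitted without it. Leave-one-out is used here only as one common alternative; it is an assumption of this sketch, not necessarily the procedure Reyment and Savazzi recommend. The data values and all function names are hypothetical and are not the sand measurements from the text.

```python
def mean_vec(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def scatter(rows, m):
    """2 x 2 sums of squares and cross products about the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit(group_a, group_b):
    """Return (coefficients, discriminant index) for two bivariate groups."""
    ma, mb = mean_vec(group_a), mean_vec(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    # Pooled variance-covariance matrix.
    s = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    # Coefficients = S^-1 (mean_a - mean_b), using the closed-form 2 x 2 inverse.
    coef = [(s[1][1] * d[0] - s[0][1] * d[1]) / det,
            (-s[1][0] * d[0] + s[0][0] * d[1]) / det]
    r_a = coef[0] * ma[0] + coef[1] * ma[1]
    r_b = coef[0] * mb[0] + coef[1] * mb[1]
    return coef, (r_a + r_b) / 2.0   # index is midway between projected means

def is_group_a(x, coef, r0):
    """Group A projects to the high side of the index with this sign convention."""
    return coef[0] * x[0] + coef[1] * x[1] > r0

def resubstitution_errors(group_a, group_b):
    """Misclassifications when the fitting observations are reused."""
    coef, r0 = fit(group_a, group_b)
    wrong = sum(not is_group_a(x, coef, r0) for x in group_a)
    wrong += sum(is_group_a(x, coef, r0) for x in group_b)
    return wrong

def leave_one_out_errors(group_a, group_b):
    """Misclassifications when each observation is withheld from the fit."""
    wrong = 0
    for i, x in enumerate(group_a):
        coef, r0 = fit(group_a[:i] + group_a[i + 1:], group_b)
        wrong += not is_group_a(x, coef, r0)
    for i, x in enumerate(group_b):
        coef, r0 = fit(group_a, group_b[:i] + group_b[i + 1:])
        wrong += is_group_a(x, coef, r0)
    return wrong

# Hypothetical measurements (two variables per sample), invented for illustration.
group_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.6), (2.2, 2.1), (3.0, 3.2), (2.6, 1.5)]
group_b = [(2.4, 3.0), (3.0, 2.4), (3.5, 3.6), (2.1, 2.9), (4.0, 3.1), (3.2, 4.0)]

print("resubstitution errors:", resubstitution_errors(group_a, group_b))
print("leave-one-out errors: ", leave_one_out_errors(group_a, group_b))
```

Dividing either count by the total number of observations gives the corresponding misclassification ratio. The leave-one-out count is typically equal to or larger than the resubstitution count, which is the sense in which the reused-data ratio is optimistic.
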
                  We have calculated the raw discriminant function which yields raw scores
              whose units are products of the units of measurement attached to the original vari-
              ables.  There actually are an infinity of  discriminant functions that will maximize
              the difference between  the two groups, but  all of  these  alternatives are propor-
              tional to the classical, or raw, solution. If A is the vector of  coefficients determined
              by Equation (6.14), then all sets cA (where c is an arbitrary nonzero constant) will serve
              equally well. Although different computer programs may yield sets of  coefficients
              that seem to be different, all of  them are proportional to each other.  Alternative
              choices include:

               1.  The raw coefficients are divided by the pooled mean squares within groups, or

                                                 c = MS_w^{-1}
                  where
                                                 MS_w = A'SA
                  This standardizes the coefficients to dimensionless z-scores.
               2.  The raw coefficients are first divided by MS_w, then rescaled by dividing every
                  coefficient by the first coefficient, which becomes equal to 1.
               3.  Each raw coefficient is divided by the square root of  the sum of  the squared
                  raw coefficients, or

                                                 c = (A'A)^{-1/2}

                  The sum of the squares of  the transformed coefficients will then be equal to 1.
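
The three rescalings listed above can be sketched in a few lines. This is a minimal illustration, assuming A is held as a plain list of raw coefficients and S as a square matrix given as a list of rows; the numerical values and the function names are hypothetical and do not come from the text.

```python
import math

def quad_form(a, s):
    """Return a'Sa for a coefficient list a and a square matrix s (list of rows)."""
    n = len(a)
    return sum(a[i] * s[i][j] * a[j] for i in range(n) for j in range(n))

def rescale(a, s):
    """Return the three rescaled coefficient sets, in the order of the list above."""
    ms_w = quad_form(a, s)                              # MS_w = A'SA
    by_ms_w = [x / ms_w for x in a]                     # choice 1: divide by MS_w
    first_is_one = [x / by_ms_w[0] for x in by_ms_w]    # choice 2: first coefficient = 1
    norm = math.sqrt(sum(x * x for x in a))             # square root of A'A
    unit_length = [x / norm for x in a]                 # choice 3: squares sum to 1
    return by_ms_w, first_is_one, unit_length

# Hypothetical raw coefficients and matrix, chosen only to keep the arithmetic simple.
A = [3.0, 4.0]
S = [[2.0, 0.5], [0.5, 1.0]]

for name, coefs in zip(("divided by MS_w", "first coefficient = 1", "unit length"),
                       rescale(A, S)):
    # Every set is a constant multiple of A, so this ratio is the same each time.
    print(name, coefs, "ratio:", coefs[1] / coefs[0])
```

Because each set is just cA for a different constant c, the ratio between any two coefficients is unchanged, which is why outputs from different programs can look different yet describe the same discriminant direction.
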



