Page 95 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

74       2 Presenting and Summarising the Data


              Note that the denominator of φ ensures a value in the interval [−1, 1], as with
           the correlation coefficient, with +1 representing a perfect positive association and
           −1 a perfect negative association. In fact, the phi coefficient is a special case of
           the Pearson correlation: it is the value obtained when the two dichotomous
           variables are coded numerically (e.g. as 0 and 1).

           Table 2.11. A general cross table for the bivariate dichotomous case.
                             y1               y2                 Total
             x1              a                b                  a + b
             x2              c                d                  c + d
             Total          a + c            b + d           a + b + c + d


           Example 2.9
           Q: Consider the 2×2 contingency table for the variables SEX and INIT of the
           Freshmen dataset, shown in Table 2.12. Compute their phi coefficient.
           A: The value of phi computed with formula 2.26 is 0.15, suggesting a very low
           degree of association. The significance of phi values will be discussed in Chapter 5.


           Table 2.12. Cross table (obtained with SPSS) of variables SEX and INIT of the
           Freshmen dataset.
                                                    INIT              Total
                                              yes          no
           SEX     male     Count              91          5           96
                            % of Total       69.5%        3.8%       73.3%
                   female   Count              30          5           35
                            % of Total       22.9%        3.8%       26.7%
           Total            Count             121          10         131
                            % of Total       92.4%        7.6%       100.0%
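
           The value reported in Example 2.9 can be checked directly from the counts of
           Table 2.12. The following is a minimal Python sketch, not part of the original
           text. It assumes the standard 2×2 phi expression, (ad − bc) divided by the square
           root of the product of the four marginal totals, with a, b, c, d laid out as in
           Table 2.11; this is taken to coincide with formula 2.26. The function names
           phi_coefficient and pearson are illustrative only. The sketch also expands the
           table into 0/1-coded observations to illustrate that the Pearson correlation of
           the coded variables gives the same value.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table with rows (a, b) and (c, d), as in Table 2.11."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Counts from Table 2.12: rows are SEX (male, female), columns are INIT (yes, no).
a, b, c, d = 91, 5, 30, 5

phi = phi_coefficient(a, b, c, d)

# Expand the table into one observation per case, coding
# SEX: male = 1, female = 0 and INIT: yes = 1, no = 0 (an arbitrary coding choice).
sex = [1] * (a + b) + [0] * (c + d)
init = [1] * a + [0] * b + [1] * c + [0] * d
r = pearson(sex, init)

print(f"phi     = {phi:.2f}")
print(f"pearson = {r:.2f}")
```

           Swapping the 0/1 coding of one of the two variables changes the sign of the
           result but not its magnitude, which is why the sign of phi depends on how the
           categories are ordered in the cross table.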



           2.3.6.2  The Lambda Statistic

           Another useful measure of association, for multivariate nominal data, attempts to
           evaluate how well one of the variables predicts the outcome of the other variable.
           This measure is applicable to any nominal variables, either dichotomous or not. We
           will explain it using Table 2.4, by attempting to estimate the contribution of
           variable SEX in lowering the prediction error of Q4 (“liking to be initiated”). For
           that purpose, we first note that if nothing is known about the sex, the best
           prediction of the Q4 outcome is the “agree” category, the so-called modal category,