Page 88 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

2.3 Summarising the Data   67


           where $s_{XY}$, the sample covariance of X and Y, is computed as:

              \[ s_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}. \tag{2.19} \]
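As a minimal sketch of formula 2.19, the following Python function computes the sample covariance directly from its definition. The function name `sample_covariance` and the data values are hypothetical, chosen only for illustration; they do not come from the book's datasets.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (formula 2.19)."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same number of cases")
    if n < 2:
        raise ValueError("at least two cases are needed")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means, divided by n - 1.
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / (n - 1)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]
s_xy = sample_covariance(x, y)
print(s_xy)
```

Passing the same variable twice, as in `sample_covariance(x, x)`, yields the sample variance of that variable, since the cross-products then become squared deviations.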

              Note that the correlation coefficient (also known as the Pearson correlation)
           is a dimensionless measure of the degree of linear association of two random
           variables, taking values in the interval [−1, 1], where:

               0 :   No linear association (X and Y are linearly uncorrelated);
               1 :   Total linear association, with X and Y varying in the same direction;
              −1:   Total linear association, with X and Y varying in the opposite direction.
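The two extreme values can be checked numerically. The sketch below assumes the usual definition of the Pearson coefficient as the sample covariance divided by the product of the two sample standard deviations; the function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: sample covariance over the product of sample SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical data: y is an exact linear function of x in both cases.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_same = pearson_r(x, [2 * a + 1 for a in x])        # same direction
r_opposite = pearson_r(x, [-3 * a + 10 for a in x])  # opposite direction
print(r_same, r_opposite)
```

Up to floating-point rounding, `r_same` equals 1 and `r_opposite` equals −1, matching the two cases of total linear association listed above.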

              Figure 2.26 shows scatter plots exemplifying several situations of correlation.
           Figure 2.26f illustrates a situation where, although there is an evident association
           between X and Y, the correlation coefficient fails to measure it since X and Y are
           not linearly associated.
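A simple numerical illustration of this limitation, which is not necessarily the situation drawn in Figure 2.26f, is a parabola sampled symmetrically around zero: Y is completely determined by X, yet the coefficient vanishes. The function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: sample covariance over the product of sample SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical data: y = x^2 on a grid that is symmetric around zero.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [a ** 2 for a in x]
r_parabola = pearson_r(x, y)
print(r_parabola)
```

The positive and negative cross-products cancel exactly here, so the coefficient is zero even though the association between the two variables is perfect.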
              Note that, as described in Appendix A (section A.8.2), adding a constant to,
           or multiplying by a non-zero constant, either or both variables does not change
           the magnitude of the correlation coefficient. Only a change of sign can occur,
           namely when exactly one of the multiplying constants is negative.
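This invariance can be verified with a short numerical check. The sketch below uses hypothetical data and arbitrary, non-zero constants; `pearson_r` is an illustrative helper, not a function from any of the packages covered in the book.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: sample covariance over the product of sample SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Positive multipliers and added constants: the coefficient is unchanged.
r_rescaled = pearson_r([3 * a + 5 for a in x], [0.5 * b - 2 for b in y])
# One negative multiplier: same magnitude, opposite sign.
r_flipped = pearson_r([-3 * a + 5 for a in x], y)
print(r_original, r_rescaled, r_flipped)
```

The added constants drop out because the coefficient depends only on deviations from the means, and the multiplying constants cancel between the covariance and the standard deviations, leaving at most a sign.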
              The correlation coefficients  can  be arranged, in  general, into a symmetrical
           correlation matrix,  where  each element is the correlation coefficient of the
           respective column and row variables.

           Table 2.9. Correlation matrix of five variables of the cork stopper dataset.

                              N        ART        PRT       ARTG        PRTG
             N              1.00       0.80        0.89       0.68       0.72
             ART            0.80       1.00        0.98       0.96       0.97
             PRT            0.89       0.98        1.00       0.91       0.93
             ARTG           0.68       0.96        0.91       1.00       0.99
             PRTG           0.72       0.97        0.93       0.99       1.00
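The arrangement of pairwise coefficients into a symmetric matrix can be sketched as follows. The variable names A, B and C and their values are hypothetical stand-ins, not the cork stopper data, and the helper `pearson_r` is assumed for illustration; it is not the book's Commands 2.9.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: sample covariance over the product of sample SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical dataset: three variables measured on the same five cases.
data = {
    "A": [10.0, 12.0, 15.0, 11.0, 18.0],
    "B": [31.0, 35.0, 47.0, 30.0, 52.0],
    "C": [5.0, 3.0, 4.0, 6.0, 2.0],
}
names = list(data)

# Element (i, j) is the correlation of row variable i with column variable j.
matrix = [[pearson_r(data[r], data[c]) for c in names] for r in names]

# Print in the same layout as a correlation table.
print("    " + "".join(f"{name:>8}" for name in names))
for name, row in zip(names, matrix):
    print(f"{name:<4}" + "".join(f"{value:8.2f}" for value in row))
```

Because the coefficient of X with Y equals that of Y with X, only the upper or lower triangle needs to be computed in practice; the full matrix is built here for clarity.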

           Example 2.7

           Q: Compute the correlation matrix of the following five variables of the Cork
           Stoppers’ dataset: N, ART, PRT, ARTG, PRTG.
           A: Table 2.9 shows the (symmetric) correlation matrix corresponding to the five
           variables of the Cork Stoppers’ dataset (see Commands 2.9). Notice that the main
           diagonal elements (from the upper left corner to the lower right corner) are all
           equal to one, since each variable is perfectly correlated with itself. In a later
           chapter, we will learn how to correctly interpret the correlation values displayed.