Page 72 - Computational Statistics Handbook with MATLAB


                                plot(xr,logy,'x')
                                axis([0 1.1 2.4 4])
                                xlabel('Reciprocal of Drying Time')
                                ylabel('Log of Tensile Strength')
                             We now show how to get the covariance matrix and the correlation coefficient
                             for these two variables.

                                % Now get the covariance and
                                % the correlation coefficient.
                                cmat = cov(xr,logy);
                                cormat = corrcoef(xr,logy);
                             The results are:
                                cmat =
                                    0.1020   -0.1169
                                   -0.1169    0.1393
                                cormat =
                                    1.0000   -0.9803
                                   -0.9803    1.0000
                             Note that the sample correlation coefficient (Equation 3.12) is given by the
                             off-diagonal element of cormat, $\hat{\rho} = -0.9803$. We see that the variables are
                             negatively correlated, which is what we expect from Figure 3.1 (the log of the
                             tensile strength decreases with increasing reciprocal of drying time).
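
The link between the two matrices can be checked directly: the sample correlation coefficient is the off-diagonal covariance divided by the product of the two sample standard deviations. Applied to the rounded values printed above, -0.1169/sqrt(0.1020*0.1393) gives about -0.98, in line with cormat. The sketch below repeats the check on a small made-up data set; the vectors x and y and the name rhohat are assumptions for illustration and are not the tensile-strength data used in the text.

```matlab
% Hypothetical data for illustration only
% (NOT the tensile-strength data from the text).
x = [0.2 0.4 0.5 0.7 0.9 1.0]';
y = [3.9 3.6 3.4 3.1 2.8 2.6]';

% 2-by-2 sample covariance and correlation matrices.
cmat = cov(x,y);
cormat = corrcoef(x,y);

% Correlation = covariance scaled by the two standard deviations.
rhohat = cmat(1,2)/sqrt(cmat(1,1)*cmat(2,2));

% The two values should differ only by round-off error.
disp(abs(rhohat - cormat(1,2)))
```

Because y decreases as x increases in this made-up example, rhohat is negative, just as in the drying-time example.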







                             3.3 Sampling Distributions
                             It was stated in the previous section that we sometimes use a statistic
                             calculated from a random sample as a point estimate of a population parameter.
                             For example, we might use $\bar{X}$ to estimate $\mu$ or use $S$ to estimate $\sigma$. Since we
                             are using a sample and not observing the entire population, there will be
                             some error in our estimate. In other words, it is unlikely that the statistic will
                             equal the parameter. To manage the uncertainty and error in our estimate, we
                             must know the sampling distribution for the statistic. The sampling
                             distribution is the underlying probability distribution for a statistic. To understand
                             the remainder of the text, it is important to remember that a statistic is a
                             random variable.
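
A small simulation can make the idea of a statistic as a random variable concrete. The sketch below draws many random samples from a normal population and computes the sample mean of each, so the spread of those means approximates the sampling distribution of the sample mean. The values of mu, sigma, n, and M are assumptions chosen for illustration, not values from the text. For a normal population, the sample mean has mean mu and standard deviation sigma/sqrt(n), which the simulated values should roughly match.

```matlab
% Assumed values for illustration only (not from the text).
mu = 5;       % population mean
sigma = 2;    % population standard deviation
n = 25;       % size of each random sample
M = 5000;     % number of repeated samples

% Each column of X is one random sample of size n.
X = mu + sigma*randn(n,M);

% One sample mean per column: M realizations of the statistic.
xbar = mean(X,1);

% Compare the simulated results with the theoretical values.
fprintf('Mean of sample means: %.4f (mu = %g)\n', mean(xbar), mu);
fprintf('Std of sample means:  %.4f (sigma/sqrt(n) = %.4f)\n', ...
    std(xbar), sigma/sqrt(n));
```

No single value of xbar equals mu exactly, yet the values cluster around it. A histogram of xbar would show the roughly bell-shaped sampling distribution.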
                              The sampling distributions for many common statistics are known. For
                             example, if our random variable is from the normal distribution, then we
                             know how the sample mean is distributed. Once we know the sampling
                             distribution of our statistic, we can perform statistical hypothesis tests and
                             calculate confidence intervals. If we do not know the distribution of our statistic,


                            © 2002 by Chapman & Hall/CRC