Page 72 - Computational Statistics Handbook with MATLAB
plot(xr,logy,'x')
axis([0 1.1 2.4 4])
xlabel('Reciprocal of Drying Time')
ylabel('Log of Tensile Strength')
We now show how to get the covariance matrix and the correlation coefficient
for these two variables.
% Now get the covariance and
% the correlation coefficient.
cmat = cov(xr,logy);
cormat = corrcoef(xr,logy);
The results are:
cmat =
0.1020 -0.1169
-0.1169 0.1393
cormat =
1.0000 -0.9803
-0.9803 1.0000
Note that the sample correlation coefficient (Equation 3.12) is given by the
off-diagonal element of cormat, $\hat{\rho} = -0.9803$. We see that the variables are
negatively correlated, which is what we expect from Figure 3.1 (the log of the
tensile strength decreases with increasing reciprocal of drying time).
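As a quick check on how the two matrices relate, the correlation coefficient can be recovered from the covariance matrix by dividing the covariance by the product of the two standard deviations. The sketch below uses the rounded values of cmat printed above, so it only approximates the value in cormat; the variable name rho_hat is introduced here for illustration and is not part of the original example.

```matlab
% Values copied from the printed cmat (rounded to four decimals).
cmat = [ 0.1020 -0.1169
        -0.1169  0.1393];

% Correlation = covariance divided by the product of the
% standard deviations (square roots of the diagonal elements).
rho_hat = cmat(1,2)/sqrt(cmat(1,1)*cmat(2,2));
```

Because the inputs are rounded, rho_hat is approximately -0.98 and differs from the printed -0.9803 only in the later decimal places.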
3.3 Sampling Distributions
It was stated in the previous section that we sometimes use a statistic
calculated from a random sample as a point estimate of a population parameter.
For example, we might use $\bar{X}$ to estimate $\mu$ or use $S$ to estimate $\sigma$. Since we
are using a sample and not observing the entire population, there will be
some error in our estimate. In other words, it is unlikely that the statistic will
exactly equal the parameter. To manage the uncertainty and error in our estimate, we
must know the sampling distribution for the statistic. The sampling
distribution is the underlying probability distribution for a statistic. To understand
the remainder of the text, it is important to remember that a statistic is a
random variable.
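The idea that a statistic is a random variable can be seen by simulation. The following is a minimal sketch, not taken from the text: it draws many random samples from a normal population and computes the sample mean and sample standard deviation of each one. The values of mu, sigma, n, and M are hypothetical choices made only for illustration.

```matlab
% Hypothetical settings, chosen only for illustration.
mu = 5;         % assumed population mean
sigma = 2;      % assumed population standard deviation
n = 25;         % size of each random sample
M = 10000;      % number of repeated samples

% Each column of X is one random sample of size n
% from a normal population with mean mu and std sigma.
X = mu + sigma*randn(n,M);

% One value of each statistic per sample (1 x M vectors).
xbar = mean(X);     % sample means
s = std(X);         % sample standard deviations

% The statistics change from sample to sample. Summarize
% the simulated sampling distribution of the sample mean.
mean_xbar = mean(xbar);         % should be close to mu
se_xbar = std(xbar);            % spread of the sample means
theory_se = sigma/sqrt(n);      % known value for a normal population
```

Each entry of xbar differs from mu, which is the estimation error described above. For a normal population the standard deviation of the sample mean is sigma/sqrt(n), so se_xbar should be close to theory_se.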
The sampling distributions for many common statistics are known. For
example, if our random variable is from the normal distribution, then we
know how the sample mean is distributed (it is also normally distributed). Once we know the sampling
distribution of our statistic, we can perform statistical hypothesis tests and
calculate confidence intervals. If we do not know the distribution of our statistic,
© 2002 by Chapman & Hall/CRC