Page 264 - Applied statistics and probability for engineers


242 Chapter 7/Point Estimation of Parameters and Sampling Distributions

average fill volume to be $\bar{x} = 298.8$ milliliters. The engineer will probably decide that the
population mean is $\mu = 300$ milliliters even though the sample mean was 298.8 milliliters because he or
she knows that the sample mean is a reasonable estimate of $\mu$ and that a sample mean of 298.8
milliliters is very likely to occur even if the true population mean is $\mu = 300$ milliliters. In fact, if
the true mean is 300 milliliters, tests of 25 containers made repeatedly, perhaps every five
minutes, would produce values of $\bar{x}$ that vary both above and below $\mu = 300$ milliliters.
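The repeated tests of 25 containers described above can be imitated with a short simulation. This is only a sketch: the normal model for fill volume and the standard deviation of 3 milliliters are assumptions made for illustration (the text gives neither), and the function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(mu=300.0, sigma=3.0, n=25, repeats=1000, seed=1):
    """Draw `repeats` samples of size n and return the sample mean of each.

    Assumption: fill volumes are normally distributed with mean mu and
    standard deviation sigma. sigma = 3.0 is a hypothetical value.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means()
below = sum(m < 300.0 for m in means)
above = sum(m > 300.0 for m in means)
print(f"sample means below 300: {below}, above 300: {above}")
print(f"smallest: {min(means):.2f}, largest: {max(means):.2f}")
```

Under these assumptions the sample means scatter on both sides of 300 milliliters, which is why a single observed value such as 298.8 does not by itself contradict a true mean of 300.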
The link between the probability models in the earlier chapters and the data is made as
follows. Each numerical value in the data is the observed value of a random variable. Further-
more, the random variables are usually assumed to be independent and identically distributed.
These random variables are known as a random sample.
Random Sample
The random variables $X_1, X_2, \ldots, X_n$ are a random sample of size $n$ if (a) the $X_i$'s are
independent random variables and (b) every $X_i$ has the same probability distribution.

The observed data are also referred to as a random sample, but the use of the same phrase
should not cause any confusion.
The assumption of a random sample is extremely important. If the sample is not random
and is based on judgment or is flawed in some other way, statistical methods will not work
properly and will lead to incorrect decisions.
The primary purpose in taking a random sample is to obtain information about the unknown
population parameters. Suppose, for example, that we wish to reach a conclusion about the
proportion of people in the United States who prefer a particular brand of soft drink. Let p rep-
resent the unknown value of this proportion. It is impractical to question every individual in the
population to determine the true value of p. To make an inference regarding the true proportion
p, a more reasonable procedure would be to select a random sample (of an appropriate size)
and use the observed proportion $\hat{p}$ of people in this sample favoring the brand of soft drink.
The sample proportion, $\hat{p}$, is computed by dividing the number of individuals in the
sample who prefer the brand of soft drink by the total sample size n. Thus, $\hat{p}$ is a function of the
observed values in the random sample. Because many random samples are possible from a
population, the value of $\hat{p}$ will vary from sample to sample. That is, $\hat{p}$ is a random variable.
Such a random variable is called a statistic.
Statistic
A statistic is any function of the observations in a random sample.
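To make the soft drink example concrete, the sketch below computes the sample proportion for several simulated random samples. The true proportion of 0.4 and the sample size of 200 are hypothetical values chosen only for illustration (in practice p is unknown), and the helper names are not from the text.

```python
import random


def sample_proportion(responses):
    """Return p-hat: the fraction of responses that are True."""
    return sum(responses) / len(responses)


def draw_sample(p, n, rng):
    """Simulate n independent yes/no answers, each 'yes' with probability p."""
    return [rng.random() < p for _ in range(n)]


rng = random.Random(7)
true_p = 0.4  # hypothetical; unknown in a real survey
n = 200       # hypothetical sample size

# Five different random samples give five values of p-hat.
p_hats = [sample_proportion(draw_sample(true_p, n, rng)) for _ in range(5)]
print(p_hats)
```

Each value in `p_hats` is a function of one sample's observations only, so the list illustrates how a statistic varies from sample to sample.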

We have encountered statistics before. For example, if $X_1, X_2, \ldots, X_n$ is a random sample of size
n, the sample mean $\bar{X}$, the sample variance $S^2$, and the sample standard deviation $S$ are
statistics. Because a statistic is a random variable, it has a probability distribution.
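As a brief illustration, the three statistics just named can be computed from one set of observed values as follows. The sketch assumes the usual definition of the sample variance with divisor n - 1; the data values and the function name `sample_statistics` are hypothetical.

```python
import math


def sample_statistics(observations):
    """Return (sample mean, sample variance, sample standard deviation).

    Assumption: the sample variance uses the divisor n - 1.
    """
    n = len(observations)
    mean = sum(observations) / n
    variance = sum((x - mean) ** 2 for x in observations) / (n - 1)
    return mean, variance, math.sqrt(variance)


# Hypothetical fill volumes in milliliters, for illustration only.
data = [299.1, 298.4, 300.2, 297.9, 298.4]
x_bar, s_squared, s = sample_statistics(data)
print(f"mean = {x_bar:.2f}, variance = {s_squared:.3f}, std dev = {s:.3f}")
```

A different random sample would produce different values of all three quantities, which is the sense in which each one is a random variable with its own probability distribution.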
Sampling Distribution
The probability distribution of a statistic is called a sampling distribution.

For example, the probability distribution of $\bar{X}$ is called the sampling distribution of the
mean. The sampling distribution of a statistic depends on the distribution of the population,
the size of the sample, and the method of sample selection. We now present perhaps the most
important sampling distribution. Other sampling distributions and their applications will be
illustrated extensively in the following two chapters.
Consider determining the sampling distribution of the sample mean $\bar{X}$. Suppose that a
random sample of size n is taken from a normal population with mean $\mu$ and variance $\sigma^2$.
Now each observation in this sample, say, $X_1, X_2, \ldots, X_n$, is a normally and independently
