Page 19 - Statistics for Environmental Engineers
The Average, Variance, and Standard Deviation
We distinguish between a quantity that represents a population and a quantity that represents a sample.
A statistic is a realized quantity calculated from data that are taken to represent a population. A parameter
is an idealized quantity associated with the population. Parameters cannot be measured directly unless
the entire population can be observed. Therefore, parameters are estimated by statistics. Parameters are
usually designated by Greek letters (α, β, γ, etc.) and statistics by Roman letters (a, b, c, etc.). Parameters
are constants (often unknown in value) and statistics are random variables computed from data.
Given a population consisting of a very large set of N observations from which the sample is drawn, the
population mean is η:
$$\eta = \frac{\sum y_i}{N}$$
where $y_i$ is an observation. The summation, indicated by ∑, is over the population of N observations. We
can also say that the mean of the population is the expected value of y, which is written as E(y) = η,
when N is very large.
The sample of n observations actually available from the population is used to calculate the sample
average:
$$\bar{y} = \frac{1}{n} \sum y_i$$
which estimates the mean η.
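The distinction between the parameter η and the statistic that estimates it can be sketched numerically. The example below is only an illustration: the ten-value population, the sample size of 4, and the helper name `mean` are assumptions made here, not data or notation from the text.

```python
import random

# Hypothetical finite population of N = 10 values (illustrative only).
population = [4.0, 7.0, 5.0, 6.0, 8.0, 5.0, 6.0, 7.0, 4.0, 8.0]

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

# Parameter: the population mean, computed over all N observations.
eta = mean(population)

# Statistic: the sample average, computed from n = 4 observations
# drawn without replacement. The seed is fixed only for repeatability.
rng = random.Random(0)
sample = rng.sample(population, 4)
y_bar = mean(sample)

print("population mean eta =", eta)
print("sample average y_bar =", y_bar)
```

Changing the seed gives a different sample and, in general, a different sample average, while η stays fixed. That is the sense in which a statistic is a random variable and a parameter is a constant.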
The variance of the population is denoted by $\sigma^2$. The measure of how far any particular observation
is from the mean η is $y_i - \eta$. The variance is the mean value of the square of such deviations taken over
the whole population:
$$\sigma^2 = \frac{\sum (y_i - \eta)^2}{N}$$
The standard deviation of the population is a measure of spread that has the same units as the original
measurements and as the mean. The standard deviation is the square root of the variance:
$$\sigma = \sqrt{\frac{\sum (y_i - \eta)^2}{N}}$$
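As a small worked illustration of the population variance and standard deviation, the sketch below treats a hypothetical ten-value list as the entire population (an assumption made here; real populations are rarely observed in full). It computes both quantities directly from their definitions and compares them with `statistics.pvariance` and `statistics.pstdev` from the Python standard library.

```python
import math
import statistics

# Hypothetical population of N = 10 values, assumed to be fully observed.
population = [4.0, 7.0, 5.0, 6.0, 8.0, 5.0, 6.0, 7.0, 4.0, 8.0]
N = len(population)

# Population mean.
eta = sum(population) / N

# Population variance: mean of the squared deviations from eta.
sigma2 = sum((y - eta) ** 2 for y in population) / N

# Population standard deviation: square root of the variance,
# expressed in the same units as the observations.
sigma = math.sqrt(sigma2)

print("eta =", eta)
print("sigma^2 =", sigma2, "| library:", statistics.pvariance(population))
print("sigma =", sigma, "| library:", statistics.pstdev(population))
```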
The true values of the population parameters σ and $\sigma^2$ are often unknown to the experimenter. The
population variance can be estimated by the sample variance:
$$s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$$
where n is the size of the sample and $\bar{y}$ is the sample average. The sample standard deviation is the
square root of the sample variance:
$$s = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$$
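The sample variance and sample standard deviation can be checked with a short calculation. The four-value sample below is hypothetical, chosen here only to keep the arithmetic simple. The results are compared with `statistics.variance` and `statistics.stdev` from the Python standard library, which also divide by n − 1.

```python
import math
import statistics

# Hypothetical sample of n = 4 observations (illustrative only).
sample = [7.0, 5.0, 8.0, 4.0]
n = len(sample)

# Sample average.
y_bar = sum(sample) / n

# Deviations of each observation from the sample average.
deviations = [y - y_bar for y in sample]

# Sample variance: sum of squared deviations divided by n - 1.
s2 = sum(d ** 2 for d in deviations) / (n - 1)

# Sample standard deviation: square root of the sample variance.
s = math.sqrt(s2)

# For comparison only: dividing by n instead of n - 1 gives a smaller value.
s2_divide_by_n = sum(d ** 2 for d in deviations) / n

print("y_bar =", y_bar)
print("s^2 =", s2, "| library:", statistics.variance(sample))
print("s =", s, "| library:", statistics.stdev(sample))
print("sum of squares divided by n =", s2_divide_by_n)
```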
Here the denominator is n − 1 rather than n. The n − 1 represents the degrees of freedom of the sample.
One degree of freedom (the –1) is consumed because the average must be calculated to estimate s. The
deviations of n observations from their sample average must sum exactly to zero. This implies that any
© 2002 By CRC Press LLC