Page 19 - Statistics for Environmental Engineers

                       The Average, Variance, and Standard Deviation
                       We distinguish between a quantity that represents a population and a quantity that represents a sample.
                       A statistic is a realized quantity calculated from data that are taken to represent a population. A parameter
                       is an idealized quantity associated with the population. Parameters cannot be measured directly unless
                       the entire population can be observed. Therefore, parameters are estimated by statistics. Parameters are
                       usually designated by Greek letters (α, β, γ, etc.) and statistics by Roman letters (a, b, c, etc.). Parameters
                       are constants (often unknown in value) and statistics are random variables computed from data.
                        Given a population consisting of a very large set of N observations, from which the sample is drawn, the
                       population mean is η:

                                                           $$\eta = \frac{\sum y_i}{N}$$
                       where $y_i$ is an observation. The summation, indicated by ∑, is over the population of N observations. We
                       can also say that the mean of the population is the expected value of y, which is written as E(y) = η,
                       when N is very large.
                        The sample of n observations actually available from the population is used to calculate the sample
                       average:

                                                           $$\bar{y} = \frac{1}{n} \sum y_i$$
                       which estimates the mean η.
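
As a minimal sketch of the distinction between the population mean η and the sample average, the following Python snippet builds a hypothetical population and draws one sample from it. The population values, the seed, the sample size, and the function names are assumptions made for illustration only; they do not come from the text.

```python
import random


def population_mean(values):
    """eta = sum(y_i) / N, taken over the whole population."""
    return sum(values) / len(values)


def sample_average(sample):
    """y_bar = (1/n) * sum(y_i), taken over the n observations in hand."""
    return sum(sample) / len(sample)


random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of N values (illustrative, not from the text).
N = 100_000
population = [random.gauss(10.0, 2.0) for _ in range(N)]

# A sample of n observations drawn without replacement from the population.
n = 25
sample = random.sample(population, n)

print(f"population mean eta  = {population_mean(population):.3f}")
print(f"sample average y_bar = {sample_average(sample):.3f}")
```

The two functions perform the same arithmetic. What differs is the data they are applied to: η is a fixed constant of the population, while the sample average changes from one sample to the next.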
                        The variance of the population is denoted by $\sigma^2$. The measure of how far any particular observation
                       is from the mean η is $y_i - \eta$. The variance is the mean value of the square of such deviations taken over
                       the whole population:

                                                             ∑ y i η)  2
                                                              (
                                                                 –
                                                        σ =  -------------------------
                                                         2
                                                                N
                       The standard deviation of the population is a measure of spread that has the same units as the original
                       measurements and as the mean. The standard deviation is the square root of the variance:
                                                              (
                                                             ∑ y i η) 2
                                                                 –
                                                       σ =   -------------------------
                                                                N
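
The population variance and standard deviation defined above can be checked on a small data set. This is a sketch only: the eight values are hypothetical and are treated as if they were the entire population, and the function names are illustrative.

```python
import math
import statistics


def population_variance(values):
    """sigma^2 = sum((y_i - eta)^2) / N, with eta the population mean."""
    N = len(values)
    eta = sum(values) / N
    return sum((y - eta) ** 2 for y in values) / N


def population_std(values):
    """sigma = square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical data, treated here as the whole population (N = 8).
population = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(population_variance(population))  # 4.0
print(population_std(population))       # 2.0

# The standard library uses the same divisor N for its "p" functions.
print(statistics.pvariance(population))  # 4.0
```

Here the mean is 5, the squared deviations sum to 32, and 32/8 = 4, so the standard deviation is 2 in the same units as the data.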
                       The true values of the population parameters σ and $\sigma^2$ are often unknown to the experimenter. They
                       can be estimated by the sample variance:
                                                              (
                                                        s =  ∑ y i – y)  2
                                                         2
                                                            ------------------------
                                                              n 1
                                                                –
                       where n is the size of the sample and $\bar{y}$ is the sample average. The sample standard deviation is the
                       square root of the sample variance:
                                                             ∑ y i –(  y) 2
                                                        s =  ------------------------
                                                                –
                                                               n 1
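
A short sketch of the sample statistics follows, again using hypothetical values, this time treated as a sample of n = 8 observations rather than a whole population. The function names are illustrative. The results are compared with Python's `statistics.variance` and `statistics.stdev`, which also divide by n − 1.

```python
import math
import statistics


def sample_variance(sample):
    """s^2 = sum((y_i - y_bar)^2) / (n - 1), with y_bar the sample average."""
    n = len(sample)
    y_bar = sum(sample) / n
    return sum((y - y_bar) ** 2 for y in sample) / (n - 1)


def sample_std(sample):
    """s = square root of the sample variance."""
    return math.sqrt(sample_variance(sample))


# Hypothetical sample of n = 8 observations (illustrative, not from the text).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s2 = sample_variance(sample)
s = sample_std(sample)
print(f"s^2 = {s2:.4f}, s = {s:.4f}")

# Cross-check against the standard library, which uses the same n - 1 divisor.
print(math.isclose(s2, statistics.variance(sample)))
print(math.isclose(s, statistics.stdev(sample)))
```

For these values the squared deviations sum to 32, so the sample variance is 32/7, slightly larger than the 32/8 that a divisor of n would give.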
                       Here the denominator is n − 1 rather than n. The n − 1 represents the degrees of freedom of the sample.
                       One degree of freedom (the –1) is consumed because the average must be calculated to estimate s. The
                       deviations of n observations from their sample average must sum exactly to zero. This implies that any
                       © 2002 By CRC Press LLC