where $\varepsilon_i$ is a random variable accounting for the $i$th observation (or measurement) error.$^{63}$ It is generally valid to assume that observation errors are not
correlated with each other but have a common variance (not necessarily known)
and a zero mean. In other words, we assume that
$$ E[\varepsilon_i] = 0 \ \text{for each}\ i \qquad\text{and}\qquad \operatorname{Cov}[\varepsilon_i,\varepsilon_j] = \begin{cases} \sigma^2 & \text{when } i = j, \\ 0 & \text{when } i \neq j. \end{cases} $$
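A brief numerical illustration of these assumptions (a sketch added here, not part of the text): the Python snippet below draws uncorrelated errors with zero mean and a common variance and checks that the sample mean and sample covariance are near $\mathbf{0}$ and $\sigma^2 I$; the value $\sigma = 0.5$, the dimensions, and the variable names are arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, m, trials = 0.5, 4, 100_000          # illustrative values, not from the text

# Each column of eps is one realization of the m-component error vector.
eps = rng.normal(loc=0.0, scale=sigma, size=(m, trials))

print(eps.mean(axis=1))                     # each entry should be close to E[eps_i] = 0
print(np.cov(eps))                          # should be close to sigma**2 * I (off-diagonals near 0)
```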
If
$$ y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}, \quad X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_m \end{bmatrix}, $$
then the equations in (5.14.3) can be written as $y = X_{m\times n}\beta + \varepsilon$. In practice, the points $X_{i*}$ at which observations $y_i$ are made can almost always be selected to ensure that $\operatorname{rank}(X_{m\times n}) = n$, so the complete statement of the standard linear model is
$$ y = X_{m\times n}\beta + \varepsilon \quad\text{such that}\quad \begin{cases} \operatorname{rank}(X) = n, \\ E[\varepsilon] = \mathbf{0}, \\ \operatorname{Cov}[\varepsilon] = \sigma^2 I, \end{cases} \tag{5.14.4} $$
where we have adopted the conventions
$$ E[\varepsilon] = \begin{bmatrix} E[\varepsilon_1] \\ E[\varepsilon_2] \\ \vdots \\ E[\varepsilon_m] \end{bmatrix} \qquad\text{and}\qquad \operatorname{Cov}[\varepsilon] = \begin{bmatrix} \operatorname{Cov}[\varepsilon_1,\varepsilon_1] & \operatorname{Cov}[\varepsilon_1,\varepsilon_2] & \cdots & \operatorname{Cov}[\varepsilon_1,\varepsilon_m] \\ \operatorname{Cov}[\varepsilon_2,\varepsilon_1] & \operatorname{Cov}[\varepsilon_2,\varepsilon_2] & \cdots & \operatorname{Cov}[\varepsilon_2,\varepsilon_m] \\ \vdots & \vdots & & \vdots \\ \operatorname{Cov}[\varepsilon_m,\varepsilon_1] & \operatorname{Cov}[\varepsilon_m,\varepsilon_2] & \cdots & \operatorname{Cov}[\varepsilon_m,\varepsilon_m] \end{bmatrix}. $$
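To make the notation concrete, here is a small Python sketch (an illustration under assumed sizes, not taken from the text) that builds a design matrix with full column rank, generates observations according to $y = X\beta + \varepsilon$ with $E[\varepsilon] = \mathbf{0}$ and $\operatorname{Cov}[\varepsilon] = \sigma^2 I$, and verifies the rank condition in (5.14.4); the parameter vector and dimensions are invented for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma = 50, 3, 0.1                      # illustrative sizes and noise level

X = rng.standard_normal((m, n))               # design matrix; a generic random X has rank n
beta_true = np.array([2.0, -1.0, 0.5])        # hypothetical "true" parameters for the demo
eps = sigma * rng.standard_normal(m)          # errors with zero mean and covariance sigma^2 I

y = X @ beta_true + eps                       # observations from the standard linear model
assert np.linalg.matrix_rank(X) == n          # rank(X) = n, as required in (5.14.4)
```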
The problem is to determine the best (minimum variance) linear (linear function of the $y_i$'s) unbiased estimators for the components of $\beta$. Gauss realized in 1821 that this is precisely what the least squares solution provides.
Gauss–Markov Theorem
For the standard linear model (5.14.4), the minimum variance linear unbiased estimator for $\beta_i$ is given by the $i$th component $\hat{\beta}_i$ of the vector $\hat{\beta} = (X^T X)^{-1} X^T y = X^{\dagger} y$. In other words, the best linear unbiased estimator for $\beta$ is the least squares solution of $X\beta = y$.
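A minimal numerical sketch of the estimator named in the theorem (the simulated data and all values are assumptions made for the demonstration): it computes $\hat{\beta}$ from the normal equations as $(X^T X)^{-1} X^T y$, from the pseudoinverse as $X^{\dagger} y$, and with a standard least squares solver, and confirms that the three agree.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, sigma = 200, 3, 0.1                          # illustrative sizes and noise level
X = rng.standard_normal((m, n))                    # full-column-rank design matrix
beta = np.array([2.0, -1.0, 0.5])                  # hypothetical true parameters
y = X @ beta + sigma * rng.standard_normal(m)      # y = X beta + eps

# Three equivalent routes to the least squares solution beta_hat:
bh_normal = np.linalg.solve(X.T @ X, X.T @ y)      # (X^T X)^{-1} X^T y via the normal equations
bh_pinv   = np.linalg.pinv(X) @ y                  # X^dagger y, the pseudoinverse form
bh_lstsq  = np.linalg.lstsq(X, y, rcond=None)[0]   # library least squares solver

assert np.allclose(bh_normal, bh_pinv) and np.allclose(bh_normal, bh_lstsq)
print(bh_normal)                                   # unbiased estimate; close to beta when sigma is small
```

For large or ill-conditioned problems, the solver and pseudoinverse routes are numerically preferable to forming $X^T X$ explicitly, although all three coincide in exact arithmetic when $\operatorname{rank}(X) = n$.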
63. In addition to observation and measurement errors, other errors such as modeling errors or those induced by imposing simplifying assumptions produce the same kind of equation; recall the discussion of ice cream on p. 228.

