
5.14 Why Least Squares?

Proof.  It is clear that $\hat{\beta} = X^{\dagger} y$ is a linear estimator of $\beta$ because each component $\hat{\beta}_i = \sum_k [X^{\dagger}]_{ik}\, y_k$ is a linear function of the observations. The fact that $\hat{\beta}$ is unbiased follows by using the linear nature of expected value to write
$$E[y] = E[X\beta + \varepsilon] = E[X\beta] + E[\varepsilon] = X\beta + 0 = X\beta,$$
so that
$$E[\hat{\beta}] = E[X^{\dagger} y] = X^{\dagger} E[y] = X^{\dagger} X \beta = (X^T X)^{-1} X^T X \beta = \beta.$$
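The algebraic fact behind this unbiasedness computation is that $X^{\dagger}X = I_n$ whenever $X$ has full column rank. As a quick numerical illustration (not part of the text, using NumPy with arbitrary sizes, seed, noise level, and a made-up $\beta$), the following sketch checks this identity and estimates $E[X^{\dagger}y]$ by Monte Carlo.

```python
import numpy as np

# Illustrative sketch: for full-column-rank X (m > n), the pseudoinverse
# satisfies X^+ X = I_n, so E[X^+ y] = X^+ X beta = beta.
rng = np.random.default_rng(0)        # arbitrary seed, for reproducibility
m, n = 50, 3                          # arbitrary dimensions with m > n
X = rng.standard_normal((m, n))       # a generic X has full column rank
beta = np.array([2.0, -1.0, 0.5])     # "true" parameters, made up for the demo

X_pinv = np.linalg.pinv(X)            # X^+ = (X^T X)^{-1} X^T when rank(X) = n
print(np.allclose(X_pinv @ X, np.eye(n)))    # True: X^+ X = I_n

# Monte Carlo check of unbiasedness: average beta_hat over many noise draws
sigma = 0.3
est = np.mean([X_pinv @ (X @ beta + sigma * rng.standard_normal(m))
               for _ in range(20000)], axis=0)
print(est)                            # close to beta = [2, -1, 0.5]
```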
To argue that $\hat{\beta} = X^{\dagger} y$ has minimal variance among all linear unbiased estimators for $\beta$, let $\beta^{*}$ be an arbitrary linear unbiased estimator for $\beta$. Linearity of $\beta^{*}$ implies the existence of a matrix $L_{n \times m}$ such that $\beta^{*} = L y$, and unbiasedness insures $\beta = E[\beta^{*}] = E[Ly] = L\,E[y] = LX\beta$. We want $\beta = LX\beta$ to hold irrespective of the values of the components in $\beta$, so it must be the case that $LX = I_n$ (recall Exercise 3.5.5). For $i \neq j$ we have
$$0 = \operatorname{Cov}[\varepsilon_i, \varepsilon_j] = E[\varepsilon_i \varepsilon_j] - \mu_{\varepsilon_i}\mu_{\varepsilon_j} \;\Longrightarrow\; E[\varepsilon_i \varepsilon_j] = E[\varepsilon_i]\,E[\varepsilon_j] = 0,$$
so that
$$\operatorname{Cov}[y_i, y_j] =
\begin{cases}
E[(y_i - \mu_{y_i})^2] = E[\varepsilon_i^2] = \operatorname{Var}[\varepsilon_i] = \sigma^2 & \text{when } i = j,\\
E[(y_i - \mu_{y_i})(y_j - \mu_{y_j})] = E[\varepsilon_i \varepsilon_j] = 0 & \text{when } i \neq j.
\end{cases}
\tag{5.14.5}$$
This together with the fact that $\operatorname{Var}[aW + bZ] = a^2 \operatorname{Var}[W] + b^2 \operatorname{Var}[Z]$ whenever $\operatorname{Cov}[W, Z] = 0$ allows us to write
$$\operatorname{Var}[\beta_i^{*}] = \operatorname{Var}[L_{i*}\, y] = \operatorname{Var}\Bigl[\sum_{k=1}^{m} \ell_{ik}\, y_k\Bigr] = \sigma^2 \sum_{k=1}^{m} \ell_{ik}^2 = \sigma^2 \|L_{i*}\|_2^2 .$$
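This variance identity can be checked by simulation. The sketch below is my own illustration, not part of the text: it builds one arbitrary left inverse $L$ of $X$ (any matrix with $LX = I$ would do), estimates the sample variance of each component of $Ly$, and compares it with $\sigma^2$ times the squared row norms of $L$.

```python
import numpy as np

# Sketch: for beta* = L y with iid errors of variance sigma^2,
# Var[beta*_i] = sigma^2 * ||L_{i,:}||_2^2.  L is one arbitrary left inverse.
rng = np.random.default_rng(1)
m, n, sigma = 60, 3, 0.5
X = rng.standard_normal((m, n))
beta = np.array([1.0, 2.0, -3.0])     # made-up parameters for the demo
Xp = np.linalg.pinv(X)

# Perturb X^+ by something annihilated by X on the right, so L X = I still holds.
L = Xp + 0.1 * rng.standard_normal((n, m)) @ (np.eye(m) - X @ Xp)
print(np.allclose(L @ X, np.eye(n)))           # True: L is a left inverse of X

samples = np.array([L @ (X @ beta + sigma * rng.standard_normal(m))
                    for _ in range(50000)])
print(samples.var(axis=0))                      # empirical Var[beta*_i]
print(sigma**2 * (L**2).sum(axis=1))            # sigma^2 * ||L_{i,:}||^2, approx equal
```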
Since $LX = I$, it follows that $\operatorname{Var}[\beta_i^{*}]$ is minimal if and only if $L_{i*}$ is the minimum norm solution of the system $z^T X = e_i^T$. We know from (5.12.17) that the (unique) minimum norm solution is given by $z^T = e_i^T X^{\dagger} = X^{\dagger}_{i*}$, so $\operatorname{Var}[\beta_i^{*}]$ is minimal if and only if $L_{i*} = X^{\dagger}_{i*}$. Since this holds for $i = 1, 2, \ldots, n$, it follows that $L = X^{\dagger}$. In other words, the components of $\hat{\beta} = X^{\dagger} y$ are the (unique) minimal variance linear unbiased estimators for the parameters in $\beta$.
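To see the conclusion at work numerically, the hedged sketch below (my own illustration with arbitrary dimensions) compares the row norms of $X^{\dagger}$ with those of a perturbed left inverse $L = X^{\dagger} + N(I - XX^{\dagger})$: every row of $X^{\dagger}$ has the smaller norm, so by the variance formula above each component of $X^{\dagger}y$ has variance no larger than the corresponding component of $Ly$.

```python
import numpy as np

# Sketch of the Gauss-Markov conclusion: among left inverses L of X, the rows
# of X^+ have minimum 2-norm, hence X^+ y has the smallest component variances.
rng = np.random.default_rng(2)
m, n = 40, 4                           # arbitrary demo dimensions
X = rng.standard_normal((m, n))
Xp = np.linalg.pinv(X)

N = rng.standard_normal((n, m))
L = Xp + N @ (np.eye(m) - X @ Xp)      # LX = I, but L differs from X^+
print(np.allclose(L @ X, np.eye(n)))   # True

row_norms_pinv = np.linalg.norm(Xp, axis=1)
row_norms_L = np.linalg.norm(L, axis=1)
print(np.all(row_norms_pinv <= row_norms_L + 1e-12))   # True: rows of X^+ are minimal
# Var[beta_hat_i] = sigma^2 ||X^+_{i,:}||^2  <=  sigma^2 ||L_{i,:}||^2 = Var[beta*_i]
```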
                   Exercises for section 5.14
5.14.1. For a matrix $Z_{m \times n} = [z_{ij}]$ of random variables, $E[Z]$ is defined to be the $m \times n$ matrix whose $(i,j)$-entry is $E[z_{ij}]$. Consider the standard linear model described in (5.14.4), and let $\hat{e}$ denote the vector of random variables defined by $\hat{e} = y - X\hat{\beta}$ in which $\hat{\beta} = (X^T X)^{-1} X^T y = X^{\dagger} y$. Demonstrate that
$$\hat{\sigma}^2 = \frac{\hat{e}^T \hat{e}}{m - n}$$
is an unbiased estimator for $\sigma^2$. Hint: $d^T c = \operatorname{trace}(cd^T)$ for column vectors $c$ and $d$, and, by virtue of Exercise 5.9.13,
$$\operatorname{trace}\bigl(I - XX^{\dagger}\bigr) = m - \operatorname{trace}\bigl(XX^{\dagger}\bigr) = m - \operatorname{rank}\bigl(XX^{\dagger}\bigr) = m - n.$$
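The claim in the exercise can also be checked empirically. The sketch below is an illustration only (NumPy, with arbitrary sizes, noise level, and a random $\beta$): averaging $\hat{e}^T\hat{e}/(m-n)$ over many simulated error vectors should come out close to $\sigma^2$.

```python
import numpy as np

# Sketch for Exercise 5.14.1: sigma_hat^2 = e_hat^T e_hat / (m - n) is an
# unbiased estimator of sigma^2.  All constants are arbitrary demo choices.
rng = np.random.default_rng(3)
m, n, sigma = 30, 4, 1.5
X = rng.standard_normal((m, n))
beta = rng.standard_normal(n)
X_pinv = np.linalg.pinv(X)

def sigma_hat_sq():
    y = X @ beta + sigma * rng.standard_normal(m)
    e_hat = y - X @ (X_pinv @ y)            # residual e_hat = y - X beta_hat
    return (e_hat @ e_hat) / (m - n)

print(np.mean([sigma_hat_sq() for _ in range(50000)]))   # approx sigma^2 = 2.25
```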