

passing through the data points, let $\boldsymbol{\varepsilon} = \mathbf{A}\mathbf{x} - \mathbf{b}$, and compute the sum of the
squares of the errors to be
$$\sum_{i=1}^{m}\varepsilon_i^{2} \;=\; \boldsymbol{\varepsilon}^{T}\boldsymbol{\varepsilon} \;=\; (\mathbf{A}\mathbf{x}-\mathbf{b})^{T}(\mathbf{A}\mathbf{x}-\mathbf{b}).$$
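As a quick numerical illustration of this quantity, the sketch below builds a small data-fitting system $\mathbf{A}\mathbf{x} = \mathbf{b}$ for a line $f(t) = \alpha + \beta t$ and evaluates the residual sum of squares $\boldsymbol{\varepsilon}^{T}\boldsymbol{\varepsilon}$ for a candidate $\mathbf{x}$. The data points and the candidate coefficients are made up for illustration; they are not the values of the example above.

import numpy as np

# Hypothetical data points (t_i, b_i); not the example's actual data.
t = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 1.9, 3.2, 3.9])

# Coefficient matrix for fitting a line f(t) = alpha + beta*t:
# each row is [1, t_i], so A x = b with x = (alpha, beta).
A = np.column_stack([np.ones_like(t), t])

# An arbitrary candidate for the coefficients (alpha, beta).
x = np.array([0.5, 0.9])

# Residual vector eps = A x - b and the sum of squared errors eps^T eps.
eps = A @ x - b
sse = eps @ eps
print(sse)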


General Least Squares Problem

For $\mathbf{A} \in \mathbb{R}^{m\times n}$ and $\mathbf{b} \in \mathbb{R}^{m}$, let $\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}(\mathbf{x}) = \mathbf{A}\mathbf{x} - \mathbf{b}$. The general
least squares problem is to find a vector $\mathbf{x}$ that minimizes the quantity
$$\sum_{i=1}^{m}\varepsilon_i^{2} \;=\; \boldsymbol{\varepsilon}^{T}\boldsymbol{\varepsilon} \;=\; (\mathbf{A}\mathbf{x}-\mathbf{b})^{T}(\mathbf{A}\mathbf{x}-\mathbf{b}).$$
Any vector that provides a minimum value for this expression is called
a least squares solution.

•   The set of all least squares solutions is precisely the set of solutions
    to the system of normal equations $\mathbf{A}^{T}\mathbf{A}\mathbf{x} = \mathbf{A}^{T}\mathbf{b}$.
•   There is a unique least squares solution if and only if $\operatorname{rank}(\mathbf{A}) = n$,
    in which case it is given by $\mathbf{x} = (\mathbf{A}^{T}\mathbf{A})^{-1}\mathbf{A}^{T}\mathbf{b}$.
•   If $\mathbf{A}\mathbf{x} = \mathbf{b}$ is consistent, then the solution set for $\mathbf{A}\mathbf{x} = \mathbf{b}$ is the
    same as the set of least squares solutions.

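The statements in the box above can be checked numerically. The sketch below uses NumPy with a small made-up $\mathbf{A}$ and $\mathbf{b}$ (chosen only for illustration): since $\operatorname{rank}(\mathbf{A}) = n$, the solution of the normal equations $\mathbf{A}^{T}\mathbf{A}\mathbf{x} = \mathbf{A}^{T}\mathbf{b}$ is unique and coincides with the least squares solution returned by np.linalg.lstsq.

import numpy as np

# Made-up overdetermined system A x = b (m = 5 equations, n = 2 unknowns).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0],
              [1.0, 5.0]])
b = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# rank(A) = n = 2, so the least squares solution is unique and equals
# x = (A^T A)^{-1} A^T b, i.e., the solution of the normal equations.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# np.linalg.lstsq minimizes ||A x - b||^2 directly (SVD-based routine)
# and should agree with the normal-equations solution here.
x_lstsq, residuals, rank, sing_vals = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)
print(x_lstsq)
print(np.allclose(x_normal, x_lstsq))   # True

In floating-point work, forming $\mathbf{A}^{T}\mathbf{A}$ explicitly can worsen conditioning, which is why library routines such as lstsq favor orthogonal factorizations; for the theory developed here, however, the normal equations are the central object.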
Proof.$^{32}$  First prove that if $\mathbf{x}$ minimizes $\boldsymbol{\varepsilon}^{T}\boldsymbol{\varepsilon}$, then $\mathbf{x}$ must satisfy the normal
equations. Begin by using $\mathbf{x}^{T}\mathbf{A}^{T}\mathbf{b} = \mathbf{b}^{T}\mathbf{A}\mathbf{x}$ (scalars are symmetric) to write
$$\sum_{i=1}^{m}\varepsilon_i^{2} \;=\; \boldsymbol{\varepsilon}^{T}\boldsymbol{\varepsilon} \;=\; (\mathbf{A}\mathbf{x}-\mathbf{b})^{T}(\mathbf{A}\mathbf{x}-\mathbf{b}) \;=\; \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{A}\mathbf{x} - 2\mathbf{x}^{T}\mathbf{A}^{T}\mathbf{b} + \mathbf{b}^{T}\mathbf{b}. \tag{4.6.2}$$
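For readers who want the algebra behind (4.6.2) spelled out, the expansion is just the distributive law together with the scalar identity quoted above:
$$(\mathbf{A}\mathbf{x}-\mathbf{b})^{T}(\mathbf{A}\mathbf{x}-\mathbf{b})
 = \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{A}\mathbf{x} - \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{b} - \mathbf{b}^{T}\mathbf{A}\mathbf{x} + \mathbf{b}^{T}\mathbf{b}
 = \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{A}\mathbf{x} - 2\mathbf{x}^{T}\mathbf{A}^{T}\mathbf{b} + \mathbf{b}^{T}\mathbf{b},$$
since $\mathbf{b}^{T}\mathbf{A}\mathbf{x} = \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{b}$.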
                                    To determine vectors x that minimize the expression (4.6.2), we will again use
                                    minimization techniques from calculus and differentiate the function
$$f(x_1, x_2, \ldots, x_n) = \mathbf{x}^{T}\mathbf{A}^{T}\mathbf{A}\mathbf{x} - 2\mathbf{x}^{T}\mathbf{A}^{T}\mathbf{b} + \mathbf{b}^{T}\mathbf{b} \tag{4.6.3}$$
with respect to each $x_i$. Differentiating matrix functions is similar to differ-
entiating scalar functions (see Exercise 3.5.9) in the sense that if $\mathbf{U} = [u_{ij}]$,
then
$$\frac{\partial \mathbf{U}}{\partial x} = \left[\frac{\partial u_{ij}}{\partial x}\right], \qquad
  \frac{\partial [\mathbf{U}+\mathbf{V}]}{\partial x} = \frac{\partial \mathbf{U}}{\partial x} + \frac{\partial \mathbf{V}}{\partial x}, \qquad \text{and} \qquad
  \frac{\partial [\mathbf{U}\mathbf{V}]}{\partial x} = \frac{\partial \mathbf{U}}{\partial x}\mathbf{V} + \mathbf{U}\frac{\partial \mathbf{V}}{\partial x}.$$
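These differentiation rules are easy to sanity-check numerically. The sketch below, using arbitrarily chosen matrix-valued functions (not anything from the text), compares the product rule $\partial[\mathbf{U}\mathbf{V}]/\partial x = (\partial\mathbf{U}/\partial x)\mathbf{V} + \mathbf{U}(\partial\mathbf{V}/\partial x)$ against a central finite-difference approximation.

import numpy as np

# Arbitrary matrix-valued functions of a scalar x, chosen only for the check.
def U(x):
    return np.array([[x**2, np.sin(x)],
                     [1.0,  x       ]])

def dU(x):  # entrywise derivative of U
    return np.array([[2*x, np.cos(x)],
                     [0.0, 1.0      ]])

def V(x):
    return np.array([[np.exp(x), 2.0],
                     [x**3,      1.0]])

def dV(x):  # entrywise derivative of V
    return np.array([[np.exp(x), 0.0],
                     [3*x**2,    0.0]])

x0, h = 0.7, 1e-6

# Product rule: d[UV]/dx = (dU/dx) V + U (dV/dx).
exact = dU(x0) @ V(x0) + U(x0) @ dV(x0)

# Central finite-difference approximation of d[UV]/dx.
numeric = (U(x0 + h) @ V(x0 + h) - U(x0 - h) @ V(x0 - h)) / (2 * h)

print(np.max(np.abs(exact - numeric)))   # round-off level, roughly 1e-9 or smaller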
$^{32}$ A more modern development not relying on calculus is given in §5.13 on p. 437, but the more
traditional approach is given here because it's worthwhile to view least squares from both
perspectives.