Page 231 - Matrix Analysis & Applied Linear Algebra
226 Chapter 4 Vector Spaces
passing through the data points, let ε = Ax − b, and compute the sum of the
squares of the errors to be
$$\sum_{i=1}^{m} \varepsilon_i^2 = \varepsilon^T \varepsilon = (Ax - b)^T (Ax - b).$$
General Least Squares Problem
For A ∈ ℝ^{m×n} and b ∈ ℝ^m, let ε = ε(x) = Ax − b. The general
least squares problem is to find a vector x that minimizes the quantity
$$\sum_{i=1}^{m} \varepsilon_i^2 = \varepsilon^T \varepsilon = (Ax - b)^T (Ax - b).$$
Any vector that provides a minimum value for this expression is called
a least squares solution.
• The set of all least squares solutions is precisely the set of solutions
to the system of normal equations A^T A x = A^T b.
• There is a unique least squares solution if and only if rank(A) = n,
in which case it is given by x = (A^T A)^{−1} A^T b.
• If Ax = b is consistent, then the solution set for Ax = b is the
same as the set of least squares solutions.
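The statements in the box can be checked numerically. The sketch below (using NumPy; the small matrix A and vector b are made-up data, not from the text) solves the normal equations A^T A x = A^T b for an overdetermined system with rank(A) = n and confirms the result against NumPy's own least squares routine:

```python
import numpy as np

# Overdetermined system: 3 equations, 2 unknowns (no exact solution exists).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# rank(A) = n = 2, so the least squares solution is unique:
# solve the normal equations  A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

# Cross-check against NumPy's built-in least squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x, x_lstsq)
```

Solving via `np.linalg.solve` on A^T A is the direct translation of the boxed formula; in practice `lstsq` (which uses an orthogonal factorization) is preferred when A is ill conditioned, since forming A^T A squares the condition number.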
Proof.^{32} First prove that if x minimizes ε^T ε, then x must satisfy the normal
equations. Begin by using x^T A^T b = b^T A x (scalars are symmetric) to write
$$\sum_{i=1}^{m} \varepsilon_i^2 = \varepsilon^T \varepsilon = (Ax - b)^T (Ax - b) = x^T A^T A x - 2 x^T A^T b + b^T b. \tag{4.6.2}$$
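The algebraic expansion in (4.6.2) is easy to verify numerically. A quick NumPy sketch (with randomly generated A, b, and x, purely for illustration) compares ε^T ε against the expanded quadratic form:

```python
import numpy as np

# Random data just to test the identity; any A, b, x would do.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

eps = A @ x - b
lhs = eps @ eps                                    # ε^T ε
rhs = x @ A.T @ A @ x - 2 * x @ A.T @ b + b @ b    # expansion (4.6.2)
assert np.isclose(lhs, rhs)
```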
To determine vectors x that minimize the expression (4.6.2), we will again use
minimization techniques from calculus and differentiate the function
$$f(x_1, x_2, \ldots, x_n) = x^T A^T A x - 2 x^T A^T b + b^T b \tag{4.6.3}$$
with respect to each x_i. Differentiating matrix functions is similar to
differentiating scalar functions (see Exercise 3.5.9) in the sense that if
U = [u_{ij}], then
$$\frac{\partial U}{\partial x} = \left[\frac{\partial u_{ij}}{\partial x}\right], \qquad \frac{\partial [U + V]}{\partial x} = \frac{\partial U}{\partial x} + \frac{\partial V}{\partial x}, \qquad \text{and} \qquad \frac{\partial [UV]}{\partial x} = \frac{\partial U}{\partial x}\, V + U\, \frac{\partial V}{\partial x}.$$
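The product rule for matrix functions can be sanity-checked with a finite-difference approximation. In the sketch below, U(x) and V(x) are hypothetical matrix-valued functions of a scalar x (chosen for illustration only), with their entrywise derivatives written out by hand:

```python
import numpy as np

# Hypothetical matrix-valued functions of a scalar x.
def U(x):  return np.array([[x,        x**2],
                            [1.0, np.sin(x)]])
def V(x):  return np.array([[np.cos(x), 1.0],
                            [x**3,      2.0]])

# Their entrywise derivatives, per the rule dU/dx = [du_ij/dx].
def dU(x): return np.array([[1.0,        2*x],
                            [0.0, np.cos(x)]])
def dV(x): return np.array([[-np.sin(x), 0.0],
                            [3*x**2,     0.0]])

x, h = 0.7, 1e-6
# Central-difference approximation of d[UV]/dx ...
numeric = (U(x + h) @ V(x + h) - U(x - h) @ V(x - h)) / (2 * h)
# ... versus the product rule  d[UV]/dx = (dU/dx) V + U (dV/dx).
analytic = dU(x) @ V(x) + U(x) @ dV(x)
assert np.allclose(numeric, analytic, atol=1e-6)
```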
^{32} A more modern development not relying on calculus is given in §5.13 on p. 437, but the more
traditional approach is given here because it’s worthwhile to view least squares from both
perspectives.