Page 231 - Matrix Analysis & Applied Linear Algebra
226 Chapter 4 Vector Spaces
passing through the data points, let ε = Ax − b, and compute the sum of the
squares of the errors to be
$$\sum_{i=1}^{m} \varepsilon_i^2 = \varepsilon^T \varepsilon = (Ax - b)^T (Ax - b).$$
General Least Squares Problem
For A ∈ ℝ^{m×n} and b ∈ ℝ^m, let ε = ε(x) = Ax − b. The general
least squares problem is to find a vector x that minimizes the quantity
$$\sum_{i=1}^{m} \varepsilon_i^2 = \varepsilon^T \varepsilon = (Ax - b)^T (Ax - b).$$
Any vector that provides a minimum value for this expression is called
a least squares solution.
• The set of all least squares solutions is precisely the set of solutions
to the system of normal equations A^T A x = A^T b.
• There is a unique least squares solution if and only if rank(A) = n,
in which case it is given by x = (A^T A)^{−1} A^T b.
• If Ax = b is consistent, then the solution set for Ax = b is the
same as the set of least squares solutions.
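The statements in the box can be checked numerically. The sketch below (using NumPy; the small matrix A and vector b are made-up data, not from the text) solves the normal equations A^T A x = A^T b for an overdetermined system with rank(A) = n and confirms the result against NumPy's own least squares routine:

```python
import numpy as np

# Overdetermined system: 3 equations, 2 unknowns (no exact solution exists).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# rank(A) = n = 2, so the least squares solution is unique:
# solve the normal equations  A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

# Cross-check against NumPy's built-in least squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x, x_lstsq)
```

Solving via `np.linalg.solve` on A^T A is the direct translation of the boxed formula; in practice `lstsq` (which uses an orthogonal factorization) is preferred when A is ill conditioned, since forming A^T A squares the condition number.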
Proof.^{32} First prove that if x minimizes ε^T ε, then x must satisfy the normal
equations. Begin by using x^T A^T b = b^T A x (scalars are symmetric) to write
$$\sum_{i=1}^{m} \varepsilon_i^2 = \varepsilon^T \varepsilon = (Ax - b)^T (Ax - b) = x^T A^T A x - 2 x^T A^T b + b^T b. \tag{4.6.2}$$
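The algebraic expansion in (4.6.2) is easy to verify numerically. A quick NumPy sketch (with randomly generated A, b, and x, purely for illustration) compares ε^T ε against the expanded quadratic form:

```python
import numpy as np

# Random data just to test the identity; any A, b, x would do.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

eps = A @ x - b
lhs = eps @ eps                                    # ε^T ε
rhs = x @ A.T @ A @ x - 2 * x @ A.T @ b + b @ b    # expansion (4.6.2)
assert np.isclose(lhs, rhs)
```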
To determine vectors x that minimize the expression (4.6.2), we will again use
minimization techniques from calculus and differentiate the function
$$f(x_1, x_2, \ldots, x_n) = x^T A^T A x - 2 x^T A^T b + b^T b \tag{4.6.3}$$
with respect to each x_i. Differentiating matrix functions is similar to
differentiating scalar functions (see Exercise 3.5.9) in the sense that if
U = [u_{ij}], then
$$\frac{\partial U}{\partial x} = \left[\frac{\partial u_{ij}}{\partial x}\right], \qquad \frac{\partial [U + V]}{\partial x} = \frac{\partial U}{\partial x} + \frac{\partial V}{\partial x}, \qquad \text{and} \qquad \frac{\partial [UV]}{\partial x} = \frac{\partial U}{\partial x}\, V + U\, \frac{\partial V}{\partial x}.$$
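The product rule for matrix functions can be sanity-checked with a finite-difference approximation. In the sketch below, U(x) and V(x) are hypothetical matrix-valued functions of a scalar x (chosen for illustration only), with their entrywise derivatives written out by hand:

```python
import numpy as np

# Hypothetical matrix-valued functions of a scalar x.
def U(x):  return np.array([[x,        x**2],
                            [1.0, np.sin(x)]])
def V(x):  return np.array([[np.cos(x), 1.0],
                            [x**3,      2.0]])

# Their entrywise derivatives, per the rule dU/dx = [du_ij/dx].
def dU(x): return np.array([[1.0,        2*x],
                            [0.0, np.cos(x)]])
def dV(x): return np.array([[-np.sin(x), 0.0],
                            [3*x**2,     0.0]])

x, h = 0.7, 1e-6
# Central-difference approximation of d[UV]/dx ...
numeric = (U(x + h) @ V(x + h) - U(x - h) @ V(x - h)) / (2 * h)
# ... versus the product rule  d[UV]/dx = (dU/dx) V + U (dV/dx).
analytic = dU(x) @ V(x) + U(x) @ dV(x)
assert np.allclose(numeric, analytic, atol=1e-6)
```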
^{32} A more modern development not relying on calculus is given in §5.13 on p. 437, but the more
traditional approach is given here because it’s worthwhile to view least squares from both
perspectives.