Page 159 - Numerical methods for chemical engineering
P. 159
Singular value decomposition (SVD) 145
z is any particular solution satisfying Az = b. To obtain a particular solution from the SVD,
we note that if A were nonsingular, the unique solution would be
−1
T −1
T
x = A b = (W V ) b = V T(−1) −1 W −1 b = V −1 W b (3.250)
where
−1 = diag σ −1 ,...,σ −1 ,σ −1 ,...,σ −1 (3.251)
1 r r+1 N
Written in terms of the left and right singular vectors, the solution is
N
[ j] −1 [ j]
x = w · b σ v (3.252)
j
j=1
T
If there exist zero singular values σ j = 0, −1 and A −1 = V −1 W do not exist, and this
formula diverges due to division by zero. We can, however, define the pseudo (generalized)
−1
inverse of A, in which we replace the infinite values of σ j with zero:
−1
˜ −1
˜ −1
A ˜ −1 = V W T = diag 0,..., 0,σ r+1 ,...,σ N −1 (3.253)
[ j]
For, b ∈ A , b must lie in span{w |σ j > 0}, and as all w [j] are orthonormal, w [j] · b =
0 for all σ j = 0. Therefore, by (3.252), we still have Az = b when b ∈ A ,
N
˜ −1
v
z = w [ j] · b σ −1 [ j] = A b (3.254)
j
j=1
σ j >0
The general solution, for b ∈ A , is then
N N
[ j]
[ j]
−1 [ j]
x = w · b σ j v + c j v c j ∈ (3.255)
j=1 j=1
σ j >0 σ j =0
Least-squares approximate solutions
You may ask, what happens if we apply the pseudo-inverse to a vector b that is not in
the range of A? Such is often the case for an overdetermined system with more equations
N M>N ˜ −1
than unknowns, Ax = b, x ∈ , b ∈ . Then, z = A b is not a solution, Az = b.
However, it is the “closest thing to a solution,” as it minimizes the residual norm,
N
|Az − b|≤|Ay − b| ∀y ∈ (3.256)
˜ −1
2
As this also means minimizing |Az − b| = (Az − b) · (Az − b), z = A b is said to be
˜ −1
the least-squares approximate solution. The SVD solution, z = A b, exists for systems
Ax = b with both square and nonsquare A matrices.
This property makes SVD very useful in statistics. Let us fit a linear model
(3.257)
y = β 1 x 1 + β 2 x 2 + ··· + β N x N
[k]
[k]
[k]
to a data set of M measurements of y, y , for known {x 1 ,..., x N }. From the data, let
M
us form the M × N design matrix X, the response vector y ∈ , and the vector of unknown