9.5. The Method of the Conjugate Gradient
One says that the vectors $p_j$ are pairwise conjugate. Of course, conjugation means $A$-orthogonality. This explains the name of the method.
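To make the notion concrete, here is a minimal numerical sketch of my own (the matrix $A$, the vector $r_0$, and the function name are invented for the illustration): it builds the Krylov vectors $r_0, Ar_0, \dots$ and $A$-orthogonalizes them by Gram–Schmidt in the inner product $\langle u, v\rangle_A = \langle Au, v\rangle$, then checks that the resulting directions are pairwise conjugate, i.e. $\langle Ap_i, p_j\rangle = 0$ for $i \ne j$. It is only an illustration of conjugacy, not the conjugate gradient algorithm itself.

```python
import numpy as np

# Invented example data: a small SPD matrix A and an initial residual r0.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)          # symmetric positive definite
r0 = rng.standard_normal(5)

def conjugate_directions(A, r0, k):
    """A-orthogonalize the Krylov vectors r0, A r0, ..., A^{k-1} r0
    (modified Gram-Schmidt in the inner product <u, v>_A = <A u, v>)."""
    directions = []
    v = r0.copy()
    for _ in range(k):
        p = v.copy()
        for q in directions:
            # remove the A-component of p along each previous direction q
            p -= (q @ (A @ p)) / (q @ (A @ q)) * q
        directions.append(p)
        v = A @ v                     # next Krylov vector
    return directions

ps = conjugate_directions(A, r0, 4)
# Pairwise conjugacy: <A p_i, p_j> = 0 for i != j (up to round-off).
gram = np.array([[pi @ (A @ pj) for pj in ps] for pi in ps])
print(np.max(np.abs(gram - np.diag(np.diag(gram)))))   # ~ 1e-12
```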
The quadratic function $J$, strictly convex, reaches its infimum on the affine subspace $x_0 + H_k$ at a unique vector, which we denote by $x_k$. This notation makes sense for $k = 0$. If $x = y + \gamma p_k \in x_0 + H_{k+1}$ with $y \in x_0 + H_k$, then
\[
\begin{aligned}
J(x) &= J(\bar x) + \tfrac{1}{2}E(x - \bar x) \\
     &= J(\bar x) + \tfrac{1}{2}E(y - \bar x) + \tfrac{1}{2}\gamma^2 E(p_k) + \gamma\,\langle Ap_k,\, y - \bar x\rangle \\
     &= J(y) + \tfrac{1}{2}\gamma^2 E(p_k) - \gamma\,\langle Ap_k,\, e_0\rangle,
\end{aligned}
\]
since $\langle Ap_k,\, y - x_0\rangle = 0$. Hence, minimizing $J$ over $x_0 + H_{k+1}$ amounts to minimizing $J$ over $x_0 + H_k$, together with minimizing $\gamma \mapsto \tfrac{1}{2}\gamma^2 E(p_k) - \gamma\langle p_k, r_0\rangle$ over $\mathbb{R}$. We therefore have
\[
x_{k+1} - x_k \in \mathbb{R}\,p_k. \tag{9.5}
\]
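For the record (a one-line computation of mine, not written out in the text), the scalar quadratic above is minimized, for $k < l$ so that $E(p_k) > 0$, at
\[
\gamma_k = \frac{\langle p_k, r_0\rangle}{E(p_k)},
\qquad\text{so that}\qquad
x_{k+1} = x_k + \gamma_k\, p_k ,
\]
which identifies the scalar factor implicit in (9.5).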
By definition of $l$ there exists a nonzero polynomial $P$ of degree $l$ such that $P(A)r_0 = 0$, that is, $AP(A)e_0 = 0$. Since $A$ is invertible, $P(A)e_0 = 0$. Let us assume that $P(0)$ vanishes. Then $P(X) = XQ(X)$ with $\deg Q = l-1$. Therefore, $Q(A)r_0 = 0$: the map $S \mapsto S(A)r_0$ is not one-to-one over the polynomials of degree less than or equal to $l-1$. Hence $\dim H_l < l$, a contradiction. Hence $P(0) \ne 0$, and we may assume that $P(0) = 1$. Then $P(X) = 1 - XR(X)$, where $\deg R = l-1$. Thus $e_0 = R(A)r_0 \in H_l$ or, equivalently, $\bar x \in x_0 + H_l$. Conversely, if $k \le l$ and $\bar x \in x_0 + H_k$, then $e_0 \in H_k$; that is, $e_0 = Q(A)r_0$, where $\deg Q \le k-1$. Then $Q_1(A)e_0 = 0$, because $Q_1(X) = 1 - XQ(X)$. Therefore, $Q_1(A)r_0 = 0$, with $Q_1(0) \ne 0$ and $\deg Q_1 \le k$. Hence $k \ge l$ (no nonzero polynomial of degree less than $l$ annihilates $r_0$, since $\dim H_l = l$); that is, $k = l$. Summing up, we have $\bar x \in x_0 + H_l$ but $\bar x \notin x_0 + H_{l-1}$. Therefore, $x_l = \bar x$ and $x_k \ne \bar x$ if $k < l$.
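A small numerical sketch of my own (all data invented) of this finite-termination property: since $x_k$ is by definition the minimizer of the strictly convex quadratic $J$ over $x_0 + H_k$, it can be computed by solving the Galerkin system $(V_k^{\mathsf T} A V_k)\,c = V_k^{\mathsf T} r_0$ for a basis $V_k$ of $H_k$ and setting $x_k = x_0 + V_k c$. With a matrix having only three distinct eigenvalues, one observes $x_k = \bar x$ as soon as $k$ reaches $l$.

```python
import numpy as np

# Invented example: A is SPD with only 3 distinct eigenvalues, so the minimal
# annihilating polynomial of a generic r0 has degree l = 3, and x_l = xbar.
rng = np.random.default_rng(1)
Qmat, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Qmat @ np.diag([1.0, 1.0, 2.0, 2.0, 5.0, 5.0]) @ Qmat.T
b = rng.standard_normal(6)
xbar = np.linalg.solve(A, b)          # reference solution, for checking only

x0 = np.zeros(6)
r0 = b - A @ x0

def minimize_over_krylov(A, x0, r0, k):
    """Minimizer x_k of J over x_0 + H_k, with H_k = span{r0, A r0, ..., A^{k-1} r0}."""
    V = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(k)])
    # Galerkin system (V^T A V) c = V^T r0; lstsq tolerates a rank-deficient V.
    c = np.linalg.lstsq(V.T @ A @ V, V.T @ r0, rcond=None)[0]
    return x0 + V @ c

for k in range(1, 5):
    xk = minimize_over_krylov(A, x0, r0, k)
    print(k, np.linalg.norm(xk - xbar))   # drops to round-off once k reaches l (= 3 here)
```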
Lemma 9.5.1 Let us denote by $\lambda_n \ge \cdots \ge \lambda_1\ (>0)$ the eigenvalues of $A$. If $k \le l$, then
\[
E(x_k - \bar x) \le E(e_0)\cdot \min_{\deg Q \le k-1}\ \max_j\ \bigl|1 + \lambda_j Q(\lambda_j)\bigr|^2 .
\]
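A remark of mine, not in the text: as $Q$ ranges over the polynomials of degree at most $k-1$, the polynomial $R(X) := 1 + XQ(X)$ ranges over all polynomials of degree at most $k$ with $R(0) = 1$, so the bound may be rewritten as
\[
E(x_k - \bar x) \le E(e_0)\cdot \min_{\substack{\deg R \le k \\ R(0) = 1}}\ \max_j\ |R(\lambda_j)|^2 ,
\]
the form in which Chebyshev polynomials are usually brought in.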
Proof
Let us compute
\[
\begin{aligned}
E(x_k - \bar x) &= \min\{E(x - \bar x) \mid x \in x_0 + H_k\} \\
                &= \min\{E(e_0 + y) \mid y \in H_k\} \\
                &= \min\{E((I_n + AQ(A))e_0) \mid \deg Q \le k-1\} \\
                &= \min\{\|(I_n + AQ(A))A^{1/2}e_0\|_2^2 \mid \deg Q \le k-1\},
\end{aligned}
\]