Page 278 - A First Course In Stochastic Models

P. 278

CONVERGENCE PROOFS 271

The remainder of the proof uses the same ideas as in the proof of Theorem 3.5.12.
Fix n > N and choose k = N in (6.6.8), where N is the number of states. This
yields

M n − m n ≤ d j {V n−N (j) − V n−N−1 (j)},
j∈I

where d j is a shorthand notation for

d j = [P n−1 · · · P n−N ] i 1 j − [P n · · · P n−N+1 ] i 2 j .
+
−
+
+
Write d = max(d, 0) and d = − min(d, 0). Then d = d −d and d , d ≥ 0.
−
−
Thus

+ −
M n − m n ≤ d {V n−N (j) − V n−N−1 (j)} − d {V n−N (j) − V n−N−1 (j)}
j j
j∈I j∈I

+ − +
≤ M n−N d − m n−N d = (M n−N − m n−N ) d ,
j j j
j∈I j∈I j∈I
by j j + = j j − j j = 0. Next use the
d . This identity is a consequence of
d
d
relation (p − q) = p − min(p, q) to conclude that
+
M n − m n ≤ (M n−N − m n−N )
 

× 1 − min([P n−1 · · · P n−N ] i 1 j , [P n · · · P n−N+1 ] i 2 j  .

j∈I
Now we invoke Lemma 6.7.1. Since the number of states and the number of sta-
tionary policies are both ﬁnite, there is a positive number ρ such that

min{[P (f N ) · · · P (f 1 )] rj , [P (g N ) · · · P (g 1 )] sj } ≥ ρ
j∈I
for any two N-tuples (f N , . . . , f 1 ) and (g N , . . . , g 1 ) of stationary policies and for
any two states r and s. Thus

M n − m n ≤ (1 − ρ)(M n−N − m n−N ).

In Theorem 6.6.3 it was shown that {M n − m n , n ≥ 1} is non-increasing. Thus we
ﬁnd that

M n − m n ≤ (1 − ρ) [n/N] (M 0 − m 0 ), n ≥ 1,

implying the desired result.

273 274 275 276 277 278 279 280 281 282 283