Page 278 - A First Course In Stochastic Models
P. 278

CONVERGENCE PROOFS                       271

                  The remainder of the proof uses the same ideas as in the proof of Theorem 3.5.12.
                Fix n > N and choose k = N in (6.6.8), where N is the number of states. This
                yields

                               M n − m n ≤   d j {V n−N (j) − V n−N−1 (j)},
                                          j∈I

                where d j is a shorthand notation for

                              d j = [P n−1 · · · P n−N ] i 1 j − [P n · · · P n−N+1 ] i 2 j .
                                                               +
                                       −
                      +
                                                                          +
                Write d = max(d, 0) and d = − min(d, 0). Then d = d −d and d , d ≥ 0.
                                                                             −
                                                                    −
                Thus

                                +                           −
                 M n − m n ≤   d {V n−N (j) − V n−N−1 (j)} −  d {V n−N (j) − V n−N−1 (j)}
                               j                            j
                           j∈I                          j∈I

                                     +            −                      +
                         ≤ M n−N    d − m n−N    d = (M n−N − m n−N )   d ,
                                     j            j                      j
                                 j∈I          j∈I                    j∈I
                by    j j +  =    j j −                         j j = 0. Next use the
                              d . This identity is a consequence of
                                                                 d
                      d
                relation (p − q) = p − min(p, q) to conclude that
                             +
                    M n − m n ≤ (M n−N − m n−N )
                                                                           

                               × 1 −     min([P n−1 · · · P n−N ] i 1 j , [P n · · · P n−N+1 ] i 2 j    .
                                 
                                     j∈I
                Now we invoke Lemma 6.7.1. Since the number of states and the number of sta-
                tionary policies are both finite, there is a positive number ρ such that

                              min{[P (f N ) · · · P (f 1 )] rj , [P (g N ) · · · P (g 1 )] sj } ≥ ρ
                           j∈I
                for any two N-tuples (f N , . . . , f 1 ) and (g N , . . . , g 1 ) of stationary policies and for
                any two states r and s. Thus

                                  M n − m n ≤ (1 − ρ)(M n−N − m n−N ).

                In Theorem 6.6.3 it was shown that {M n − m n , n ≥ 1} is non-increasing. Thus we
                find that

                              M n − m n ≤ (1 − ρ) [n/N] (M 0 − m 0 ),  n ≥ 1,

                implying the desired result.
   273   274   275   276   277   278   279   280   281   282   283