
By the strong law of large numbers for a sequence of independent and identically
distributed random variables, we have

    \lim_{n \to \infty} \frac{C_1 + \cdots + C_n}{n} = E(C_1) \quad \text{with probability 1}.

Hence, by letting t → ∞ in the above inequality, the desired result follows.
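
As a quick numerical illustration of this strong law, the following sketch assumes NumPy and, purely for illustration, exponentially distributed cycle lengths with mean 2; it checks that the running averages (C_1 + · · · + C_n)/n settle near E(C_1).

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustration only: cycle lengths C_1, C_2, ... taken to be i.i.d.
# Exponential with mean 2.0; any positive i.i.d. distribution would do.
mean_cycle = 2.0
C = rng.exponential(mean_cycle, size=100_000)

# Running averages (C_1 + ... + C_n) / n should approach E(C_1) = 2.0.
running_avg = np.cumsum(C) / np.arange(1, C.size + 1)
for n in (100, 10_000, 100_000):
    print(f"n = {n:>7}: average cycle length = {running_avg[n - 1]:.4f}")
```
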
Lemma 2.2.2 is also valid when E(C_1) = ∞ provided that P{C_1 < ∞} = 1. The
reason is that the strong law of large numbers for a sequence {C_n} of non-negative
random variables does not require that E(C_1) < ∞. Next we prove Theorem 2.2.1.
Proof of Theorem 2.2.1  For ease, let us first assume that the rewards are
non-negative. Then, for any t > 0,

    \sum_{i=1}^{N(t)} R_i \le R(t) \le \sum_{i=1}^{N(t)+1} R_i.
This gives

    \frac{\sum_{i=1}^{N(t)} R_i}{N(t)} \times \frac{N(t)}{t} \le \frac{R(t)}{t} \le \frac{\sum_{i=1}^{N(t)+1} R_i}{N(t)+1} \times \frac{N(t)+1}{t}.
By the strong law of large numbers for the sequence {R_n}, we have

    \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} R_i = E(R_1) \quad \text{with probability 1}.
As pointed out in the proof of Lemma 2.2.2, N(t) → ∞ with probability 1 as
t → ∞. Letting t → ∞ in the above inequality and using Lemma 2.2.2, the desired
result follows for the case of non-negative rewards. If the rewards can assume
both positive and negative values, the theorem is proved by treating the positive
and negative parts of the rewards separately. We omit the details.
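
The sandwich argument can also be checked by simulation. The sketch below assumes NumPy and an illustrative model in which the cycle lengths C_i are Uniform(1, 3) (so E(C_1) = 2) and the reward earned in cycle i is R_i = 5 C_i plus noise (so E(R_1) = 10); along one sample path, R(t)/t evaluated at the renewal epochs should settle near E(R_1)/E(C_1) = 5.

```python
import numpy as np

rng = np.random.default_rng(7)
n_cycles = 100_000

# Illustration only: cycle lengths C_i ~ Uniform(1, 3), so E(C_1) = 2,
# and rewards R_i = 5 * C_i + noise, so E(R_1) = 10.
C = rng.uniform(1.0, 3.0, size=n_cycles)
R = 5.0 * C + rng.normal(0.0, 1.0, size=n_cycles)

t = np.cumsum(C)           # renewal epochs t_n = C_1 + ... + C_n
cum_reward = np.cumsum(R)  # R(t_n) = R_1 + ... + R_n (rewards counted at cycle ends)

# Along one sample path, R(t)/t should approach E(R_1)/E(C_1) = 10 / 2 = 5.
for n in (100, 10_000, 100_000):
    print(f"after {n:>7} cycles: R(t)/t = {cum_reward[n - 1] / t[n - 1]:.4f}")
print("theoretical limit E(R_1)/E(C_1) =", 10.0 / 2.0)
```
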

In a natural way Theorem 2.2.1 relates the behaviour of the renewal-reward
process over time to the behaviour of the process over a single renewal cycle. It is
noteworthy that the long-run average actual reward per time unit can be predicted
with probability 1. If we are going to run the process over an infinitely long period
of time, then we can say beforehand that in the long run the average actual reward
per time unit will equal the constant E(R_1)/E(C_1) with probability 1. This is a
much stronger and more useful statement than the statement that the long-run
expected average reward per time unit equals E(R_1)/E(C_1). It indeed holds that
lim_{t→∞} E[R(t)]/t = E(R_1)/E(C_1); this expected-value version of the
renewal-reward theorem is a direct consequence of Theorem 2.2.1 when R(t)/t is
bounded in t, but otherwise requires a hard proof. Note also that for non-negative
rewards R_n the renewal-reward theorem remains valid when E(R_1) = ∞; the
assumption E(C_1) < ∞, however, cannot be dropped in Theorem 2.2.1.
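
The expected-value version can likewise be illustrated numerically: averaging R(t)/t over many independent replications at a fixed large t estimates E[R(t)]/t. The sketch below reuses the illustrative Uniform(1, 3) cycle-length model from above, counting only the rewards of cycles completed by time t.

```python
import numpy as np

rng = np.random.default_rng(11)
t_horizon = 500.0
n_runs = 2_000

# Illustration only: same cycle/reward model as above
# (C_i ~ Uniform(1, 3), R_i = 5 * C_i + noise), so E(R_1)/E(C_1) = 5.
ratios = []
for _ in range(n_runs):
    C = rng.uniform(1.0, 3.0, size=600)   # enough cycles to cover [0, t_horizon]
    R = 5.0 * C + rng.normal(0.0, 1.0, size=600)
    epochs = np.cumsum(C)
    completed = epochs <= t_horizon        # cycles completed by time t
    ratios.append(R[completed].sum() / t_horizon)

# Averaging R(t)/t over independent runs estimates E[R(t)]/t for this fixed t.
print(f"estimate of E[R(t)]/t at t = {t_horizon}: {np.mean(ratios):.4f}")
print("theoretical limit E(R_1)/E(C_1) = 5.0")
```
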