By the strong law of large numbers for a sequence of independent and identically
distributed random variables, we have
$$
\lim_{n\to\infty} \frac{C_1 + \cdots + C_n}{n} = E(C_1) \quad \text{with probability 1}.
$$
Hence, by letting t → ∞ in the above inequality, the desired result follows.
Lemma 2.2.2 is also valid when E(C_1) = ∞ provided that P{C_1 < ∞} = 1. The
reason is that the strong law of large numbers for a sequence {C_n} of non-negative
random variables does not require that E(C_1) < ∞. Next we prove Theorem 2.2.1.
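As a quick numerical illustration of Lemma 2.2.2, a minimal simulation sketch along the following lines counts the number of renewals N(t) over a long horizon and compares N(t)/t with 1/E(C_1). The gamma cycle-length distribution and the horizon are purely illustrative assumptions, not part of the lemma.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Illustrative assumption: i.i.d. cycle lengths C_1, C_2, ... ~ gamma(shape=2, scale=1.5),
# so E(C_1) = 3.0.  Any non-negative cycle-length distribution with finite mean would do.
mean_cycle = 2.0 * 1.5
t_horizon = 100_000.0

# Count the renewals N(t) occurring in (0, t].
elapsed, n_renewals = 0.0, 0
while True:
    elapsed += rng.gamma(shape=2.0, scale=1.5)
    if elapsed > t_horizon:
        break
    n_renewals += 1

print("N(t)/t   :", n_renewals / t_horizon)   # close to 1/3 for large t
print("1/E(C_1) :", 1.0 / mean_cycle)
```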
Proof of Theorem 2.2.1  For ease, let us first assume that the rewards are non-
negative. Since the first N(t) cycles are completed at time t while the (N(t) + 1)th
cycle is still in progress, the non-negativity of the rewards gives, for any t > 0,
$$
\sum_{i=1}^{N(t)} R_i \;\le\; R(t) \;\le\; \sum_{i=1}^{N(t)+1} R_i.
$$
This gives
$$
\frac{\sum_{i=1}^{N(t)} R_i}{N(t)} \times \frac{N(t)}{t}
\;\le\; \frac{R(t)}{t} \;\le\;
\frac{\sum_{i=1}^{N(t)+1} R_i}{N(t)+1} \times \frac{N(t)+1}{t}.
$$
By the strong law of large numbers for the sequence {R_n}, we have
$$
\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R_i = E(R_1) \quad \text{with probability 1}.
$$
As pointed out in the proof of Lemma 2.2.2, N(t) → ∞ with probability 1 as
t → ∞. Letting t → ∞ in the above inequality and using Lemma 2.2.2, the desired
result follows for the case of non-negative rewards. If the rewards can assume
both positive and negative values, then the theorem is proved by treating
the positive and negative parts of the rewards separately. We omit the details.
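To see Theorem 2.2.1 at work numerically, a minimal sketch along the following lines simulates one long sample path of a renewal-reward process and compares the observed R(t)/t with E(R_1)/E(C_1). The specific model (gamma cycle lengths, rewards depending on the cycle length) is a hypothetical choice made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical renewal-reward model: the pairs (C_n, R_n) are i.i.d., with
# C_n ~ gamma(shape=2, scale=1.5) (so E(C_1) = 3.0) and a reward depending on
# the cycle length, R_n = 2*C_n + gamma(1, 1) noise (so E(R_1) = 7.0).
t_horizon = 100_000.0
elapsed, reward = 0.0, 0.0

while True:
    c = rng.gamma(shape=2.0, scale=1.5)
    r = 2.0 * c + rng.gamma(shape=1.0, scale=1.0)
    if elapsed + c > t_horizon:   # the (N(t)+1)th cycle is still in progress at time t
        break
    elapsed += c
    reward += r                   # reward earned over completed cycles only

# By the sandwich argument in the proof, the contribution of the cycle in
# progress to R(t)/t is negligible for large t.
print("R(t)/t        :", reward / t_horizon)
print("E(R_1)/E(C_1) :", 7.0 / 3.0)
```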
In a natural way Theorem 2.2.1 relates the behaviour of the renewal-reward
process over time to its behaviour over a single renewal cycle. It is
noteworthy that the outcome of the long-run average actual reward per time unit
can be predicted with probability 1. If we are going to run the process over an
infinitely long period of time, then we can say beforehand that in the long run the
average actual reward per time unit will be equal to the constant E(R_1)/E(C_1) with
probability 1. This is a much stronger and more useful statement than the statement
that the long-run expected average reward per time unit equals E(R_1)/E(C_1). It
does indeed hold that lim_{t→∞} E[R(t)]/t = E(R_1)/E(C_1); this expected-value version
of the renewal-reward theorem is a direct consequence of Theorem 2.2.1 when
R(t)/t is bounded in t, but otherwise requires a hard proof. Note also that for
non-negative rewards R_n the renewal-reward theorem remains valid when
E(R_1) = ∞, whereas the assumption E(C_1) < ∞ cannot be dropped in Theorem 2.2.1.
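To contrast the with-probability-1 statement with the expected-value version, E[R(t)]/t can be estimated by averaging R(t)/t over many independent replications at a fixed t, rather than by following a single long sample path. A sketch under the same hypothetical model as above:

```python
import numpy as np

def reward_rate_up_to(t, rng):
    """One replication: return R(t)/t for the hypothetical model above,
    counting only the rewards of cycles completed by time t."""
    elapsed, reward = 0.0, 0.0
    while True:
        c = rng.gamma(shape=2.0, scale=1.5)              # E(C_1) = 3.0
        r = 2.0 * c + rng.gamma(shape=1.0, scale=1.0)    # E(R_1) = 7.0
        if elapsed + c > t:
            return reward / t
        elapsed += c
        reward += r

rng = np.random.default_rng(seed=3)
t = 2_000.0
samples = [reward_rate_up_to(t, rng) for _ in range(500)]

# Averaging over independent replications estimates E[R(t)]/t at the fixed t,
# whereas a single long run illustrates the with-probability-1 statement.
print("estimate of E[R(t)]/t :", float(np.mean(samples)))
print("E(R_1)/E(C_1)         :", 7.0 / 3.0)
```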