Page 112 - A First Course In Stochastic Models
104 DISCRETE-TIME MARKOV CHAINS
Then it remains true that the long-run average reward per time unit is
$$\sum_{j \in I} f(j)\,\pi_j$$
with probability 1. This can be directly seen from the proof of Theorem 3.3.3 that
is given in Section 3.5.2. This proof uses the idea that the long-run average reward
per time unit equals
$$\frac{E(\text{reward earned in one cycle})}{E(\text{length of one cycle})}$$
with probability 1, where a cycle is defined as the time elapsed between two
successive visits to a given recurrent state. The expression for E(reward earned
during one cycle) is the same whether $f(j)$ represents an immediate reward or
an expected reward.
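The claim above can be checked numerically. The following is a minimal sketch, not taken from the text: the two-state transition matrix `P`, the reward means `f`, the uniform noise added to each reward, and the function name `simulate_average_reward` are all illustrative assumptions. The reward earned in state j is random with mean f(j), so f(j) plays the role of an expected reward. The sketch compares the simulated time average and the cycle ratio (cycles between successive visits to state 0) with $\sum_j f(j)\pi_j$.

```python
import random

def simulate_average_reward(P, f, n_steps, seed=0, noise=1.0):
    """Simulate a finite Markov chain started in state 0.

    Returns (time-average reward, total cycle reward / total cycle length),
    where a cycle is the time between successive visits to state 0.
    """
    rng = random.Random(seed)
    state = 0
    total = 0.0
    cycle_reward_sum = 0.0    # reward earned in completed cycles
    cycle_length_sum = 0      # total length of completed cycles
    current_reward = 0.0
    current_length = 0
    for _ in range(n_steps):
        # Random reward whose mean is f[state] (assumed noise model).
        r = f[state] + rng.uniform(-noise, noise)
        total += r
        current_reward += r
        current_length += 1
        # Draw the next state from row P[state].
        u = rng.random()
        cum = 0.0
        next_state = len(P) - 1
        for j, p in enumerate(P[state]):
            cum += p
            if u < cum:
                next_state = j
                break
        state = next_state
        if state == 0:
            # A return to state 0 closes the current cycle.
            cycle_reward_sum += current_reward
            cycle_length_sum += current_length
            current_reward = 0.0
            current_length = 0
    return total / n_steps, cycle_reward_sum / cycle_length_sum

# Hypothetical two-state example.
P = [[0.9, 0.1],
     [0.5, 0.5]]
f = [1.0, 4.0]

# For a two-state chain, pi_0 = p_10 / (p_01 + p_10).
pi0 = P[1][0] / (P[0][1] + P[1][0])
expected = f[0] * pi0 + f[1] * (1.0 - pi0)

time_avg, cycle_ratio = simulate_average_reward(P, f, 100_000, seed=1)
print(expected, time_avg, cycle_ratio)
```

Under these assumptions all three printed values should be close to each other, which illustrates that replacing an immediate reward by a random reward with the same mean does not change the long-run average.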
Example 3.2.1 (continued) A stock-control problem
Suppose that the stock-control problem involves the following costs. A fixed
ordering cost of K > 0 is incurred each time the stock is ordered up to level S. In
each week a holding cost of h > 0 is charged against each unit that is still in stock
at the end of the week. A penalty cost of b > 0 is incurred for each demand that
is lost. Denoting by c(j) the expected costs incurred in the coming week when the
current stock on hand is j just prior to review, it follows that
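$$c(j) = K + h \sum_{k=0}^{S-1} (S-k)\, e^{-\lambda} \frac{\lambda^k}{k!} + b \sum_{k=S+1}^{\infty} (k-S)\, e^{-\lambda} \frac{\lambda^k}{k!}, \qquad 0 \le j < s,$$

$$c(j) = h \sum_{k=0}^{j-1} (j-k)\, e^{-\lambda} \frac{\lambda^k}{k!} + b \sum_{k=j+1}^{\infty} (k-j)\, e^{-\lambda} \frac{\lambda^k}{k!}, \qquad s \le j \le S.$$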
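As a rough illustration, the one-week cost c(j) can be evaluated directly by truncating the infinite penalty sum. This is a sketch under assumptions: the function names, the truncation point `k_max`, and the parameter values in the example call are hypothetical and do not come from the text. The Poisson probabilities are built recursively to avoid large factorials.

```python
import math

def poisson_pmf_list(lam, k_max):
    """Return [P(D = 0), ..., P(D = k_max)] for D ~ Poisson(lam)."""
    probs = [math.exp(-lam)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * lam / k)
    return probs

def expected_weekly_cost(j, s, S, lam, K, h, b, k_max=200):
    """Expected cost in the coming week when the stock just prior to review is j.

    The infinite penalty sum is truncated at k_max (an assumption of this
    sketch); k_max must exceed S and be large relative to lam.
    """
    level = S if j < s else j           # stock on hand after a possible order
    fixed = K if j < s else 0.0         # ordering cost only when an order is placed
    p = poisson_pmf_list(lam, k_max)
    holding = h * sum((level - k) * p[k] for k in range(level))
    penalty = b * sum((k - level) * p[k] for k in range(level + 1, k_max + 1))
    return fixed + holding + penalty

# Hypothetical parameter values, for illustration only.
costs = [expected_weekly_cost(j, s=2, S=5, lam=3.0, K=10.0, h=1.0, b=5.0)
         for j in range(6)]
print(costs)
```

Since an order brings the stock up to S whenever j < s, the cost c(j) takes the same value for all such j, which the printed list should reflect.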
The long-run average cost per week equals $\sum_{j=0}^{S} c(j)\pi_j$ with probability 1. In
evaluating this expression, it is convenient to replace
$\sum_{k=j+1}^{\infty} (k-j)\, e^{-\lambda} \lambda^k/k!$ by
$\lambda - j + \sum_{k=0}^{j} (j-k)\, e^{-\lambda} \lambda^k/k!$ in the expression for c(j), since
$\sum_{k=0}^{\infty} (k-j)\, e^{-\lambda} \lambda^k/k! = \lambda - j$. Note that by taking
b = 1 and K = h = 0, the long-run average cost per week reduces to the long-run
average demand lost per week. Dividing this average by the average weekly
demand λ we get the long-run fraction of demand that is lost.
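The computation described above can be sketched as follows. Several things here are assumptions rather than statements from this page: the transition rule (order up to S when j < s, Poisson demand, excess demand lost, so the next state is max(level - D, 0)) is taken to be the model of Example 3.2.1; the stationary distribution is approximated by plain power iteration; and the function names and parameter values are hypothetical. The infinite penalty sum is replaced by its finite equivalent, so no truncation is needed.

```python
import math

def poisson_pmf_list(lam, k_max):
    """Return [P(D = 0), ..., P(D = k_max)] for D ~ Poisson(lam)."""
    probs = [math.exp(-lam)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * lam / k)
    return probs

def expected_weekly_cost(j, s, S, lam, K, h, b):
    """c(j), with the infinite lost-demand sum replaced by a finite one."""
    level = S if j < s else j
    p = poisson_pmf_list(lam, level)
    # Expected stock left at the end of the week (the k = level term is zero).
    leftover = sum((level - k) * p[k] for k in range(level + 1))
    # Expected lost demand: lam - level + expected leftover stock.
    lost = lam - level + leftover
    return (K if j < s else 0.0) + h * leftover + b * lost

def transition_matrix(s, S, lam):
    """Assumed one-step transition probabilities of the stock just prior to review."""
    P = []
    for j in range(S + 1):
        level = S if j < s else j
        p = poisson_pmf_list(lam, level)
        row = [0.0] * (S + 1)
        for i in range(1, level + 1):
            row[i] = p[level - i]        # demand equals level - i
        row[0] = 1.0 - sum(row[1:])      # demand is at least the stock level
        P.append(row)
    return P

def stationary_distribution(P, n_iter=5000):
    """Approximate pi by power iteration (assumes an aperiodic chain with one recurrent class)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def long_run_average_cost(s, S, lam, K, h, b):
    pi = stationary_distribution(transition_matrix(s, S, lam))
    return sum(expected_weekly_cost(j, s, S, lam, K, h, b) * pi[j]
               for j in range(S + 1))

# Hypothetical parameter values, for illustration only.
s, S, lam = 2, 5, 3.0
avg_lost = long_run_average_cost(s, S, lam, K=0.0, h=0.0, b=1.0)
fraction_lost = avg_lost / lam
print(avg_lost, fraction_lost)
```

Setting b = 1 and K = h = 0 turns the average cost into the average demand lost per week, and dividing by λ gives the fraction of demand lost, as in the text. Power iteration is used only to keep the sketch dependency-free; solving the equilibrium equations directly would serve equally well.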
Example 3.3.1 An insurance problem
A transport firm has effected an insurance contract for a fleet of vehicles. The
premium payment is due at the beginning of each year. There are four possible
premium classes with a premium payment of $P_i$ in class i, where $P_{i+1} < P_i$ for
i = 1, 2, 3. The size of the premium depends on the previous premium and the
claim history during the past year. If no damage is claimed in the past year and
the previous premium is $P_i$, the next premium payment is $P_{i+1}$ (with $P_5 = P_4$,
by convention), otherwise the highest premium $P_1$ is due. Since the insurance