Page 112 - A First Course In Stochastic Models

104                   DISCRETE-TIME MARKOV CHAINS

                Then it remains true that the long-run average reward per time unit is
                $\sum_{j \in I} f(j)\pi_j$
                with probability 1. This can be directly seen from the proof of Theorem 3.3.3 that
                is given in Section 3.5.2. This proof uses the idea that the long-run average reward
                per time unit equals

                \[
                \frac{E(\text{reward earned in one cycle})}{E(\text{length of one cycle})}
                \]

                with probability 1, where a cycle is defined as the time elapsed between two
                successive visits to a given recurrent state. The expression for E(reward earned
                during one cycle) is not affected by whether f(j) represents an immediate reward or
                an expected reward.
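
The equality between the stationary average and the cycle ratio can be checked numerically on a small chain. The following is a minimal sketch, not taken from the book: the two-state transition matrix `P`, the reward vector `f`, and the helper names are hypothetical, and the iterative solvers assume a finite aperiodic chain with a single recurrent class.

```python
def stationary_distribution(P, iterations=10_000):
    """Approximate pi with pi = pi P by power iteration.

    Assumes an aperiodic chain with one recurrent class, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def cycle_ratio(P, f, r=0, iterations=10_000):
    """E(reward earned in one cycle) / E(length of one cycle).

    A cycle is the time between two successive visits to the recurrent state r.
    """
    n = len(P)
    others = [j for j in range(n) if j != r]
    # T[j], R[j]: expected time and expected reward until the next visit to r,
    # starting from state j != r (fixed-point iteration on the first-step equations).
    T = {j: 0.0 for j in others}
    R = {j: 0.0 for j in others}
    for _ in range(iterations):
        T = {j: 1.0 + sum(P[j][k] * T[k] for k in others) for j in others}
        R = {j: f[j] + sum(P[j][k] * R[k] for k in others) for j in others}
    length = 1.0 + sum(P[r][k] * T[k] for k in others)
    reward = f[r] + sum(P[r][k] * R[k] for k in others)
    return reward / length


# Hypothetical two-state chain and rewards (illustrative values only).
P = [[0.9, 0.1],
     [0.5, 0.5]]
f = [1.0, 4.0]

pi = stationary_distribution(P)
average = sum(f[j] * pi[j] for j in range(len(P)))  # sum_j f(j) pi_j
ratio = cycle_ratio(P, f, r=0)
```

For this chain the stationary distribution is (5/6, 1/6), so both `average` and `ratio` come out at 1.5.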


                Example 3.2.1 (continued) A stock-control problem
                Suppose that the following costs are incurred in the stock-control problem. A fixed
                ordering cost of K > 0 is incurred each time the stock is ordered up to level S. In
                each week a holding cost of h > 0 is charged against each unit that is still in stock
                at the end of the week. A penalty cost of b > 0 is incurred for each demand that
                is lost. Denoting by c(j) the expected costs incurred in the coming week when the
                current stock on hand is j just prior to review, it follows that
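                \[
                c(j) = K + h \sum_{k=0}^{S-1} (S-k)\, e^{-\lambda} \frac{\lambda^k}{k!}
                         + b \sum_{k=S+1}^{\infty} (k-S)\, e^{-\lambda} \frac{\lambda^k}{k!},
                         \qquad 0 \le j < s,
                \]
                \[
                c(j) = h \sum_{k=0}^{j-1} (j-k)\, e^{-\lambda} \frac{\lambda^k}{k!}
                         + b \sum_{k=j+1}^{\infty} (k-j)\, e^{-\lambda} \frac{\lambda^k}{k!},
                         \qquad s \le j \le S.
                \]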
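
The two cases of c(j) differ only in the stock level after review (S when j < s, otherwise j) and in whether the ordering cost K is charged. The sketch below evaluates c(j) directly from the sums above. The function names and the truncation parameter `n_terms` are assumptions made for illustration: the infinite penalty sum is cut off after `n_terms` terms, which is harmless when the Poisson tail beyond that point is negligible.

```python
import math


def poisson_pmf_list(lam, n):
    """Return [P(D = k) for k = 0, ..., n-1] for Poisson demand with mean lam."""
    p = [math.exp(-lam)]
    for k in range(1, n):
        p.append(p[-1] * lam / k)  # recursion avoids large factorials
    return p


def expected_weekly_cost(j, s, S, lam, K, h, b, n_terms=200):
    """c(j): expected cost in the coming week with stock j just prior to review.

    The infinite penalty sum is truncated after n_terms terms (an assumption).
    """
    # Stock on hand after review: ordered up to S if j < s, otherwise unchanged.
    level = S if j < s else j
    p = poisson_pmf_list(lam, level + n_terms)
    ordering = K if j < s else 0.0
    holding = h * sum((level - k) * p[k] for k in range(level))
    penalty = b * sum((k - level) * p[k] for k in range(level + 1, len(p)))
    return ordering + holding + penalty


# Hypothetical parameter values (not from the text).
s, S, lam, K, h, b = 2, 5, 3.0, 10.0, 1.0, 5.0
costs = [expected_weekly_cost(j, s, S, lam, K, h, b) for j in range(S + 1)]
```

Since an order brings the stock up to S, the entries of `costs` for j < s all coincide and exceed the entry for j = S by exactly K.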
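                The long-run average cost per week equals $\sum_{j=0}^{S} c(j)\pi_j$ with probability 1. In
                evaluating this expression, it is convenient to replace
                $\sum_{k=j+1}^{\infty} (k - j)\, e^{-\lambda} \lambda^k / k!$ by
                $\lambda - j + \sum_{k=0}^{j} (j - k)\, e^{-\lambda} \lambda^k / k!$ in the expression for c(j)
                (with S in place of j when 0 ≤ j < s), so that only finite sums remain. Note that by taking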
                b = 1 and K = h = 0, the long-run average cost per week reduces to the long-
                run average demand lost per week. Dividing this average by the average weekly
                demand λ we get the long-run fraction of demand that is lost.
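
The finite-sum replacement and the lost-demand fraction can be sketched as follows. This is an illustration under stated assumptions, not the book's code: the one-step transition probabilities are not given in this section, so `transition_matrix` assumes the usual lost-sales dynamics (stock after review is S if j < s and j otherwise, the weekly demand is Poisson with mean λ, and excess demand is lost). All function names and parameter values are hypothetical.

```python
import math


def poisson_pmf_list(lam, n):
    """Return [P(D = k) for k = 0, ..., n-1] for Poisson demand with mean lam."""
    p = [math.exp(-lam)]
    for k in range(1, n):
        p.append(p[-1] * lam / k)
    return p


def expected_lost_demand(level, lam):
    """E[(D - level)^+] through the finite sum
    lam - level + sum_{k=0}^{level} (level - k) P(D = k)."""
    p = poisson_pmf_list(lam, level + 1)
    return lam - level + sum((level - k) * p[k] for k in range(level + 1))


def transition_matrix(s, S, lam):
    """One-step probabilities for the stock just prior to review.

    Assumed dynamics (not stated in this section): order up to S when the
    stock is below s, then demand is served from stock and any excess is lost.
    """
    p = poisson_pmf_list(lam, S + 1)
    P = []
    for i in range(S + 1):
        level = S if i < s else i
        row = [0.0] * (S + 1)
        for j in range(1, level + 1):
            row[j] = p[level - j]      # demand equals level - j
        row[0] = 1.0 - sum(row[1:])    # demand is at least level
        P.append(row)
    return P


def stationary_distribution(P, iterations=5000):
    """Power iteration for pi = pi P (the chain here is aperiodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def fraction_of_demand_lost(s, S, lam):
    pi = stationary_distribution(transition_matrix(s, S, lam))
    # With b = 1 and K = h = 0, c(j) is the expected demand lost in the coming week.
    lost_per_week = sum(
        pi[j] * expected_lost_demand(S if j < s else j, lam) for j in range(S + 1)
    )
    return lost_per_week / lam


# Hypothetical parameter values (not from the text).
fraction = fraction_of_demand_lost(s=2, S=5, lam=3.0)
```

Using the finite sum avoids truncating the Poisson tail, which is the practical point of the replacement suggested in the text.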

                Example 3.3.1 An insurance problem

                A transport firm has effected an insurance contract for a fleet of vehicles. The
                premium payment is due at the beginning of each year. There are four possible
                premium classes with a premium payment of $P_i$ in class i, where $P_{i+1} < P_i$ for
                i = 1, 2, 3. The size of the premium depends on the previous premium and the
                claim history during the past year. If no damage is claimed in the past year and
                the previous premium is $P_i$, the next premium payment is $P_{i+1}$ (with $P_5 = P_4$,
                by convention), otherwise the highest premium $P_1$ is due. Since the insurance