A First Course in Stochastic Models

                SEMI-MARKOV DECISION PROCESSES

                7.14 Consider Exercise 7.13 again, but assume now that the residency times have a Coxian-2
                distribution. Develop a value-iteration algorithm to compute the total average rejection rate
                under a fixed reservation policy. A reservation policy is characterized by two integers r_1 and
                r_2 with r_1 ≥ b_1 and r_2 ≥ b_2. Under the reservation policy an arriving customer of type i is
                accepted only if r_i or more service units are available. Verify experimentally that the total
                average rejection rate for a fixed reservation policy is nearly insensitive to the second and
                higher moments of the residency times. For the case of exponentially distributed residency
                times, take the average rejection rate of the best reservation policy and verify how close it
                is to the theoretically minimal average rejection rate.
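As a starting point, the exponential special case mentioned at the end of the exercise can be handled by relative value iteration on the uniformized chain; the following sketch brackets the average rejection rate of a fixed reservation policy. All numerical parameters (C, the arrival and service rates, the unit requirements b_i and the levels r_i) are assumptions for illustration only; the Coxian-2 case would additionally track the service phase of each customer.

```python
# Sketch for the exponential special case (all numeric parameters assumed).
# Two customer types share C service units; a type-i customer arrives at
# rate lam[i], occupies b[i] units for an exponential time with rate mu[i],
# and is accepted under reservation policy (r_1, r_2) only if at least r[i]
# units are free.  Relative value iteration on the uniformized chain gives
# lower/upper bounds on the long-run average rejection rate.
C = 10
lam = (1.0, 0.8)   # arrival rates (assumed)
mu = (1.0, 1.0)    # service-completion rates per customer (assumed)
b = (1, 2)         # service units required per customer
r = (1, 3)         # reservation levels, r[i] >= b[i]

states = [(n1, n2) for n1 in range(C // b[0] + 1)
                   for n2 in range(C // b[1] + 1)
                   if n1 * b[0] + n2 * b[1] <= C]
tau = sum(lam) + sum((C // b[i]) * mu[i] for i in range(2))  # uniformization rate

def sweep(v):
    """One value-iteration step on the uniformized chain."""
    w = {}
    for n1, n2 in states:
        free = C - n1 * b[0] - n2 * b[1]
        cost = 0.0            # rejection rate accrued while in this state
        val, out = 0.0, 0.0
        for i, n_i in enumerate((n1, n2)):
            if free >= r[i]:                       # arrival accepted
                nxt = (n1 + 1, n2) if i == 0 else (n1, n2 + 1)
                val += lam[i] * v[nxt]
                out += lam[i]
            else:                                  # arrival rejected
                cost += lam[i]
            if n_i > 0:                            # departure of a type-i customer
                nxt = (n1 - 1, n2) if i == 0 else (n1, n2 - 1)
                val += n_i * mu[i] * v[nxt]
                out += n_i * mu[i]
        w[(n1, n2)] = (cost + val + (tau - out) * v[(n1, n2)]) / tau
    return w

v = {s: 0.0 for s in states}
for _ in range(5000):
    w = sweep(v)
    diffs = [w[s] - v[s] for s in states]
    lo, hi = tau * min(diffs), tau * max(diffs)    # bounds on the rejection rate
    v = {s: w[s] - w[states[0]] for s in states}   # renormalize to keep values bounded
    if hi - lo < 1e-9:
        break
print(f"average rejection rate in [{lo:.6f}, {hi:.6f}]")
```

The span tau·(max diff − min diff) shrinks to zero for this aperiodic unichain, so the two bounds pinch the average rejection rate of the fixed policy.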
                7.15 Consider a production/inventory system with N inventory points that share a common
                production unit. At the beginning of each period, the production unit can produce for any
                number of inventory points, with the stipulation that the total production size is restricted by
                the capacity C of the production unit. The production time is negligible for any production
                scheme. The demands at the various inventory points are independent of each other. In
                each period the demand at inventory point j is Poisson distributed with mean µ_j for j =
                1, . . . , N. Excess demand is lost at any inventory point. The following costs are involved.
                The cost of producing z_j units for inventory point j equals K_j + c_j z_j for z_j > 0, regardless
                of how much is produced for each of the other inventory points. In any period there is a
                holding cost of h_j > 0 for each unit in stock at inventory point j at the end of the period.
                A stockout cost of p_j is incurred for each unit of lost demand at inventory point j. Can you
                think of a heuristic approach based on solving N one-dimensional problems and performing
                a single policy-improvement step? This problem is based on Wijngaard (1979).
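A minimal sketch of the heuristic the exercise hints at: first solve each inventory point's one-dimensional average-cost problem by relative value iteration, ignoring the shared capacity; then, in a given joint state, perform a single policy-improvement step that picks the capacity-feasible production vector minimizing the sum of one-step costs plus the decomposed relative values. All numerical values (N, C, the demand means and cost parameters, the truncation level XMAX) are assumptions for illustration.

```python
import math
from itertools import product

# Decomposition heuristic for the shared-capacity production/inventory
# problem (all numeric values assumed).  Step 1: solve each point's
# one-dimensional problem alone.  Step 2: a single policy-improvement
# step on the joint problem using the sum of the relative values.
N = 2
C = 6                          # shared production capacity per period
mu = (1.5, 2.0)                # Poisson demand means
K, c, h, p = (4.0, 5.0), (1.0, 1.0), (1.0, 1.0), (8.0, 10.0)
XMAX = 15                      # inventory level truncated at XMAX units

def demand_pmf(m):
    pmf = [math.exp(-m) * m**k / math.factorial(k) for k in range(XMAX + 1)]
    pmf[-1] += 1.0 - sum(pmf)          # lump the Poisson tail into the last point
    return pmf

PMF = [demand_pmf(mu[j]) for j in range(N)]

def expected_cost(j, y, v):
    """One-period holding/stockout cost plus continuation value when point j
    starts the period with y units on hand (after production)."""
    tot = 0.0
    for d, pd in enumerate(PMF[j]):
        left = max(y - d, 0)           # lost-sales dynamics
        tot += pd * (h[j] * left + p[j] * max(d - y, 0) + v[left])
    return tot

def one_dim(j, iters=400):
    """Relative value iteration for inventory point j alone (capacity ignored)."""
    v = [0.0] * (XMAX + 1)
    for _ in range(iters):
        w = [min(K[j] * (z > 0) + c[j] * z + expected_cost(j, x + z, v)
                 for z in range(XMAX + 1 - x))
             for x in range(XMAX + 1)]
        v = [wi - w[0] for wi in w]    # relative values, normalized at state 0
    return v

V = [one_dim(j) for j in range(N)]

def improved_action(state):
    """Single policy-improvement step on the joint problem."""
    best, best_z = float("inf"), None
    for zs in product(*(range(min(C, XMAX - state[j]) + 1) for j in range(N))):
        if sum(zs) > C:                # respect the shared capacity
            continue
        tot = sum(K[j] * (z > 0) + c[j] * z
                  + expected_cost(j, state[j] + z, V[j])
                  for j, z in enumerate(zs))
        if tot < best:
            best, best_z = tot, zs
    return best_z

print(improved_action((0, 0)))
```

The coupling through C appears only in the improvement step, so the expensive dynamic programming is done on N small state spaces rather than one N-dimensional one.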


                                    BIBLIOGRAPHIC NOTES

                Semi-Markov decision processes were introduced in De Cani (1964), Howard
                (1964), Jewell (1963) and Schweitzer (1965). The semi-Markov decision model
                has many applications, especially in queueing control. The data-transformation
                method converting a semi-Markov decision model into an equivalent discrete-time
                Markov decision model was introduced in Schweitzer (1971). This uniformization
                method was used in the paper of Lippman (1975) to establish the structure of
                optimal control rules in queueing applications of continuous-time Markov decision
                processes with exponentially distributed times between the decision epochs. The
                idea of using fictitious decision epochs is also contained in this paper. The embedding
                idea used in Section 7.4 is adapted from De Leve et al. (1977); see also Tijms
                (1980). Embedding is especially useful for developing a tailor-made policy-iteration
                algorithm that operates on a subclass of structured policies. The heuristic approach
                of attacking a multidimensional Markov decision problem through decomposition
                and a single policy-improvement step goes back to Norman (1970) and has been successfully
                applied in Krishnan and Ott (1986, 1987) and Wijngaard (1979), among others.
                The heuristic solution for the dynamic routing problem from Example 7.5.1 comes
                from Krishnan and Ott (1987) and has been extended in Sassen et al. (1997) to
                the case of general service times. Other heuristic approaches to handle large-scale
                Markov decision processes are discussed in Cooper et al. (2003) and Schweitzer
                and Seidman (1985).