CHAPTER 6
Discrete-Time Markov
Decision Processes
6.0 INTRODUCTION
In the previous chapters we saw that in the analysis of many operational systems
the concepts of a state of a system and a state transition are of basic impor-
tance. For dynamic systems with a given probabilistic law of motion, the simple
Markov model is often appropriate. However, in many situations with uncertainty
and dynamism, the state transitions can be controlled by taking a sequence of
actions. The Markov decision model is a versatile and powerful tool for analysing
probabilistic sequential decision processes with an infinite planning horizon. This
model is an outgrowth of the Markov model and dynamic programming. The latter
concept, developed by Bellman in the early 1950s, is a computational
approach for analysing sequential decision processes with a finite planning horizon.
The basic ideas of dynamic programming are states, the principle of optimality and
functional equations.
In fact dynamic programming is a recursion procedure for calculating optimal
value functions from a functional equation. This functional equation reflects the
principle of optimality, stating that an optimal policy has the property that what-
ever the initial state and the initial decision, the remaining decisions must constitute
an optimal policy with regard to the state resulting from the first transition. This
principle is always valid when the number of states and the number of actions
are finite. At much the same time as Bellman (1957) popularized dynamic pro-
gramming, Howard (1960) used basic principles from Markov chain theory and
dynamic programming to develop a policy-iteration algorithm for solving proba-
bilistic sequential decision processes with an infinite planning horizon. In the two
decades following the pioneering work of Bellman and Howard, the theory of
Markov decision processes expanded at a fast rate and a powerful technology
developed. However, in that period relatively little effort was put into applying
the quite useful Markov decision model to practical problems.
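To make the recursion procedure concrete, the functional equation described above can be sketched in code. The following is a minimal illustration of backward recursion for a finite-horizon Markov decision problem; the states, actions, rewards and transition probabilities are hypothetical, chosen only to exhibit the principle of optimality, in which the value of a state at epoch n is obtained by choosing the action that maximizes the immediate reward plus the expected optimal value from the next epoch onwards.

```python
# Hypothetical two-state, two-action finite-horizon MDP, used only to
# illustrate the backward recursion on the optimality (functional) equation:
#   V_n(s) = max_a { r(s,a) + sum_{s'} p(s'|s,a) * V_{n+1}(s') },
# with terminal condition V_N(s) = 0.

states = [0, 1]
actions = [0, 1]
horizon = 3  # number of decision epochs N

# r[s][a]: one-step reward for taking action a in state s (illustrative values)
r = {0: {0: 1.0, 1: 0.0},
     1: {0: 0.0, 1: 2.0}}

# p[s][a][t]: probability of moving from state s to state t under action a
p = {0: {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}},
     1: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}}

def backward_recursion(states, actions, r, p, horizon):
    """Compute the optimal value function V_0 and an optimal policy per
    decision epoch by solving the functional equation backwards in time."""
    V = {s: 0.0 for s in states}           # terminal values V_N(s) = 0
    policy = []                            # policy[n][s] = optimal action
    for n in range(horizon - 1, -1, -1):   # epochs N-1, N-2, ..., 0
        V_new, pi_n = {}, {}
        for s in states:
            best_a, best_val = None, float("-inf")
            for a in actions:
                # reward now plus expected optimal value from the next epoch
                val = r[s][a] + sum(p[s][a][t] * V[t] for t in states)
                if val > best_val:
                    best_a, best_val = a, val
            V_new[s], pi_n[s] = best_val, best_a
        V = V_new
        policy.insert(0, pi_n)
    return V, policy

V0, policy = backward_recursion(states, actions, r, p, horizon)
# V0[s] is the maximal total expected reward over the horizon starting
# from state s; policy[n][s] is the maximizing action at epoch n.
```

Note how the recursion embodies the principle of optimality: once V at epoch n+1 is known, the decision at epoch n is found by a one-step maximization, regardless of how state s was reached.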
A First Course in Stochastic Models, H.C. Tijms
© 2003 John Wiley & Sons, Ltd. ISBNs: 0-471-49880-7 (HB); 0-471-49881-5 (PB)

