
CHAPTER 6

Discrete-Time Markov Decision Processes








                                      6.0  INTRODUCTION

In the previous chapters we saw that in the analysis of many operational systems the concepts of a state of a system and a state transition are of basic importance. For dynamic systems with a given probabilistic law of motion, the simple Markov model is often appropriate. However, in many situations with uncertainty and dynamism, the state transitions can be controlled by taking a sequence of actions. The Markov decision model is a versatile and powerful tool for analysing probabilistic sequential decision processes with an infinite planning horizon. This model is an outgrowth of the Markov model and dynamic programming. The latter concept, developed by Bellman in the early 1950s, is a computational approach for analysing sequential decision processes with a finite planning horizon. The basic ideas of dynamic programming are states, the principle of optimality and functional equations.
In fact, dynamic programming is a recursive procedure for calculating optimal value functions from a functional equation. This functional equation reflects the principle of optimality, which states that an optimal policy has the property that, whatever the initial state and the initial decision, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first transition. This principle is always valid when the number of states and the number of actions are finite; a sketch of the resulting recursion is given below.
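To make the recursion concrete, here is a minimal sketch of such a functional equation for a finite-horizon cost-minimization problem. The notation is assumed for illustration only, anticipating the formal model developed later in this chapter: V_n(i) denotes the minimal expected cost when n decision epochs remain and the current state is i, A(i) is the set of actions available in state i, c_i(a) is the one-step cost and p_{ij}(a) the transition probabilities under action a:

    V_n(i) = \min_{a \in A(i)} \Big\{ c_i(a) + \sum_{j} p_{ij}(a)\, V_{n-1}(j) \Big\}, \qquad n = 1, 2, \ldots,

with V_0(i) a given terminal cost. Computing V_1, V_2, ... backwards from V_0 is precisely the recursive procedure referred to above, and the equation itself expresses the principle of optimality: the term V_{n-1}(j) presumes optimal behaviour from the next state j onwards.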
At much the same time as Bellman (1957) popularized dynamic programming, Howard (1960) used basic principles from Markov chain theory and dynamic programming to develop a policy-iteration algorithm for solving probabilistic sequential decision processes with an infinite planning horizon; a schematic sketch of the idea follows this paragraph. In the two decades following the pioneering work of Bellman and Howard, the theory of Markov decision processes expanded rapidly and a powerful solution technology developed. However, in that period relatively little effort was put into applying this quite useful model to practical problems.
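To convey the flavour of policy iteration (as a sketch only, not a reproduction of Howard's original formulation), the following Python fragment alternates a value-determination step and a policy-improvement step for a discounted-cost Markov decision problem. The data layout (a list P of per-action transition matrices, a cost array c) and the discount factor beta are assumptions made purely for this illustration:

    import numpy as np

    def policy_iteration(P, c, beta=0.9):
        """Illustrative policy iteration, discounted-cost criterion.

        P: list of (n x n) arrays; P[a][i, j] is the probability of
           moving from state i to state j when action a is taken.
        c: (n x m) array; c[i, a] is the one-step cost of action a
           in state i.
        beta: discount factor with 0 < beta < 1 (assumed here).
        """
        n_states, n_actions = c.shape
        policy = np.zeros(n_states, dtype=int)   # arbitrary initial policy
        while True:
            # Value determination: solve the linear equations
            #   V(i) = c_i(R(i)) + beta * sum_j p_ij(R(i)) V(j)
            # for the value function V of the current policy R.
            P_R = np.array([P[policy[i]][i] for i in range(n_states)])
            c_R = c[np.arange(n_states), policy]
            V = np.linalg.solve(np.eye(n_states) - beta * P_R, c_R)
            # Policy improvement: take the minimizing action per state.
            Q = np.array([[c[i, a] + beta * P[a][i] @ V
                           for a in range(n_actions)]
                          for i in range(n_states)])
            new_policy = Q.argmin(axis=1)
            if np.array_equal(new_policy, policy):  # no change: optimal
                return policy, V
            policy = new_policy

Each iteration produces a policy that is at least as good as the previous one, and since there are finitely many stationary policies the procedure terminates; this monotone-improvement property is the core of Howard's approach.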


