Page 298 - A First Course In Stochastic Models
P. 298
292 SEMI-MARKOV DECISION PROCESSES
no more than M service requests remain in the system. By doing so, we obtain a
semi-Markov decision formulation with the state space
I = {(i, t) | 0 ≤ i ≤ M, 0 ≤ t ≤ s},
and the action sets
{a | a = 0, . . . , s}, 0 ≤ i ≤ M − 1, 0 ≤ t ≤ s,
A(i, t) =
{s}, i = M, 0 ≤ t ≤ s.
Here action a in state (i, t) means that the number of channels turned on is adjusted
from t to a. This semi-Markov decision formulation involves the following stip-
ulation: if action a = s is taken in state (M, t), then the next decision epoch is
defined as the first service completion epoch at which either M or M − 1 service
requests are left behind. Also, if action a = s is taken in state (M, t), the ‘one-
step’ costs incurred until the next decision epoch are defined as the sum of the
switching cost K(t, s) and the holding and operating costs made during the time
until the next decision epoch. Denote by the random variable T M (s) the time until
the next decision epoch when action a = s is taken in state (M, t). The random
variable T M (s) is the sum of two components. The first component is the time until
the next service completion or the next arrival, whichever occurs first. The second
component is zero if a service completion occurs first; otherwise, it is distributed
as the time needed to reduce the number of service requests present from M +1 to
M. The semi-Markov decision formulation with an embedded state space makes
sense only when it is feasible to calculate the one-step expected transition times
τ (M,t) (s) and the one-step expected costs c (M,t) (s).
The calculation of these quantities is easy, since service completions occur
according to a Poisson process with rate sµ as long as all of the s channels are
occupied. In other words, whenever M or more requests are in the system, we can
equivalently imagine that a single ‘superchannel’ is servicing requests one at a time
at an exponential rate of sµ. This analogy enables us to invoke the formulas (2.6.2)
and (2.6.3). Taking n = 1 and replacing the mean µ by 1/(sµ) in these formulas,
we find that the expected time needed to reduce the number of requests present
from M + 1 to M, given that all channels are on, is
1/(sµ) 1
=
1 − λ/(sµ) sµ − λ
and the expected holding and operating costs incurred during the time needed to
reduce the number of requests present from M + 1 to M, given that all channels
are on, is
hM hsµ 1 λ rs h(M+1) + rs hλ
+ + + = + .
sµ − λ sµ − λ sµ sµ(sµ − λ) sµ − λ sµ − λ (sµ − λ) 2
Here the term hM/(sµ − λ) represents the expected holding costs for the M
service requests which are continuously present during the time needed to reduce