Page 298 - A First Course In Stochastic Models
P. 298

292                 SEMI-MARKOV DECISION PROCESSES

                no more than M service requests remain in the system. By doing so, we obtain a
                semi-Markov decision formulation with the state space

                                  I = {(i, t) | 0 ≤ i ≤ M, 0 ≤ t ≤ s},

                and the action sets

                                 {a | a = 0, . . . , s},  0 ≤ i ≤ M − 1, 0 ≤ t ≤ s,
                        A(i, t) =
                                 {s},               i = M, 0 ≤ t ≤ s.
                Here action a in state (i, t) means that the number of channels turned on is adjusted
                from t to a. This semi-Markov decision formulation involves the following stip-
                ulation: if action a = s is taken in state (M, t), then the next decision epoch is
                defined as the first service completion epoch at which either M or M − 1 service
                requests are left behind. Also, if action a = s is taken in state (M, t), the ‘one-
                step’ costs incurred until the next decision epoch are defined as the sum of the
                switching cost K(t, s) and the holding and operating costs made during the time
                until the next decision epoch. Denote by the random variable T M (s) the time until
                the next decision epoch when action a = s is taken in state (M, t). The random
                variable T M (s) is the sum of two components. The first component is the time until
                the next service completion or the next arrival, whichever occurs first. The second
                component is zero if a service completion occurs first; otherwise, it is distributed
                as the time needed to reduce the number of service requests present from M +1 to
                M. The semi-Markov decision formulation with an embedded state space makes
                sense only when it is feasible to calculate the one-step expected transition times
                τ (M,t) (s) and the one-step expected costs c (M,t) (s).
                  The calculation of these quantities is easy, since service completions occur
                according to a Poisson process with rate sµ as long as all of the s channels are
                occupied. In other words, whenever M or more requests are in the system, we can
                equivalently imagine that a single ‘superchannel’ is servicing requests one at a time
                at an exponential rate of sµ. This analogy enables us to invoke the formulas (2.6.2)
                and (2.6.3). Taking n = 1 and replacing the mean µ by 1/(sµ) in these formulas,
                we find that the expected time needed to reduce the number of requests present
                from M + 1 to M, given that all channels are on, is

                                          1/(sµ)       1
                                                  =
                                        1 − λ/(sµ)   sµ − λ
                and the expected holding and operating costs incurred during the time needed to
                reduce the number of requests present from M + 1 to M, given that all channels
                are on, is
                  hM      hsµ     1      λ         rs     h(M+1) + rs      hλ
                       +           +            +       =             +          .
                sµ − λ  sµ − λ sµ   sµ(sµ − λ)    sµ − λ     sµ − λ     (sµ − λ) 2
                Here the term hM/(sµ − λ) represents the expected holding costs for the M
                service requests which are continuously present during the time needed to reduce
   293   294   295   296   297   298   299   300   301   302   303