Page 294 - A First Course In Stochastic Models
P. 294

288                 SEMI-MARKOV DECISION PROCESSES

                make the assumption that the transmission times of the messages are exponen-
                tially distributed with mean 1/µ 1 for type 1 messages and with mean 1/µ 2 for
                type 2 messages.

                Formulation with fictitious decision epochs

                A straightforward formulation of the problem as a semi-Markov decision problem
                uses the arrival epochs as the only decision epochs. In such a formulation the
                vectors (p ij (a), j ∈ I) of one-step transition probabilities have many non-zero
                entries. In our specific problem this difficulty can be circumvented by including
                the service completion epochs as fictitious decision epochs in addition to the real
                decision epochs, being the arrival epochs of messages. By doing so, a transition
                from any state is always to one of at most four neighbouring states. In the approach
                with fictitious decision epochs, we take as state space

                        I = {(i 1 , i 2 , k) | i 1 , i 2 = 0, 1, . . . , c; i 1 + i 2 ≤ c; k = 0, 1, 2}.

                State (i 1 , i 2 , k) with k = 1 or 2 corresponds to the situation in which a type k
                message arrives and finds i 1 messages of type 1 and i 2 messages of type 2 being
                transmitted. The auxiliary state (i 1 , i 2 , 0) corresponds to the situation in which
                the transmission of a message is just completed and i 1 messages of type 1 and
                i 2 messages of type 2 are left behind in the system. Note that the type of the
                transmitted message is not relevant. For the states (i 1 , i 2 , k) with k = 1 or 2 the
                possible actions are denoted by

                                      0,   reject the arriving message,
                                a =
                                      1,   accept the arriving message,
                with the stipulation that a = 0 is the only feasible decision when i 1 + i 2 = c. The
                fictitious decision of leaving the system alone in the state s = (i 1 , i 2 , 0) is also
                denoted by a = 0. Thanks to the fictitious decision epochs, each transition from
                a given state is to one of at most four neighbouring states. In other words, most
                of the one-step transition probabilities are zero. Further, the transition probabilities
                are extremely easy to specify, because of the fact that min(X 1 , X 2 ) is exponentially
                distributed with mean 1/(α 1 + α 2 ) and P {X 1 < X 2 } = α 1 /(α 1 + α 2 ) when X 1
                and X 2 are independent random variables having exponential distributions with
                respective means 1/α 1 and 1/α 2 . Put for abbreviation

                                   ν(i 1 , i 2 ) = λ 1 + λ 2 + i 1 µ 1 + i 2 µ 2 .

                Then, for action a = 0 in state s = (i 1 , i 2 , k),
                                     
                                     λ 1 /ν(i 1 , i 2 ),  v = (i 1 , i 2 , 1),
                                     
                                       λ 2 /ν(i 1 , i 2 ),  v = (i 1 , i 2 , 2),
                                     
                             p sv (0) =
                                     i 1 µ 1 /ν(i 1 , i 2 ),  v = (i 1 − 1, i 2 , 0),
                                     
                                       i 2 µ 2 /ν(i 1 , i 2 ),  v = (i 1 , i 2 − 1, 0).
                                     
   289   290   291   292   293   294   295   296   297   298   299