Page 295 - A First Course In Stochastic Models

VALUE ITERATION AND FICTITIOUS DECISIONS          289

and τ_s(0) = 1/ν(i_1, i_2). For action a = 1 in state s = (i_1, i_2, 1),

\[
p_{sv}(1) =
\begin{cases}
\lambda_1/\nu(i_1+1, i_2), & v = (i_1+1, i_2, 1),\\
\lambda_2/\nu(i_1+1, i_2), & v = (i_1+1, i_2, 2),\\
(i_1+1)\mu_1/\nu(i_1+1, i_2), & v = (i_1, i_2, 0),\\
i_2\mu_2/\nu(i_1+1, i_2), & v = (i_1+1, i_2-1, 0),
\end{cases}
\]
and τ_s(1) = 1/ν(i_1+1, i_2). The specification for action a = 1 in state (i_1, i_2, 2) is similar. Finally, the one-step expected costs c_s(a) are simply given by

\[
c_s(a) =
\begin{cases}
1, & s = (i_1, i_2, 1) \text{ and } a = 0,\\
1, & s = (i_1, i_2, 2) \text{ and } a = 0,\\
0, & \text{otherwise}.
\end{cases}
\]
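As a concrete illustration, the model elements just specified can be encoded directly. The sketch below is not from the book: the rate function ν(i_1, i_2) = λ_1 + λ_2 + i_1µ_1 + i_2µ_2 and the parameter values are assumptions for illustration, and the transition probabilities for action a = 0 are read off from the value-iteration equations that follow.

```python
# Sketch of the model elements, under the assumption that
# nu(i1, i2) = lam1 + lam2 + i1*mu1 + i2*mu2 (total transition rate).
lam1, lam2 = 1.0, 1.5   # arrival rates of the two message types (illustrative)
mu1, mu2 = 2.0, 1.0     # per-channel service rates (illustrative)

def nu(i1, i2):
    """Total transition rate in a state with i1, i2 busy channels (assumed)."""
    return lam1 + lam2 + i1 * mu1 + i2 * mu2

def transitions(i1, i2, k, a):
    """One-step transition probabilities p_sv(a) and mean transition time
    tau_s(a) for state s = (i1, i2, k).  Only a = 0 (leave alone / reject)
    and a = 1 in a state with k = 1 are sketched, mirroring the text."""
    if a == 0:
        n = nu(i1, i2)
        probs = {(i1, i2, 1): lam1 / n,
                 (i1, i2, 2): lam2 / n,
                 (i1 - 1, i2, 0): i1 * mu1 / n,
                 (i1, i2 - 1, 0): i2 * mu2 / n}
    else:  # a = 1 in state (i1, i2, 1): the accepted message occupies a channel
        n = nu(i1 + 1, i2)
        probs = {(i1 + 1, i2, 1): lam1 / n,
                 (i1 + 1, i2, 2): lam2 / n,
                 (i1, i2, 0): (i1 + 1) * mu1 / n,
                 (i1 + 1, i2 - 1, 0): i2 * mu2 / n}
    return probs, 1.0 / n

def cost(i1, i2, k, a):
    """One-step expected cost c_s(a): each rejection (a = 0 in a state with
    k = 1 or k = 2) costs 1, everything else is free."""
    return 1.0 if k in (1, 2) and a == 0 else 0.0
```

Since the probabilities share the common denominator ν, each row of `transitions` sums to one by construction.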

                Value-iteration algorithm
                Now, having specified the basic elements of the semi-Markov decision model, we
                are in a position to formulate the value-iteration algorithm for the computation of
                a (nearly) optimal acceptance rule. In the data transformation, we take
\[
\tau = \frac{1}{\lambda_1 + \lambda_2 + c_1\mu_1 + c_2\mu_2}.
\]
Using the above specifications, the value-iteration scheme becomes quite simple for
the allocation problem. Note that the expressions for the one-step transition times
τ_s(a) and the one-step transition probabilities p_sv(a) have a common denominator,
and so the ratio p_sv(a)/τ_s(a) has a very simple form. In specifying the value-iteration
scheme (7.2.3), we distinguish between the auxiliary states (i_1, i_2, 0) and
the other states. In the states (i_1, i_2, 0) the only possible decision is to leave the
system alone. Thus

\[
V_n(i_1, i_2, 0) = \tau\lambda_1 V_{n-1}(i_1, i_2, 1) + \tau\lambda_2 V_{n-1}(i_1, i_2, 2) + \tau i_1\mu_1 V_{n-1}(i_1-1, i_2, 0)
+ \tau i_2\mu_2 V_{n-1}(i_1, i_2-1, 0) + \{1 - \tau\nu(i_1, i_2)\} V_{n-1}(i_1, i_2, 0),
\]

where V_{n-1}(i_1, i_2, 0) = 0 when i_1 < 0 or i_2 < 0. For the states (i_1, i_2, 1),

\[
\begin{aligned}
V_n(i_1, i_2, 1) = \min\Big[\; &\nu(i_1, i_2) + \tau\lambda_1 V_{n-1}(i_1, i_2, 1) + \tau\lambda_2 V_{n-1}(i_1, i_2, 2)\\
&+ \tau i_1\mu_1 V_{n-1}(i_1-1, i_2, 0) + \tau i_2\mu_2 V_{n-1}(i_1, i_2-1, 0)\\
&+ \{1 - \tau\nu(i_1, i_2)\} V_{n-1}(i_1, i_2, 1),\\
&\tau\lambda_1 V_{n-1}(i_1+1, i_2, 1) + \tau\lambda_2 V_{n-1}(i_1+1, i_2, 2)\\
&+ \tau(i_1+1)\mu_1 V_{n-1}(i_1, i_2, 0) + \tau i_2\mu_2 V_{n-1}(i_1+1, i_2-1, 0)\\
&+ \{1 - \tau\nu(i_1+1, i_2)\} V_{n-1}(i_1, i_2, 1)\Big],
\end{aligned}
\]
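The recursions above translate almost line by line into code. The following is a minimal sketch, not the book's implementation: it assumes group k consists of c_k channels so that i_k ranges over 0, …, c_k, that ν(i_1, i_2) = λ_1 + λ_2 + i_1µ_1 + i_2µ_2, that action a = 1 in a state (i_1, i_2, 2) is handled symmetrically with the roles of the two groups interchanged, that a message must be rejected when its group is fully occupied, and that the parameter values are illustrative.

```python
# One value-iteration sweep for the allocation problem (sketch under the
# assumptions stated above; parameter values are illustrative only).
c1, c2 = 3, 3                                     # channels per group (assumed)
lam1, lam2, mu1, mu2 = 1.0, 1.5, 2.0, 1.0         # illustrative rates
tau = 1.0 / (lam1 + lam2 + c1 * mu1 + c2 * mu2)   # data transformation

def nu(i1, i2):
    return lam1 + lam2 + i1 * mu1 + i2 * mu2

def sweep(V):
    """Compute V_n from V_{n-1}.  V maps (i1, i2, k) to a value; states
    outside the grid contribute 0, as in the boundary convention above."""
    def v(*s):
        return V.get(s, 0.0)
    Vn = {}
    for i1 in range(c1 + 1):
        for i2 in range(c2 + 1):
            def stay(k):
                # value of leaving the system alone in state (i1, i2, k)
                return (tau * lam1 * v(i1, i2, 1) + tau * lam2 * v(i1, i2, 2)
                        + tau * i1 * mu1 * v(i1 - 1, i2, 0)
                        + tau * i2 * mu2 * v(i1, i2 - 1, 0)
                        + (1 - tau * nu(i1, i2)) * v(i1, i2, k))
            # auxiliary states (i1, i2, 0): no decision to make
            Vn[(i1, i2, 0)] = stay(0)
            # states (i1, i2, 1): reject (one-step cost nu(i1, i2)) or accept
            reject1 = nu(i1, i2) + stay(1)
            accept1 = (tau * lam1 * v(i1 + 1, i2, 1)
                       + tau * lam2 * v(i1 + 1, i2, 2)
                       + tau * (i1 + 1) * mu1 * v(i1, i2, 0)
                       + tau * i2 * mu2 * v(i1 + 1, i2 - 1, 0)
                       + (1 - tau * nu(i1 + 1, i2)) * v(i1, i2, 1))
            Vn[(i1, i2, 1)] = min(reject1, accept1) if i1 < c1 else reject1
            # states (i1, i2, 2): symmetric, with the groups' roles swapped
            reject2 = nu(i1, i2) + stay(2)
            accept2 = (tau * lam1 * v(i1, i2 + 1, 1)
                       + tau * lam2 * v(i1, i2 + 1, 2)
                       + tau * i1 * mu1 * v(i1 - 1, i2 + 1, 0)
                       + tau * (i2 + 1) * mu2 * v(i1, i2, 0)
                       + (1 - tau * nu(i1, i2 + 1)) * v(i1, i2, 2))
            Vn[(i1, i2, 2)] = min(reject2, accept2) if i2 < c2 else reject2
    return Vn
```

Iterating `sweep` and inspecting the minimum and maximum of the differences V_n − V_{n−1} yields the usual lower and upper bounds on the minimal average cost, from which a (nearly) optimal acceptance rule can be read off.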