Page 184 - A First Course In Stochastic Models

TRANSIENT DISTRIBUTION OF CUMULATIVE REWARDS

                reason we prefer to present a simple-minded discretization approach for the general
                reward case. For fixed t > 0, let

                            R(t) = the cumulative reward earned up to time t.

                Assume that for each state j ∈ I the joint probability distribution function P {R(t) ≤
                x, X(t) = j} has a density with respect to the reward variable x (a sufficient
                condition is that r(j) > 0 for all j ∈ I). Then we can represent P {R(t) ≤ x} as
                \[
                P\{R(t) \le x\} = \sum_{j \in I} \int_0^x f_j(t, y)\, dy, \qquad x \ge 0,
                \]
                where f_j(t, x) is the joint probability density of the cumulative reward up to time
                t and the state of the process at time t. The idea is to discretize the reward variable
                x and the time variable t in multiples of Δ, where Δ > 0 is chosen sufficiently
                small (the probability of more than one state transition in a time period of length
                Δ should be negligibly small). The discretized reward variable x can be restricted
                to multiples of Δ when the following assumptions are made:

                (a) the reward rates r(j) are non-negative integers,

                (b) the non-negative lump rates F_jk are multiples of Δ.
                  For practical applications it is no restriction to make these assumptions. How do
                we compute P {R(t) ≤ x} for fixed t and x? It is convenient to assume a probability
                distribution
                \[
                \alpha_i = P\{X(0) = i\}, \qquad i \in I
                \]

                for the initial state of the process. In view of the probabilistic interpretation

                \[
                f_j(t, x)\,\Delta x \approx P\{x \le R(t) < x + \Delta x,\ X(t) = j\} \quad \text{for } \Delta x \text{ small},
                \]
                we approximate for fixed Δ > 0 the density f_j(u, y) by a discretized function
                f_j^Δ(τ, r). The discretized variables τ and r run through multiples of Δ. For fixed
                Δ > 0 the discretized functions f_j^Δ(τ, r) are defined by the recursion scheme


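                \[
                f_j^{\Delta}(\tau, r) = f_j^{\Delta}\bigl(\tau - \Delta,\ r - r(j)\Delta\bigr)\,(1 - \nu_j \Delta)
                  + \sum_{k \ne j} f_k^{\Delta}\bigl(\tau - \Delta,\ r - r(k)\Delta - F_{kj}\bigr)\, q_{kj}\Delta
                \]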
                for τ = 0, Δ, . . . , (t/Δ)Δ and r = 0, Δ, . . . , (x/Δ)Δ (for ease assume that x
                and t are multiples of Δ). For any j ∈ I, the boundary conditions are

                \[
                f_j^{\Delta}(0, r) =
                \begin{cases}
                \alpha_j / \Delta, & r = 0, \\
                0, & \text{otherwise},
                \end{cases}
                \]
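
                The recursion scheme and the boundary condition at τ = 0 can be sketched in
                Python as follows. This is a minimal illustration, not a tuned implementation.
                The function name `discretized_reward_cdf`, the helper `at` and the argument
                `lump_steps` are hypothetical names introduced here. The sketch assumes that the
                states are labelled 0, 1, . . . , n − 1, that `lump_steps[k][j]` holds F_kj/Δ as an
                integer, that the discretized density is zero for negative reward values, and that
                P {R(t) ≤ x} is approximated by Δ times the sum of the grid values at time t over
                r = 0, Δ, . . . , x.

```python
def discretized_reward_cdf(q, r, lump_steps, alpha, delta, n_t, n_x):
    """Approximate P{R(t) <= x} for t = n_t * delta and x = n_x * delta.

    q[k][j]          transition rate from state k to state j (k != j)
    r[j]             non-negative integer reward rate in state j
    lump_steps[k][j] lump value for a jump k -> j, in units of delta (assumed integer)
    alpha[j]         probability that the initial state is j
    """
    n = len(alpha)
    # nu[j]: total rate of leaving state j.
    nu = [sum(q[j][k] for k in range(n) if k != j) for j in range(n)]

    # f[j][m] stands for the discretized density of state j at reward m * delta.
    # Boundary condition at tau = 0: alpha_j / delta at r = 0, zero elsewhere.
    f = [[0.0] * (n_x + 1) for _ in range(n)]
    for j in range(n):
        f[j][0] = alpha[j] / delta

    def at(grid, k, m):
        # Assumption: the discretized density vanishes for negative reward.
        return grid[k][m] if m >= 0 else 0.0

    for _ in range(n_t):  # advance tau by delta in each pass
        new = [[0.0] * (n_x + 1) for _ in range(n)]
        for j in range(n):
            for m in range(n_x + 1):
                # No transition during the last interval of length delta.
                stay = at(f, j, m - r[j]) * (1.0 - nu[j] * delta)
                # One transition k -> j during the last interval.
                jump = sum(
                    at(f, k, m - r[k] - lump_steps[k][j]) * q[k][j] * delta
                    for k in range(n)
                    if k != j
                )
                new[j][m] = stay + jump
        f = new

    # Riemann-sum approximation of the integral of the density over [0, x].
    return delta * sum(f[j][m] for j in range(n) for m in range(n_x + 1))
```

                Only grid points up to x are stored: because reward rates and lump values are
                non-negative, the recursion for a reward value r never needs values at rewards
                larger than r.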