Page 73 - Innovations in Intelligent Machines
P. 73

62     P.B. Sujit et al.
                           5.1 N-person Game Model

                           The strategies that we propose for N agents/search agents is based on a game
                           theoretical model. We use q-step look ahead planning [29], where q determines
                           the depth of the exploratory search to obtain optimal strategies.
                              The objective of the agents is to select their next action or path at time t
                           in order to maximize their benefits (that is, maximize uncertainty reduction).
                           This problem can be modelled as a N-person non-zero sum game with each
                           agent as a player and the set of paths available to each agent as the set of
                           strategies.
                              Another approach to decision-making in this situation, without the bene-
                           fit of communication between agents, is to make some assumption on the
                           behavior of other agents in the search space. So, a player/agent may consider
                           the rest of the N − 1 players to be one single player. Hence, we can model the
                           N-person game as one player playing against the rest of N − 1 players taken
                           together as a single player (a coalition of N − 1 players). Here, we describe
                           both the models.
                              The payoff to each agent can be expressed in terms of search effectiveness
                                                                                        q
                                                                                        i
                           functions. Every cell has an uncertainty value associated with it. Let P (C s i ),
                           i ∈{1, 2 ... ,N} be the set of all possible paths of length q, for an agent A i ,
                                                          j         q                   q
                                                          i        i                   i
                           emanating from cell C s i  . A path P (C s i  ) ∈P (C s i  ),j =1, 2,... , |P (C s i )|,
                                                  j
                                                  i
                           is an ordered set of cells P (C s i ), defined as,
                              j         j,1  j,2  j,3   j,q  j,k     j,1       j,k+1     j,k
                              i
                             P (C s i )= C i  ,C i  ,C i  ,...,C i  | C i  ∈C,C i  = C s i ,C i  ∈N(C i  )
                                                                                           (28)
                                                                is the current position of A i ,and
                           where, C is the collection of all cells, C s i
                               j,k                                  j,k
                           N(C i  ) is the set of all neighboring cells of C i  .
                                                               k
                              Let the uncertainty value of a cell C at time t, as perceived by A i ,be
                                k                 j                                    l
                                                 i
                           U i (C ,t). Given a path P (C s i ) of agent A i , suppose A i is at cell C at time
                                                                           l
                           t then the reduction of uncertainty associated with C , and the subsequent
                           updated value of uncertainty, is evaluated as follows:
                                                    l
                           Case 1: Only A i is in cell C at time t, then
                                                                 l
                                                      v i (t)= U i (C ,t)β i               (29)
                                                                 l
                                                   l
                                               U i (C ,t +1) = U i (C ,t)(1 − β i )        (30)
                           where, v i (t) is the benefit that agent A i would obtain (that is, the amount of
                                                                                  l
                           uncertainty reduction that it will achieve) when it visits cell C .
                                                                      l
                           Case 2: When more than one agent visits cell C at time t,let A represent
                           this set of agents. Then,
                                                                        l
                                                                 ˜ ˆ
                                                          v i (t)= ββ i U i (C ,t)         (31)
                                                                     N
                                                 l             l
                                             U i (C ,t +1) = U i (C ,t) −  v i (t)         (32)
                                                                    i=1
   68   69   70   71   72   73   74   75   76   77   78