Page 73 - Innovations in Intelligent Machines
P. 73
62 P.B. Sujit et al.
5.1 N-person Game Model
The strategies that we propose for N agents/search agents is based on a game
theoretical model. We use q-step look ahead planning [29], where q determines
the depth of the exploratory search to obtain optimal strategies.
The objective of the agents is to select their next action or path at time t
in order to maximize their benefits (that is, maximize uncertainty reduction).
This problem can be modelled as a N-person non-zero sum game with each
agent as a player and the set of paths available to each agent as the set of
strategies.
Another approach to decision-making in this situation, without the bene-
fit of communication between agents, is to make some assumption on the
behavior of other agents in the search space. So, a player/agent may consider
the rest of the N − 1 players to be one single player. Hence, we can model the
N-person game as one player playing against the rest of N − 1 players taken
together as a single player (a coalition of N − 1 players). Here, we describe
both the models.
The payoff to each agent can be expressed in terms of search effectiveness
q
i
functions. Every cell has an uncertainty value associated with it. Let P (C s i ),
i ∈{1, 2 ... ,N} be the set of all possible paths of length q, for an agent A i ,
j q q
i i i
emanating from cell C s i . A path P (C s i ) ∈P (C s i ),j =1, 2,... , |P (C s i )|,
j
i
is an ordered set of cells P (C s i ), defined as,
j j,1 j,2 j,3 j,q j,k j,1 j,k+1 j,k
i
P (C s i )= C i ,C i ,C i ,...,C i | C i ∈C,C i = C s i ,C i ∈N(C i )
(28)
is the current position of A i ,and
where, C is the collection of all cells, C s i
j,k j,k
N(C i ) is the set of all neighboring cells of C i .
k
Let the uncertainty value of a cell C at time t, as perceived by A i ,be
k j l
i
U i (C ,t). Given a path P (C s i ) of agent A i , suppose A i is at cell C at time
l
t then the reduction of uncertainty associated with C , and the subsequent
updated value of uncertainty, is evaluated as follows:
l
Case 1: Only A i is in cell C at time t, then
l
v i (t)= U i (C ,t)β i (29)
l
l
U i (C ,t +1) = U i (C ,t)(1 − β i ) (30)
where, v i (t) is the benefit that agent A i would obtain (that is, the amount of
l
uncertainty reduction that it will achieve) when it visits cell C .
l
Case 2: When more than one agent visits cell C at time t,let A represent
this set of agents. Then,
l
˜ ˆ
v i (t)= ββ i U i (C ,t) (31)
N
l l
U i (C ,t +1) = U i (C ,t) − v i (t) (32)
i=1