

[Figure 30.2, left panel: average utility (y-axis, 0-800) plotted against the number of proposals exchanged (x-axis, 0-1000) for four configurations: 1 local hillclimber, 2 local hillclimbers, 1 local simulated annealer, and 2 local simulated annealers.]
[Figure 30.2, right panel: final payoff matrix (row payoff / column payoff): Annealer vs. Annealer 550/550, Annealer vs. Hillclimber 180/700, Hillclimber vs. Annealer 700/180, Hillclimber vs. Hillclimber 400/400.]

Figure 30.2.  Game Dynamics (left) and Final Payoff Matrix of the Game (right)


dependently of the cost-benefit margins. Therefore, at high virtual temperatures the simulated annealer is more explorative and “far-sighted”, because it assumes that costs incurred now will be offset by gains later. This is in contrast to the myopic nature of the hillclimber, whose exploration is constrained by the monotonicity requirement.
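The contrast between the two acceptance rules can be made concrete with a small sketch (the function names and the Metropolis-style acceptance probability are illustrative assumptions, not the chapter's own implementation): the hillclimber enforces monotonicity, while the annealer tolerates utility-decreasing proposals with a probability governed by the virtual temperature.

    import math
    import random

    def hillclimber_accepts(delta_utility: float) -> bool:
        # Monotonic rule: accept only proposals that do not lower utility.
        return delta_utility >= 0

    def annealer_accepts(delta_utility: float, temperature: float) -> bool:
        # Metropolis-style rule: utility-decreasing proposals may still be
        # accepted, with a probability that shrinks as the virtual
        # temperature falls and as the utility loss grows.
        if delta_utility >= 0:
            return True
        return random.random() < math.exp(delta_utility / temperature)

    # At a high temperature the annealer tolerates substantial losses;
    # at a low temperature it behaves almost exactly like the hillclimber.
    for t in (100.0, 1.0):
        accepted = sum(annealer_accepts(-50.0, t) for _ in range(1000))
        print(f"T={t}: accepted {accepted}/1000 proposals costing 50 utility")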
In the asymmetric interaction the cooperation of the annealer permits more exploration of the contract space, and hence arrival at higher optima of the hillclimber's utility landscape. However, this cooperation is not reciprocated by the hillclimber, which acts selfishly. Therefore, the gains of the hillclimber are achieved at the cost of the annealer. The right-hand side of Figure 30.2 represents the underlying game as a matrix of the final observed utilities for all pairings of hillclimber and annealer strategies. The results confirm that this game is an instance of the prisoner's dilemma [1], in which hillclimbing is the dominant strategy for each agent. Therefore, the unique dominant-strategy equilibrium is for both agents to hillclimb. However, this outcome is Pareto-dominated by the outcome in which both agents anneal. In other words, the unique Nash equilibrium of this game (two hillclimbers) is the only outcome not in the Pareto set.
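To confirm that the matrix has the prisoner's-dilemma structure described above, the following sketch checks that hillclimbing is a best response to every opponent strategy and that the resulting equilibrium is Pareto-dominated by mutual annealing (the payoff values are those read off the right panel of Figure 30.2; their cell placement is inferred from the discussion in the text).

    # Payoffs as (row player's utility, column player's utility).
    STRATEGIES = ("anneal", "hillclimb")
    PAYOFFS = {
        ("anneal", "anneal"): (550, 550),
        ("anneal", "hillclimb"): (180, 700),
        ("hillclimb", "anneal"): (700, 180),
        ("hillclimb", "hillclimb"): (400, 400),
    }

    def best_response(column_strategy: str) -> str:
        # Row player's best reply to a fixed column strategy.
        return max(STRATEGIES,
                   key=lambda row: PAYOFFS[(row, column_strategy)][0])

    # Hillclimbing is a best response to both opponent strategies, so
    # (hillclimb, hillclimb) is the unique Nash equilibrium ...
    assert all(best_response(col) == "hillclimb" for col in STRATEGIES)

    # ... yet it is Pareto-dominated by mutual annealing: 400/400 < 550/550.
    nash = PAYOFFS[("hillclimb", "hillclimb")]
    cooperate = PAYOFFS[("anneal", "anneal")]
    assert cooperate[0] > nash[0] and cooperate[1] > nash[1]
    print("Nash outcome", nash, "is Pareto-dominated by", cooperate)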
                              4.     Conclusions

The contracting problem was used to motivate two different heuristic and approximate agent decision models, both based on a realistic set of requirements over both K and C. However, the cost of these requirements is the sub-optimality of Q. This trade-off was demonstrated in both models by negotiation strategies selecting outcomes that are not Pareto efficient. However, imperfections are a common feature of the world, and real social systems have established personal and institutional mechanisms for dealing with them. Similarly, in future work computational models are sought that are incremental, repeated, and support feedback and error-correction. Learning and evolutionary techniques