reaches state S5, the policy chooses “delay 15”, which the agent then executes
autonomously. However, the exact policy generated by the MDP will depend
on the exact probabilities and costs used. The delay MDP thus achieves the
first step of Section 1’s three-step approach to the AA coordination challenge:
balancing individual and team rewards, costs, etc.
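Since the generated policy is driven entirely by these probabilities and costs, the balance between individual and team interests is ultimately encoded in the MDP’s reward function. The following sketch is purely illustrative of that idea; the state fields, action names, and weights are our own assumptions for exposition, not the actual E-Elves reward definition:

```python
# Illustrative sketch of a delay-MDP reward trading off team and individual
# interests. The state fields and weights below are assumptions for
# exposition; they are not the actual E-Elves reward definition.

def reward(state: dict, action: str) -> float:
    r = 0.0
    if action.startswith("delay"):
        minutes = int(action.split()[1])
        # Team cost: every attendee's time is wasted for the delay period,
        # and repeated reschedules disrupt team coordination further.
        r -= 0.5 * minutes * state["n_attendees"]
        r -= 15.0 * state["n_prior_reschedules"]
        # Individual benefit: delaying on the user's behalf keeps the
        # meeting viable until the user can attend.
        r += 10.0
    elif action == "ask user":
        # Transferring control defers to the user's judgment, but waiting
        # for a reply risks team miscoordination in the meantime.
        r -= 2.0 * state["time_pressure"]
    return r

print(reward({"n_attendees": 4, "n_prior_reschedules": 1}, "delay 15"))  # -> -35.0
```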
The second step of our approach requires that agents avoid rigidly committing to transfer-of-control decisions, instead remaining able to revise their previous autonomy
decisions. The MDP representation supports this by generating an autonomy
policy rather than an autonomy decision. The policy specifies optimal actions
for each state, so the agent can respond to any state changes by following the
policy’s specified action for the new state (as illustrated by the agent retaking autonomy in state S5 under the policy discussed in the previous section). In this
respect, the agent’s AA is an ongoing process, as the agent acts according to a
policy throughout the entire sequence of states it finds itself in.
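Concretely, a policy is a complete mapping from states to actions, so revising an earlier autonomy decision amounts to nothing more than a lookup in the new state. A minimal sketch follows; the state names and actions (beyond S5’s “delay 15”, which appears in the text) are illustrative assumptions rather than the actual delay-MDP state space:

```python
# A policy is a complete state -> action map, so the agent never commits to
# a single decision; it re-consults the policy whenever the state changes.
# State names and most actions here are assumed for illustration.

policy = {
    "S1": "ask user",    # transfer control: query the user
    "S2": "wait",
    "S3": "delay 5",
    "S4": "wait",
    "S5": "delay 15",    # retake autonomy: act without waiting for the user
}

def on_state_change(state):
    """Follow the policy's prescribed action for the agent's current state."""
    action = policy[state]
    print(f"in {state}: executing '{action}'")
    return action

on_state_change("S1")   # initially transfers control to the user
on_state_change("S5")   # the same policy later retakes autonomy
```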
The third step of our approach arises because an agent may need to act
autonomously to avoid miscoordination, yet it may face significant uncertainty
and risk when doing so. In such cases, an agent can carefully plan a change in
coordination (e.g., delaying actions in the meeting scenario) by looking ahead
at the future costs of team miscoordination and those of erroneous actions. The
delay MDP is especially suitable for producing such a plan because it generates
policies after looking ahead at the potential outcomes. For instance, the delay
MDP supports reasoning that a short delay buys time for a user to respond,
reducing the uncertainty surrounding a costly decision, albeit at a small cost.
Furthermore, the lookahead in MDPs can find effective long-term solutions.
As already mentioned, the cost of rescheduling increases as more and more
such repair actions occur. Thus, even if the user is very likely to arrive at the
meeting in the next 5 minutes, the uncertainty associated with that particular
state transition may be sufficient, when coupled with the cost of subsequent
delays if the user does not arrive, for the delay MDP policy to specify an initial
15-minute delay (rather than risk three 5-minute delays).
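The following toy value-iteration sketch makes this lookahead concrete. Every number in it is an invented assumption for illustration, not a value from E-Elves, but it shows how escalating reschedule costs and arrival uncertainty can lead the lookahead to prefer a single 15-minute delay over repeated 5-minute delays:

```python
# Toy finite-horizon value iteration over a simplified delay MDP. All
# probabilities and costs are invented assumptions, not E-Elves values.
# The point: lookahead can prefer one 15-minute delay even when the user
# is likely (p = 0.8) to arrive within 5 minutes, because the costs of
# subsequent reschedules escalate.

P_ARRIVE = {"delay 5": 0.8, "delay 15": 0.95}   # arrival chance per action
MINUTES = {"delay 5": 5, "delay 15": 15}
MINUTE_COST = 0.5          # attendee time wasted per delayed minute
CANCEL_COST = 1000.0       # miscoordination cost if the meeting collapses
MAX_RESCHEDULES = 3

def announce_cost(n_prior):
    """Each successive reschedule announcement disrupts the team more."""
    return 15.0 * (n_prior + 1)

# V[n] = expected cost-to-go after n reschedules have already occurred.
V = [0.0] * (MAX_RESCHEDULES + 1)
V[MAX_RESCHEDULES] = CANCEL_COST
best = [None] * MAX_RESCHEDULES

for n in reversed(range(MAX_RESCHEDULES)):
    q = {a: announce_cost(n) + MINUTES[a] * MINUTE_COST
            + (1 - P_ARRIVE[a]) * V[n + 1]
         for a in P_ARRIVE}
    best[n] = min(q, key=q.get)
    V[n] = q[best[n]]

print(best[0], round(V[0], 2))   # -> delay 15 24.63
```

Under these assumed numbers, the backed-up policy chooses “delay 15” at the outset with an expected cost of about 24.6, even though a 5-minute delay would succeed 80% of the time, because the downstream cost of repeated reschedules outweighs the shorter delay’s immediate savings.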
5. Evaluation of Electric Elves
We have used the E-Elves system within our research group at USC/ISI, 24
hours/day, 7 days/week, since June 1, 2000 (occasionally interrupted for bug
fixes and enhancements). The fact that E-Elves users were (and still are) willing
to use the system over such a long period and in a capacity so critical to their
daily lives is a testament to its effectiveness. Our MDP-based approach to AA
has provided much value to the E-Elves users, as attested to by the 689 meetings
that the agent proxies have monitored over the first six months of execution.
In 213 of those meetings, an autonomous rescheduling occurred, indicating a
substantial savings of user effort. Equally importantly, humans are also often