Page 123 - Socially Intelligent Agents Creating Relationships with Computers and Robots


                             reaches state S5, the policy chooses “delay 15”, which the agent then executes
                             autonomously. However, the exact policy generated by the MDP will depend
                             on the exact probabilities and costs used. The delay MDP thus achieves the
                             first step of Section 1’s three-step approach to the AA coordination challenge:
                             balancing individual and team rewards, costs, etc.
                               The second step of our approach requires that agents avoid rigidly committing
                             to transfer-of-control decisions, remaining free to change their previous autonomy
                             decisions. The MDP representation supports this by generating an autonomy
                             policy rather than an autonomy decision. The policy specifies optimal actions
                             for each state, so the agent can respond to any state changes by following the
                             policy’s specified action for the new state (as illustrated by the agent’s retaking
                             autonomy in state S5 under the policy discussed in the previous section). In this
                             respect, the agent’s AA is an ongoing process, as the agent acts according to a
                             policy throughout the entire sequence of states it finds itself in.
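The difference between an autonomy policy and a one-off autonomy decision can be sketched as a lookup table from states to actions. This is only a minimal illustration: the pairing of state S5 with “delay 15” comes from the text, while the other state names, the other actions, and the function names are hypothetical placeholders, not the actual E-Elves policy.

```python
# A policy maps every state to an action, so the agent can react to any
# state change simply by looking up the new state.
# Only "S5" -> "delay 15" is taken from the text; the other entries are
# hypothetical placeholders for illustration.
POLICY = {
    "S1": "ask user",   # hypothetical: transfer control to the user
    "S2": "wait",       # hypothetical: keep waiting for a response
    "S5": "delay 15",   # from the text: agent retakes autonomy and delays
}


def act(state):
    """Return the action the policy specifies for the given state."""
    return POLICY[state]


def run(trajectory):
    """Follow the policy along a sequence of observed states."""
    return [act(state) for state in trajectory]


actions = run(["S1", "S2", "S5"])
```

Because the table covers every state, no earlier choice is binding: the agent re-reads the policy each time the state changes.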
                               The third step of our approach arises because an agent may need to act
                             autonomously to avoid miscoordination, yet it may face significant uncertainty
                             and risk when doing so. In such cases, an agent can carefully plan a change in
                             coordination (e.g., delaying actions in the meeting scenario) by looking ahead
                             at the future costs of team miscoordination and those of erroneous actions. The
                             delay MDP is especially suitable for producing such a plan because it generates
                             policies after looking ahead at the potential outcomes. For instance, the delay
                             MDP supports reasoning that a short delay buys time for a user to respond,
                             reducing the uncertainty surrounding a costly decision, albeit at a small cost.
                               Furthermore, the lookahead in MDPs can find effective long-term solutions.
                             As already mentioned, the cost of rescheduling increases as more and more
                             such repair actions occur. Thus, even if the user is very likely to arrive at the
                             meeting in the next 5 minutes, the uncertainty associated with that particular
                             state transition may be sufficient, when coupled with the cost of subsequent
                             delays if the user does not arrive, for the delay MDP policy to specify an initial
                             15-minute delay (rather than risk three 5-minute delays).
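The trade-off between one 15-minute delay and up to three 5-minute delays can be made concrete with a small expected-cost comparison. This is a sketch under stated assumptions, not the delay MDP itself: the quadratic reschedule cost, the per-minute cost, and the 0.8 arrival probability are all hypothetical numbers chosen for illustration, and the case where the user is still absent after 15 minutes is ignored because it is equally likely under both plans.

```python
def reschedule_cost(k):
    """Hypothetical disruption cost of the k-th reschedule; grows with k."""
    return float(k * k)


def cost_single_delay(minutes, cost_per_minute):
    """Cost of one reschedule that postpones the meeting by `minutes`."""
    return reschedule_cost(1) + minutes * cost_per_minute


def expected_cost_repeated_delays(n_delays, minutes, p_arrive, cost_per_minute):
    """Expected cost of delaying by `minutes` up to `n_delays` times.

    The k-th delay is only issued if the user has not arrived during any
    of the previous k - 1 delay windows.
    """
    total = 0.0
    p_still_absent = 1.0  # probability that the k-th delay is needed
    for k in range(1, n_delays + 1):
        total += p_still_absent * (reschedule_cost(k) + minutes * cost_per_minute)
        p_still_absent *= 1.0 - p_arrive
    return total


# Hypothetical parameters: the user is very likely to arrive within 5 minutes.
P_ARRIVE = 0.8
COST_PER_MINUTE = 0.1

one_long = cost_single_delay(15, COST_PER_MINUTE)
three_short = expected_cost_repeated_delays(3, 5, P_ARRIVE, COST_PER_MINUTE)
best = "delay 15" if one_long < three_short else "delay 5"
```

With these assumed numbers the single 15-minute delay has the lower expected cost, even though a 5-minute delay suffices 80% of the time, because the rare second and third reschedules are disproportionately expensive. Different probabilities or costs can reverse the outcome, which matches the text's remark that the exact policy depends on the exact probabilities and costs used.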

                             5. Evaluation of Electric Elves

                               We have used the E-Elves system within our research group at USC/ISI, 24
                             hours/day, 7 days/week, since June 1, 2000 (occasionally interrupted for bug
                             fixes and enhancements). The fact that E-Elves users were (and still are) willing
                             to use the system over such a long period and in a capacity so critical to their
                             daily lives is a testament to its effectiveness. Our MDP-based approach to AA
                             has provided much value to the E-Elves users, as attested to by the 689 meetings
                             that the agent proxies have monitored over the first six months of execution.
                             In 213 of those meetings, an autonomous rescheduling occurred, indicating a
                             substantial saving of user effort. Equally important, humans are also often