Embodied Conversation Agents
There are a number of graphics-based systems that combine natural language with an
embodied avatar (see figure 2.1 for a couple of examples). The focus is on natural, con-
versational discourse accompanied by gesture, facial expression, and so forth. The human
uses these systems to perform a task, or even to learn how to perform a task. Sometimes,
the task could simply be to communicate with others in a virtual space, a sort of animated
“chatroom” with embodied avatars (Vilhjálmsson & Cassell, 1998).
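To make this coordination of verbal and nonverbal channels concrete, the following is a minimal sketch of an agent that pairs each utterance with an accompanying gesture, facial expression, and gaze behavior. The behavior rules and channel names are invented for illustration only; they are not drawn from Rea, Steve, Cosmo, or any other system discussed here.

```python
# Illustrative sketch only: a toy "embodied conversation agent" that pairs a verbal
# response with nonverbal behavior (gesture, facial expression, gaze). The rules and
# behavior names are hypothetical and do not correspond to any real system.

from dataclasses import dataclass


@dataclass
class MultimodalAct:
    """One coordinated output act across the agent's channels."""
    speech: str
    gesture: str
    facial_expression: str
    gaze: str


def plan_response(user_utterance: str) -> MultimodalAct:
    """Choose an utterance and the nonverbal behavior that accompanies it."""
    text = user_utterance.lower()
    if "hello" in text or "hi" in text:
        return MultimodalAct(
            speech="Hello! How can I help you today?",
            gesture="wave",
            facial_expression="smile",
            gaze="look_at_user",
        )
    if "where" in text:
        return MultimodalAct(
            speech="It is over there, on your left.",
            gesture="point_left",          # deictic gesture accompanies the referring expression
            facial_expression="neutral",
            gaze="look_at_referent",
        )
    return MultimodalAct(
        speech="Could you tell me more about that?",
        gesture="open_palms",
        facial_expression="interested",
        gaze="look_at_user",
    )


if __name__ == "__main__":
    for utterance in ["Hi there", "Where is the kitchen?", "I want a big apartment"]:
        act = plan_response(utterance)
        print(f"user: {utterance!r} -> agent says {act.speech!r} "
              f"with gesture={act.gesture}, face={act.facial_expression}, gaze={act.gaze}")
```

Real systems of this kind replace the hand-written rules above with discourse planners and behavior generators, but the basic idea of emitting speech and nonverbal behavior as a single coordinated act is the same.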
There are several fully embodied conversation agents under development at various in-
stitutions. One of the most advanced systems is Rea from the Media Lab at MIT (Cassell
et al., 2000). Rea is a synthetic real-estate agent, situated in a virtual world, that people can
query about buying property. The system communicates through speech, intonation, gaze
direction, gesture, and facial expression. It senses the location of people in the room and
recognizes a few simple gestures. Another advanced system is Steve, under development at
USC (Rickel & Johnson, 2000). Steve is a tutoring system in which the human is immersed
in virtual reality to interact with the avatar. It provides domain-independent capabilities
for carrying out task-oriented dialogs in 3D virtual worlds. For instance, Steve teaches
people how to operate a variety of equipment on a virtual ship and guides them through
the ship to show them where the equipment is located. Cosmo, under development at North
Carolina State University, is an animated Web-based pedagogical agent for children (Lester
et al., 2000). The character inhabits the Internet Advisor, a learning environment for the
domain of Internet packet routing. Because the character interacts with children, particular
Figure 2.1
Some examples of embodied conversation agents. To the left is Rea, a synthetic real estate agent. To the right
is BodyChat, a system where online users interact via embodied animated avatars. Images courtesy of Justine
Cassell and Hannes Vilhjálmsson of the Gesture and Narrative Language Research Group. Images © MIT
Media Lab.

