Page 172 - Designing Sociable Robots

breazeal-79017  book  March 18, 2002  14:7





                       The Behavior System                                                  153





                       and begins to search for a face, which it re-acquires when the caregiver returns (t ≈ 42).
                       Eventually, the robot habituates to the interaction with the caregiver and begins to attend
                       to a toy that the caregiver has provided (60 < t < 75). While interacting with the toy, the
                       robot displays interest and moves its eyes to follow the moving toy. Kismet soon habituates
                       to this stimulus and returns to its play-dialogue with the caregiver (75 < t < 100). A final
                       disengagement phase occurs (t ≈ 100) when the robot’s attention shifts back to the toy.

                       Regulating Vocal Exchanges
                       Kismet employs several social cues to regulate the rate of vocal exchanges, including
                       eye movements as well as postural and facial displays. These cues encourage
                       the subjects to slow down and shorten their speech, which makes the speech easier for
                       the robot’s auditory system to process.
                         To investigate Kismet’s performance in engaging people in proto-dialogues, I invited
                       three naive subjects to interact with Kismet. They ranged in age from 25 to 28 years.
                       One was male and two were female; all were professionals. They were asked simply to
                       talk to the robot. Their interactions were videorecorded for further analysis. (Similar video
                       interactions can be viewed on the accompanying CD-ROM.)
                         Often the subjects begin the session by speaking in longer phrases and using only the
                       robot’s vocal behavior to gauge their speaking turn. They also expect the robot to respond
                       immediately after they finish talking. Within the first couple of exchanges, they may notice
                       that the robot interrupts them, and they begin to adapt to Kismet’s rate. They start to use
                       shorter phrases, wait longer for the robot to respond, and more carefully watch the robot’s
                       turn-taking cues. The robot prompts the other for his/her turn by craning its neck forward,
                       raising its brows, and looking at the person’s face when it’s ready for him/her to speak. It
                       will hold this posture for a few seconds until the person responds. Often, within a second
                       of this display, the subject does so. The robot then leans back to a neutral posture, assumes
                       a neutral expression, and tends to shift its gaze away from the person. This cue indicates
                       that the robot is about to speak. The robot typically issues one utterance, but it may issue
                       several. Nonetheless, as the exchange proceeds, the subjects tend to wait until prompted.
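
                       As an illustration only, the turn-taking cycle described above can be sketched as a
                       small state machine. This is a minimal sketch and not Kismet's actual implementation:
                       the names Phase, CUES, PROMPT_HOLD_S, and next_phase are hypothetical, the 3-second
                       hold stands in for "a few seconds," and the choice to let the robot take its own turn
                       when the hold expires is an assumption that the text does not state.

```python
from enum import Enum


class Phase(Enum):
    PROMPT = "prompt"   # robot invites the person to speak
    LISTEN = "listen"   # person is speaking
    SPEAK = "speak"     # robot takes its turn


# Displays paired with each phase, taken from the description above.
# Keeping the prompting display while listening is an assumption.
CUES = {
    Phase.PROMPT: ("neck craned forward", "brows raised", "gaze on face"),
    Phase.LISTEN: ("neck craned forward", "brows raised", "gaze on face"),
    Phase.SPEAK: ("neutral posture", "neutral expression", "gaze averted"),
}

# Hypothetical value standing in for "a few seconds".
PROMPT_HOLD_S = 3.0


def next_phase(phase, person_speaking, elapsed_s, utterances_left=0):
    """Return the phase that follows `phase` given the current observations.

    elapsed_s is the time spent in the current phase; utterances_left is
    how many utterances the robot still intends to issue in its turn.
    """
    if phase is Phase.PROMPT:
        if person_speaking:
            return Phase.LISTEN
        # Assumption: if nobody responds within the hold, the robot speaks.
        if elapsed_s >= PROMPT_HOLD_S:
            return Phase.SPEAK
        return Phase.PROMPT
    if phase is Phase.LISTEN:
        # The robot's turn begins once the person stops talking.
        return Phase.LISTEN if person_speaking else Phase.SPEAK
    # Phase.SPEAK: one utterance is typical, but several may be issued.
    return Phase.SPEAK if utterances_left > 0 else Phase.PROMPT
```

                       A subject who waits for the PROMPT display before speaking moves the cycle through
                       PROMPT, LISTEN, and SPEAK without interruption, which corresponds to the behavior the
                       subjects settle into as the exchange proceeds.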
                         Before the subjects adapt their behavior to the robot’s capabilities, the robot is more likely
                       to interrupt them. There tend to be more frequent delays in the flow of “conversation,” where
                       the human prompts the robot again for a response. Often these “hiccups” in the flow appear
                       in short clusters of mutual interruptions and pauses (often over two to four speaking turns)
                       before the turns become coordinated and the flow smooths out. Analysis of the video of
                       these human-robot “conversations” provides evidence that people entrain to the robot (see
                       table 9.1). As a session proceeds, these “hiccups” become less frequent. The human and robot are able to carry
                       on longer sequences of clean turn transitions. At this point the rate of vocal exchange is
                       well-matched to the robot’s perceptual limitations. The vocal exchange is reasonably fluid.
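
                       One way to quantify "longer sequences of clean turn transitions" from annotated video
                       is sketched below. It assumes a hypothetical coding scheme in which each turn transition
                       is labeled "clean", "interrupt", or "pause"; this is not necessarily the coding used for
                       table 9.1, and the function names and the example labels are invented for illustration.

```python
from itertools import groupby


def clean_run_lengths(labels):
    """Lengths of each maximal run of consecutive 'clean' transitions."""
    return [
        sum(1 for _ in run)
        for label, run in groupby(labels)
        if label == "clean"
    ]


def hiccup_rate(labels):
    """Fraction of transitions that are not clean (0.0 for an empty list)."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label != "clean") / len(labels)


# Made-up annotation of one session, for illustration only.
example = [
    "interrupt", "pause", "clean", "clean",
    "interrupt", "clean", "clean", "clean", "clean",
]

runs = clean_run_lengths(example)
print(runs, max(runs), round(hiccup_rate(example), 2))
```

                       Comparing the run lengths and hiccup rate for the early and late portions of a session
                       would show whether the flow becomes smoother as the subject adapts.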