Page 221 - Designing Sociable Robots
of age, all affiliated with MIT. The subjects had very limited to no familiarity with Kismet’s
voice.
In this study, each subject first listened to an introduction spoken with Kismet’s neutral
expression. This was to acquaint the subject with Kismet’s synthesized quality of voice and
neutral affect. A series of eighteen utterances followed, covering six expressive qualities
(anger, fear, disgust, happiness, surprise, and sorrow). Within the experiment, the emotive
qualities were distributed randomly. Given the small number of subjects per study, I used
only a single presentation order per experiment. Each subject could work at his or her own
pace and control the number of presentations of each stimulus.
The three stimulus phrases were: “I’m going to the city,” “I saw your name in the paper,”
and “It’s happening tomorrow.” The first two test phrases were selected because Cahn
had found the word choice to have reasonably neutral affect. In a previous version of the
study, subjects reported that it was just as easy to map emotional correlates onto English
phrases as to Kismet’s randomly generated babbles. Their performance for English phrases
and Kismet’s babbles supports this. We believed it would be easier to analyze the data
to discover ways to improve Kismet’s performance if a small set of fixed English phrases
were used.
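The stimulus set described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: it assumes that the eighteen utterances come from pairing each of the six expressive qualities with each of the three phrases, which the text does not state explicitly. The function name and the seed parameter are hypothetical; the fixed seed stands in for the single presentation order shared by all subjects.

```python
import itertools
import random

EMOTIONS = ["anger", "fear", "disgust", "happiness", "surprise", "sorrow"]
PHRASES = [
    "I'm going to the city",
    "I saw your name in the paper",
    "It's happening tomorrow",
]


def build_presentation_order(seed=0):
    """Return one shuffled list of (emotion, phrase) pairs.

    Assumption (not confirmed by the text): every emotion is paired
    with every phrase, giving 6 x 3 = 18 stimuli.
    """
    stimuli = list(itertools.product(EMOTIONS, PHRASES))
    # A fixed seed reproduces the same random order on every call,
    # mirroring the single presentation order used per experiment.
    rng = random.Random(seed)
    rng.shuffle(stimuli)
    return stimuli


order = build_presentation_order()
for position, (emotion, phrase) in enumerate(order, start=1):
    print(f"{position:2d}. [{emotion}] {phrase}")
```

Using one shared order keeps the procedure simple for a small subject pool, at the cost of not being able to separate order effects from stimulus effects.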
The subjects were simply asked to circle the word that best described the voice quality.
The choices were “anger,” “disgust,” “fear/panic,” “happy,” “sad,” and “surprise/excited.” From
a previous iteration of the study, I found that word choice mattered. A given emotion
category can have a wide range of vocal affects. For instance, the subject could interpret
“fear” to imply “apprehensive,” which might be associated with Kismet’s whispery vocal
expression for sadness. Alternatively, it could be associated with “panic” which is a more
aroused interpretation. The results from these evaluations are summarized in table 11.9.
Overall, the subjects exhibited reasonable performance in correctly mapping Kismet’s
expressive quality with the targeted emotion. However, the expression of “fear” proved
Table 11.9
Naive subjects assessed the emotion conveyed in Kismet’s voice in a forced-choice evaluation. The emotional
qualities were recognized with reasonable performance except for “fear” which was most often confused for
“surprise/excitement.” Both expressive qualities share high arousal, so the confusion is not unexpected.

            Forced-Choice Percentage (random = 17%)
            anger  disgust  fear  happy  sad  surprise  % correct
anger          75       15     0      0    0        10         75
disgust        21       50     4      0   25         0         50
fear            4        0    25      8    0        63         25
happy           0        4     4     67    8        17         67
sad             8        8     0      0   84         0         84
surprise        4        0    25      8    4        59         59
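The figures in table 11.9 can be checked with a short Python sketch. It assumes that each row is the intended emotion and each column is the label the subjects circled; this reading is consistent with every row summing to 100 and with the diagonal matching the "% correct" column. The helper names are hypothetical, and the unweighted mean assumes each emotion was presented equally often.

```python
LABELS = ["anger", "disgust", "fear", "happy", "sad", "surprise"]

# Rows: intended emotion. Values: percentage of responses per label,
# in the order given by LABELS (copied from table 11.9).
CONFUSION = {
    "anger":    [75, 15, 0, 0, 0, 10],
    "disgust":  [21, 50, 4, 0, 25, 0],
    "fear":     [4, 0, 25, 8, 0, 63],
    "happy":    [0, 4, 4, 67, 8, 17],
    "sad":      [8, 8, 0, 0, 84, 0],
    "surprise": [4, 0, 25, 8, 4, 59],
}


def percent_correct(target):
    """Diagonal entry: how often the intended emotion was chosen."""
    return CONFUSION[target][LABELS.index(target)]


def most_common_confusion(target):
    """Return (label, percent) of the most frequent wrong response."""
    wrong = [
        (pct, label)
        for label, pct in zip(LABELS, CONFUSION[target])
        if label != target
    ]
    pct, label = max(wrong)
    return label, pct


# Six choices give a chance level of about 16.7%, reported as 17%.
chance = 100 / len(LABELS)
# Unweighted mean of the diagonal (assumes equal presentations per emotion).
mean_correct = sum(percent_correct(t) for t in LABELS) / len(LABELS)

for target in LABELS:
    label, pct = most_common_confusion(target)
    print(f"{target:8s} correct {percent_correct(target):3d}%  "
          f"most confused with {label} ({pct}%)")
print(f"chance {chance:.1f}%, mean correct {mean_correct:.1f}%")
```

Under these assumptions every category, including "fear" at 25%, is recognized above the chance level, and the largest off-diagonal entry is the 63% of "fear" stimuli labeled as "surprise."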

