Page 218 - Designing Sociable Robots


Expressive Vocalization System 199

Table 11.8
Table of acoustic features for the three utterances.
utterance           nzpmean   nzpvar     pmax     pmin   prange   egmean   length   voiced unvoiced
anger-city            292.5   6348.7    444.4    166.7    277.7    112.2       81       52       29
anger-moved           269.1   4703.8    444.4      160    284.4    109.8      121       91       30
anger-picture         273.2   6850.3    444.4    153.8    290.6    110.2      112       51       61
anger-average         278.3   5967.6    444.4   160.17    284.2    110.7    104.6     64.6       40
calm-city             316.8    802.9    363.6      250    113.6    102.6       85       58       27
calm-moved            304.5    897.3    363.6    266.7     96.9    103.6      124       94       30
calm-picture          302.2   1395.5    363.6    235.3    128.3    102.4      118       73       45
calm-average          307.9   1031.9    363.6   250.67   112.93    102.9      109       75       34
disgust-city          268.4   2220.0      400    173.9    226.1    102.5      124       83       41
disgust-moved         264.6   1669.2      400    190.5    209.5    101.6      173      123       50
disgust-picture       275.2   3264.1      400    137.9    262.1    102.3      157       82       75
disgust-average       269.4   2384.4      400    167.4    232.5    102.1    151.3       96     55.3
fear-city             417.0   8986.7      500    235.3    264.7    102.8       59       27       32
fear-moved            357.2   7145.5      500      160      340    102.6       89       53       36
fear-picture          388.2   8830.9      500      160      340    103.6       86       41       45
fear-average          387.4   8321.0      500    185.1    314.9    103.0       78     40.3     37.6
happy-city            388.3   5810.6      500    285.7    214.3    106.6       71       54       17
happy-moved           348.2   6188.8      500    173.9    326.1    109.2      109       78       31
happy-picture         357.7   6038.3      500    266.7    233.3    106.0      100       57       43
happy-average         364.7   6012.6      500    242.1    257.9    107.2     93.3       63     30.3
sad-city              279.8     77.9    285.7    266.7       19     98.6       88       62       26
sad-moved             276.9     90.7    285.7    266.7       19     99.1      144       93       51
sad-picture           275.5    127.2    285.7      250     35.7     98.3      138       83       55
sad-average           277.4     96.6    285.7    261.1     24.5     98.7    123.3     79.3       44
surprise-city         394.3   8219.4      500    148.1    351.9    107.5       69       49       20
surprise-moved        360.3   7156.0      500      160      340    107.8      101       84       17
surprise-picture      371.6   8355.7      500    285.7    214.3    106.7       98       54       44
surprise-average      375.4   7910.4      500    197.9    302.0    107.3     89.3     62.3       27

The results are summarized in table 11.8. The values for each feature are displayed for
each phrase with each emotive quality (including the neutral state). The averages are also
presented in the table and plotted in figure 11.2. These plots illustrate how each emotive
quality modulates these acoustic features relative to the others.
The pitch contours for each emotive quality are shown in figure 11.3. They correspond to
the utterance “It’s been moved already.”
Comparing these plots with table 11.1, it is clear that many of the acoustic correlates
of emotive speech are preserved in Kismet’s speech. I have made several incremental
adjustments to the qualities of Kismet’s speech according to what was learned from subject
evaluations. The final implementation differs in some cases from table 11.1 (as noted below),
but the results show a dramatic improvement in subject recognition performance from earlier
evaluations.
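
As a rough illustration of how the per-emotion average rows in table 11.8 relate to the three utterance rows, the sketch below recomputes a few of the anger averages. The values are copied from the table; the names `ANGER`, `FEATURES`, `feature_averages`, and `range_matches` are hypothetical and appear nowhere in the original analysis. The check that prange equals pmax minus pmin is an assumption inferred from the tabulated numbers, not a definition taken from the text.

```python
from statistics import mean

# Anger rows copied from table 11.8, restricted to five of the nine features:
# (nzpmean, nzpvar, pmax, pmin, prange). Names here are illustrative only.
ANGER = {
    "city":    (292.5, 6348.7, 444.4, 166.7, 277.7),
    "moved":   (269.1, 4703.8, 444.4, 160.0, 284.4),
    "picture": (273.2, 6850.3, 444.4, 153.8, 290.6),
}
FEATURES = ("nzpmean", "nzpvar", "pmax", "pmin", "prange")


def feature_averages(rows):
    """Average each feature column over the utterances of one emotion."""
    columns = zip(*rows.values())
    return {name: mean(col) for name, col in zip(FEATURES, columns)}


def range_matches(row, tol=0.05):
    """Assumed relation: prange is pmax minus pmin, up to table rounding."""
    _, _, pmax, pmin, prange = row
    return abs((pmax - pmin) - prange) <= tol


averages = feature_averages(ANGER)
for name in FEATURES:
    print(f"{name}: {averages[name]:.1f}")
```

The same averaging applies to the other emotive qualities. Some averages in the table (for example, length and voiced for anger) appear to be truncated rather than rounded, so a recomputation may differ from the printed value in the last digit.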
