Page 219 - Designing Sociable Robots
[Plot panels: Pitch Average, Pitch Variance, Max Pitch, Min Pitch, Pitch Range, Energy Average, Utterance Length, Voiced Length, Unvoiced Length. Legend: anger, calm, disgust, fear, happy, sad, surprise.]
Figure 11.2
Plots of acoustic features of Kismet’s speech. Plots illustrate how each emotion relates to the others for each
acoustic feature. The horizontal axis simply maps an integer value to each emotion for ease of viewing (anger = 1,
calm = 2, etc.).
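The axis encoding described in the caption can be sketched in a few lines of Python. This is only an illustration: the caption gives anger = 1 and calm = 2, and the remaining positions are assumed here to follow the alphabetical order in which the figure lists the emotions. The names `EMOTIONS`, `AXIS_INDEX`, and `emotion_at` are hypothetical.

```python
# Assumption: the axis order follows the alphabetical listing of emotions
# in the figure. Only anger = 1 and calm = 2 are stated in the caption.
EMOTIONS = ["anger", "calm", "disgust", "fear", "happy", "sad", "surprise"]

# Horizontal-axis position for each emotion, counting from 1.
AXIS_INDEX = {name: i for i, name in enumerate(EMOTIONS, start=1)}


def emotion_at(index: int) -> str:
    """Return the emotion assumed to be plotted at a 1-based axis position."""
    if not 1 <= index <= len(EMOTIONS):
        raise ValueError(f"axis position out of range: {index}")
    return EMOTIONS[index - 1]
```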
Kismet’s vocal quality varies with its “emotive” state as follows:
• Fearful speech is very fast, with a wide pitch contour, large pitch variance, a very high mean
pitch, and normal intensity. I have added a slightly breathy quality to the voice, as people
seem to associate it with a sense of trepidation.
• Angry speech is loud and slightly fast, with a wide pitch range and high pitch variance. I have
purposefully implemented a low mean pitch to give the voice a prohibiting quality. This
differs from table 11.1, but a preliminary study demonstrated a dramatic improvement
in recognition performance by naive subjects. This makes sense, as the low pitch gives the
voice a threatening quality.
• Sad speech has a slower speech rate, with longer pauses than normal. It has a low mean
pitch, a narrow pitch range, and low pitch variance. It is softly spoken with a slightly breathy
quality. This differs from table 11.1, but it gives the voice a tired quality. Its pitch contour
falls at the end.
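The qualitative settings listed above can be collected into a small lookup table. The following Python sketch is illustrative only: it records the descriptive labels from the text rather than any synthesizer parameter values, and the names `VoiceProfile`, `PROFILES`, and `deviations_from_table` are hypothetical. A field left as `None` means the text does not state a value for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceProfile:
    """Qualitative voice settings; None means the text states no value."""
    speech_rate: str
    mean_pitch: str
    pitch_range: str       # the text says "wide pitch contour" for fear
    pitch_variance: str
    intensity: str
    breathiness: Optional[str] = None
    differs_from_table: bool = False  # flagged in the text as departing from table 11.1
    notes: str = ""


# Labels are taken from the bullet descriptions; no numeric values are assumed.
PROFILES = {
    "fear": VoiceProfile("very fast", "very high", "wide", "large", "normal",
                         breathiness="slight"),
    "anger": VoiceProfile("slightly fast", "low", "wide", "high", "loud",
                          differs_from_table=True),
    "sad": VoiceProfile("slow", "low", "narrow", "low", "soft",
                        breathiness="slight", differs_from_table=True,
                        notes="longer pauses than normal; pitch contour falls at the end"),
}


def deviations_from_table(profiles):
    """Return, sorted by name, the emotions the text marks as differing from table 11.1."""
    return sorted(name for name, p in profiles.items() if p.differs_from_table)
```

Keeping the labels as strings, rather than inventing numbers, keeps the sketch faithful to what the text actually states; a real system would map each label to concrete synthesizer settings.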

