Page 227 - Designing Sociable Robots

breazeal-79017  book  March 18, 2002  14:16





                       208                                                             Chapter 11





                       conveyance (Fleming & Dobbs, 1999). Kismet’s ten lip postures tend toward the absolute
                       minimal set specified by Fleming and Dobbs (1999), but this is reasonable given its physical
                       appearance. As the robot speaks, new lip posture targets are specified at 33 Hz. Since the
                       phonemes do not change this quickly, many of the phonemes repeat. There is an inherent limit
                       in how fast Kismet’s lip and jaw motors can move to the next commanded posture, so the challenge
                       of co-articulation is somewhat addressed by the physics of the motors and mechanism.
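
The interaction between the 33 Hz target stream and the motors' speed limit can be illustrated with a minimal Python sketch. It is not Kismet's motor code: the scalar lip value (0.0 closed, 1.0 open), the names `expand_to_frames` and `rate_limited_track`, and the `max_step` limit are all assumptions made for illustration. The sketch repeats each phoneme's target for every update tick it spans, then lets a rate-limited actuator chase those targets, which smooths the transition between neighboring postures.

```python
def expand_to_frames(phonemes, rate_hz=33.0):
    """Turn (lip_target, duration_s) pairs into one target per update tick."""
    frames = []
    for target, duration_s in phonemes:
        # A phoneme lasting longer than one tick repeats its target.
        ticks = max(1, round(duration_s * rate_hz))
        frames.extend([target] * ticks)
    return frames


def rate_limited_track(targets, max_step):
    """Follow the targets while moving at most max_step per tick."""
    if not targets:
        return []
    position = targets[0]
    track = []
    for target in targets:
        # Clamp the move so the actuator cannot jump to the new posture.
        step = max(-max_step, min(max_step, target - position))
        position += step
        track.append(position)
    return track


# Hypothetical utterance: lips closed (0.0) for 90 ms, then open (1.0) for 60 ms.
frames = expand_to_frames([(0.0, 0.09), (1.0, 0.06)])
track = rate_limited_track(frames, max_step=0.4)
```

In this toy case the commanded targets jump from closed to open in a single tick, while the tracked position moves there gradually; that lag is the sense in which the mechanism itself blends adjacent postures.
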
                         Lip synchronization is only part of the equation, however. Faces are not completely still
                       when speaking, but move in synchrony to provide emphasis along with the speech. Using
                       the energy of the speech signal to animate Kismet’s face (along with the lips and jaw) greatly
                       enhances the impression that Kismet “means” what it says. For Kismet, the energy of the
                       speech signal influences the movement of its eyelids and ears. Larger speech amplitudes
                       result in a proportional widening of the eyes and downward pulse of the ears. This adds a
                       nice degree of facial emphasis to accompany the stress of the vocalization.
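
One plausible way to realize this proportional mapping is sketched below in Python. The source does not say how speech energy is measured or scaled, so the RMS measure, the function names, and the gain values are assumptions, not Kismet's implementation.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of audio samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def emphasis_offsets(frame, eye_gain=0.2, ear_gain=0.4):
    """Map speech energy to proportional eyelid and ear offsets.

    eye_gain and ear_gain are hypothetical tuning constants.
    """
    energy = rms(frame)
    return {
        "eyelid_open": eye_gain * energy,  # louder speech widens the eyes
        "ear_down": ear_gain * energy,     # louder speech pulses the ears down
    }


# Hypothetical frames: a loud one and a quiet one.
loud = emphasis_offsets([0.5, -0.5, 0.5, -0.5])
quiet = emphasis_offsets([0.1, -0.1, 0.1, -0.1])
```

Because both offsets scale linearly with the energy, a frame with five times the amplitude produces five times the eyelid and ear displacement.
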
                         Since the speech signal influences facial animation, the emotional correlates of facial
                       posture must be blended with the animation arising from speech. How this is accomplished
                       within the face control motor system is described at length in chapter 10. The emotional
                       expression establishes the baseline facial posture about which all facial animation moves.
                       The current “emotional” state also influences the speed with which the facial actuators move
                       (lower arousal results in slower movements, higher arousal results in quicker movements).
                       In addition, emotions that correspond to higher arousal produce more energetic speech,
                       resulting in bigger amplitude swings about the expression baseline. Similarly, emotions
                       that correspond to lower arousal produce less energetic speech, which results in smaller
                       amplitudes. The end product is a highly expressive and coordinated movement of face
                       with voice. For instance, angry-sounding speech is accompanied by large, quick, twitchy
                       movements of the ears and eyelids. This undeniably conveys agitation and irritation. In contrast,
                       sad-sounding speech is accompanied by slow, droopy, listless movements of the ears and
                       eyelids. This conveys a forlorn quality that often evokes sympathy from the human observer.
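
The blending described in this paragraph can be summarized in a small Python sketch. The actual mechanism is the one described in chapter 10; here the additive blend, the linear arousal-to-speed mapping, and every name and constant are assumptions chosen only to make the relationships concrete.

```python
from dataclasses import dataclass


@dataclass
class FaceCommand:
    position: float  # actuator target: expression baseline plus speech swing
    speed: float     # how quickly the actuator moves toward the target


def face_command(baseline, speech_energy, arousal,
                 gain=0.5, min_speed=0.2, max_speed=1.0):
    """Blend an emotional baseline posture with speech-driven animation.

    gain, min_speed, and max_speed are hypothetical tuning constants;
    arousal is assumed to lie in [0, 1].
    """
    arousal = max(0.0, min(1.0, arousal))
    # The speech swing is added on top of the expression baseline.
    position = baseline + gain * speech_energy
    # Higher arousal gives quicker movement, lower arousal slower movement.
    speed = min_speed + (max_speed - min_speed) * arousal
    return FaceCommand(position, speed)


# Hypothetical values: angry speech is energetic and high arousal,
# sad speech is quiet and low arousal.
angry = face_command(baseline=0.6, speech_energy=0.8, arousal=0.9)
sad = face_command(baseline=0.2, speech_energy=0.2, arousal=0.1)
```

With these illustrative numbers, the angry command swings farther from its baseline and moves faster than the sad one, mirroring the contrast between twitchy and listless movement.
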


                       11.6  Limitations and Extensions

                       Kismet’s expressive speech can certainly be improved. In the current implementation I
                       have only included those acoustic correlates that have a global influence on the speech
                       signal and do not require local analysis of the sentence structure. I currently modulate voice
                       quality, speech rate, pitch range, average pitch, intensity, and the global pitch contour. Data
                       from naive subjects is promising, although more could certainly be done. I have done very
                       little with changes in articulation. The precision or imprecision of articulation could be
                       enhanced by substituting voiced for unvoiced phonemes as Cahn describes in her thesis.