Page 104 - Designing Sociable Robots
The Auditory System 85
7.3 Design Issues for Recognizing Affective Intent
There are several design issues that must be addressed to successfully integrate Fernald’s
ideas into a robot like Kismet. As I have argued previously, doing so could provide a human
caregiver with a natural and intuitive means for communicating with and training a robotic
creature. The initial communication is at an affective level, where the caregiver socially
manipulates the robot’s affective state. For Kismet, the affective channel provides a powerful
means for modulating the robot’s behavior.
Robot aesthetics As discussed above, the perceptual task of recognizing affective intent
is significantly easier in infant-directed speech than in adult-directed speech. Even
human adults have a difficult time recognizing intent from adult-directed speech without
the linguistic information. It will be a while before robots have true natural language,
but the affective content of the vocalization can be extracted from prosody. Encouraging
speech on an infant-directed level places a constraint on how the robot appears physically
(chapter 5), how it moves (chapters 9, 12), and how it expresses itself (chapters 10, 11).
If the robot looks and behaves as a very young creature, people will be more likely to
treat it as such and naturally exaggerate their prosody when addressing the robot. This
manner of robot-directed speech would be spontaneous and seem quite appropriate. I
have found this typically to be the case for both men and women when interacting with
Kismet.
Real-time performance Another design constraint is that the robot be able to interpret the
vocalization and respond to it at natural interactive rates. The human can tolerate small delays
(perhaps a second or so), but long delays will break the natural flow of the interaction. Long
delays also interfere with the caregiver’s ability to use the vocalization as a reinforcement
signal. Given that the reinforcement should be used to mark a specific event as good or
bad, long delays could cause the wrong action to be reinforced and confuse the training
process.
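The effect of delay on reinforcement can be illustrated with a small sketch. It is not Kismet's implementation: the `Action` timeline, the action names, the timings, and the naive rule of crediting whatever action is in progress when the interpreted signal arrives are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    name: str
    start: float  # seconds since the start of the interaction
    end: float


def action_at(timeline: List[Action], t: float) -> Optional[str]:
    """Return the name of the action in progress at time t, if any."""
    for action in timeline:
        if action.start <= t < action.end:
            return action.name
    return None


def credited_action(timeline: List[Action], utterance_time: float,
                    processing_delay: float) -> Optional[str]:
    # Assumed (naive) rule: credit whatever the robot is doing at the
    # moment the interpreted vocalization finally becomes available.
    return action_at(timeline, utterance_time + processing_delay)


# Hypothetical sequence of robot actions.
timeline = [
    Action("reach", 0.0, 2.0),
    Action("grasp", 2.0, 4.0),
    Action("drop", 4.0, 6.0),
]

# The caregiver praises the grasp at t = 3.5 s.
print(credited_action(timeline, 3.5, 0.3))  # short delay: grasp
print(credited_action(timeline, 3.5, 1.5))  # long delay: drop
```

With the short delay the praise lands on the action the caregiver meant to mark. With the long delay it lands on the action that follows, which is the confusion described above.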
Voice as training signal People should be able to use their voice as a natural and intuitive
training signal for the robot. The human voice is quite flexible and can be used to convey
many different meanings, affective or otherwise. The robot should be able to recognize when
it is being praised and associate it with positive reinforcement. Similarly, the robot should
recognize scolding and associate it with negative reinforcement. The caregiver should be
able to acquire and direct the robot’s attention with attentional bids to the relevant aspects
of the task. Comforting speech should be soothing for the robot if it is in a distressed state,
and encouraging otherwise.
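The requirements in this paragraph amount to a mapping from recognized intent to an effect on the robot. A minimal sketch of that mapping follows, assuming a classifier has already labeled the vocalization. The names `Intent`, `Response`, and `respond`, and the numeric reinforcement values, are hypothetical and not taken from Kismet's software.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    PRAISE = "praise"
    SCOLDING = "scolding"
    ATTENTION_BID = "attention_bid"
    COMFORT = "comfort"


@dataclass
class Response:
    reinforcement: int      # +1 positive, -1 negative, 0 none (assumed scale)
    acquire_attention: bool
    affect: str             # "soothe", "encourage", or "none"


def respond(intent: Intent, distressed: bool) -> Response:
    """Map a recognized intent to its effect on the robot."""
    if intent is Intent.PRAISE:
        return Response(+1, False, "none")
    if intent is Intent.SCOLDING:
        return Response(-1, False, "none")
    if intent is Intent.ATTENTION_BID:
        return Response(0, True, "none")
    # Comforting speech soothes a distressed robot and encourages otherwise.
    return Response(0, False, "soothe" if distressed else "encourage")


print(respond(Intent.COMFORT, distressed=True).affect)   # soothe
print(respond(Intent.COMFORT, distressed=False).affect)  # encourage
```

Only the comfort case depends on the robot's current state, which is why `distressed` is passed in alongside the recognized intent.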

