Low-Level Visual Perception
Kismet’s low-level visual perception system extracts a number of features that human
infants seem to be particularly responsive toward. These low-level features were selected
for their ability to help Kismet distinguish social stimuli (i.e., people, based on skin tone,
eye detection, and motion) from non-social stimuli (i.e., toys, based on saturated color and
motion), and to interact with each in interesting ways (often modulated by the distance of the
target stimulus to the robot). There are a few perceptual abilities that serve self-protection
responses. These include detecting looming stimuli as well as potentially dangerous stimuli
(characterized by excessive motion close to the robot). We have previously reported an
overview of Kismet’s visual abilities (Breazeal et al., 2000; Breazeal & Scassellati, 1999a,b).
Kismet’s low-level visual features are as follows (in parentheses, I gratefully acknowledge
my colleagues who have implemented these perceptual abilities on Kismet):
• Highly saturated color: red, blue, green, yellow (B. Scassellati)
• Colors representative of skin tone (P. Fitzpatrick)
• Motion detection (B. Scassellati)
• Eye detection (A. Edsinger)
• Distance to target (P. Fitzpatrick)
• Looming (P. Fitzpatrick)
• Threatening, very close, excessive motion (P. Fitzpatrick)
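
To make these filters concrete, here is a minimal sketch of how three of the features listed above might be computed on a single video frame: a saturated-color filter (for toy-like stimuli), a skin-tone filter (for people), and frame-differencing motion. It is an illustration only; the thresholds and the RGB skin-tone rule (a widely used published heuristic) are my assumptions, not Kismet's actual implementation.

    # A sketch of three low-level visual filters, in Python with NumPy.
    # Thresholds and the RGB skin-tone rule are illustrative assumptions,
    # not Kismet's actual parameters.
    import numpy as np

    def saturated_color_mask(rgb, sat_threshold=0.5):
        """Mask pixels with high color saturation (toy-like stimuli).

        rgb: H x W x 3 float array in [0, 1]. Saturation is computed as
        in the HSV model: (max - min) / max over the three channels.
        """
        mx = rgb.max(axis=2)
        mn = rgb.min(axis=2)
        sat = (mx - mn) / np.maximum(mx, 1e-6)  # avoid divide-by-zero
        return sat > sat_threshold

    def skin_tone_mask(rgb):
        """Mask pixels matching a simple RGB skin-tone rule (people).

        Uses the common heuristic of Peer et al. on 0-255 values:
        R > 95, G > 40, B > 20, R > G, R > B, and R - G > 15.
        """
        r, g, b = (rgb[..., i] * 255 for i in range(3))
        return (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & (r - g > 15)

    def motion_mask(gray, prev_gray, diff_threshold=0.1):
        """Mask pixels that changed between consecutive grayscale frames."""
        return np.abs(gray - prev_gray) > diff_threshold

In this spirit, a looming detector would watch for rapid growth of a motion region over successive frames, and the threat detector for a large motion mask concentrated close to the robot.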
Low-Level Auditory Perception
Kismet’s low-level auditory perception system extracts a number of features that are also
useful for distinguishing people from other sound-emitting objects such as rattles and bells.
The software runs in real time and was developed at MIT by the Spoken Language Systems
Group (www.sls.lcs.mit.edu/sls). Jim Glass and Lee Hetherington were tremendously
helpful in tailoring the code for Kismet’s specific needs and in helping port this sophisticated
speech recognition system to Kismet. The software delivers a variety of information that is
used to distinguish speech-like sounds from non-speech sounds, to recognize vocal affect,
and to regulate vocal turn-taking behavior. The phonemic information may ultimately be
used to shape the robot’s own vocalizations during imitative vocal games, and to enable
the robot to acquire a proto-language from long-term interactions with human caregivers.
Kismet’s low-level auditory features are as follows:
• Sound present
• Speech present
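
As a concrete illustration of these two features, the sketch below computes "sound present" from short-time energy and a crude "speech present" flag from the zero-crossing rate, which tends to fall in a moderate band for voiced speech and higher for noise-like sounds such as rattles. This is a minimal stand-in written in Python/NumPy; the frame sizes and thresholds are assumptions, and Kismet relied on the SLS group's full speech front end rather than a heuristic like this.

    # A sketch of two auditory features, in Python with NumPy:
    # "sound present" via short-time energy, and a crude "speech
    # present" test via zero-crossing rate. Frame sizes and thresholds
    # are illustrative assumptions, not the SLS front end's values.
    import numpy as np

    def frame_signal(samples, frame_len=512, hop=256):
        """Slice a 1-D audio signal into overlapping frames."""
        n = 1 + max(0, (len(samples) - frame_len) // hop)
        return np.stack([samples[i * hop : i * hop + frame_len]
                         for i in range(n)])

    def sound_present(frames, energy_floor=1e-3):
        """Flag frames whose mean squared amplitude exceeds a floor."""
        return (frames ** 2).mean(axis=1) > energy_floor

    def speech_like(frames, zcr_low=0.02, zcr_high=0.25):
        """Flag frames whose zero-crossing rate is in a speech-like band."""
        zcr = (np.abs(np.diff(np.sign(frames), axis=1)) > 0).mean(axis=1)
        return (zcr > zcr_low) & (zcr < zcr_high)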

