Page 97 - Designing Sociable Robots






                       78                                                               Chapter 6





                       proximity to the robot should be more salient than those farther away. A stereo map would
                       also be very useful for scene segmentation, separating stimuli of interest from the background.
                       This can be accomplished by using the two central wide FoV cameras.
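
One way to picture the proximity idea is the minimal sketch below. It assumes a per-pixel stereo disparity map is already available (larger disparity meaning closer to the cameras). The function names, the `floor` gain, and the `background_below` threshold are hypothetical choices for illustration, not part of Kismet's implementation.

```python
def proximity_gain(disparity, max_disparity, floor=0.2):
    """Map a stereo disparity (larger = closer) to a gain in [floor, 1]."""
    if max_disparity <= 0:
        raise ValueError("max_disparity must be positive")
    nearness = max(0.0, min(1.0, disparity / max_disparity))
    return floor + (1.0 - floor) * nearness


def weight_by_proximity(saliency, disparity, max_disparity, background_below=None):
    """Scale each saliency value by its proximity gain.

    saliency and disparity are equally sized lists of rows. If
    background_below is given, pixels with smaller disparity are treated
    as background and zeroed (a crude form of scene segmentation).
    """
    weighted = []
    for s_row, d_row in zip(saliency, disparity):
        row = []
        for s, d in zip(s_row, d_row):
            if background_below is not None and d < background_below:
                row.append(0.0)
            else:
                row.append(s * proximity_gain(d, max_disparity))
        weighted.append(row)
    return weighted
```

With equal raw saliency everywhere, the nearest stimulus ends up with the largest weighted value, and anything below the disparity threshold drops out as background.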
                         Another interesting feature map to incorporate would be edge orientation. Wolfe,
                       Treisman, and others argue in favor of edge orientation as a bottom-up feature map in
                       humans. Currently, Kismet has no shape metrics to help it distinguish objects from each
                       other (such as its block from its dinosaur). Adding features to support this is an important
                       extension to the existing implementation.
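
As a rough illustration of what an edge orientation feature map could look like, the sketch below bins Sobel gradient responses into a small number of orientation bands. Sobel filtering is a simple stand-in chosen here for brevity (saliency models often use oriented Gabor filters instead), and the function name and the choice of four bands are assumptions.

```python
import math


def orientation_maps(gray, bins=4):
    """Return `bins` maps of edge strength, one per edge-orientation band.

    gray is a list of rows of floats. Band 0 holds horizontal edges and,
    with four bands, band 2 holds vertical edges. Border pixels stay zero.
    """
    h, w = len(gray), len(gray[0])
    maps = [[[0.0] * w for _ in range(h)] for _ in range(bins)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Sobel estimates of the horizontal and vertical intensity gradient.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # An edge runs perpendicular to its gradient; fold into [0, pi).
            theta = (math.atan2(gy, gx) + math.pi / 2.0) % math.pi
            band = int(round(theta / (math.pi / bins))) % bins
            maps[band][y][x] = magnitude
    return maps
```

Each band could then be normalized and weighted like any other bottom-up feature map, giving the attention system a crude shape cue.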
                         There are no auditory bottom-up contributions. A sound localization feature map would
                       be a nice multi-modal extension (Irie, 1995). Currently, Kismet assumes that the most salient
                       person is the one who is talking to it. Often, however, several people are talking around
                       the robot as well as to it. It is important that the robot knows who is addressing it and when. Sound
                       localization would be of great benefit here. Fortunately, there are stereo microphones on
                       Kismet’s ears that could be used for this purpose.
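
A minimal sketch of how two microphones could yield a bearing estimate is given below. It uses the interaural time difference found by brute-force cross-correlation, which is a textbook technique and not necessarily the method of the cited work. The function names, the far-field assumption, and the sign convention (positive bearing toward the left microphone) are assumptions made for this example.

```python
import math


def best_lag(left, right, max_lag):
    """Return the lag in samples by which `right` trails `left`.

    The lag is chosen to maximize the cross-correlation of the two
    equally long signals over the range [-max_lag, max_lag].
    """
    n = len(left)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_deg(left, right, sample_rate, mic_spacing_m, speed_of_sound=343.0):
    """Estimate the source bearing in degrees (positive = toward the left mic)."""
    # No physical source can produce a delay longer than spacing / speed.
    max_lag = int(math.ceil(mic_spacing_m / speed_of_sound * sample_rate))
    lag = best_lag(left, right, max_lag)
    sine = (lag / sample_rate) * speed_of_sound / mic_spacing_m
    sine = max(-1.0, min(1.0, sine))  # guard against rounding past +/-1
    return math.degrees(math.asin(sine))
```

A bearing like this could be projected into the same coordinate frame as the visual feature maps, so that the person who is speaking gains salience over bystanders.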
                         Another interesting extension would be to separate the color saliency map into individual
                       color feature maps. Kismet can preferentially direct its attention to saturated color, but not
                       specifically to green, blue, red, or yellow. Humans are capable of directing search based on
                       a specific color channel. Although Kismet has access to the average r, g, b, y components
                       of the target stimulus, it would be nice if it could keep these colors segmented (so that it
                       can distinguish a blue circle on a green background, for instance). Computing individual
                       color feature maps would be a step towards these extensions.
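
The sketch below shows one way to compute separate red, green, blue, and yellow feature maps. It uses broadly tuned color channels of the kind found in some computational saliency models; whether these formulas match Kismet's own r, g, b, y computation is an assumption, as are the function names.

```python
def color_channels(r, g, b):
    """Return non-negative (red, green, blue, yellow) responses for one pixel.

    Inputs are intensities in [0, 1]. Each channel responds most strongly
    to its own hue and is clipped at zero for other hues.
    """
    red = max(0.0, r - (g + b) / 2.0)
    green = max(0.0, g - (r + b) / 2.0)
    blue = max(0.0, b - (r + g) / 2.0)
    yellow = max(0.0, (r + g) / 2.0 - abs(r - g) / 2.0 - b)
    return red, green, blue, yellow


def color_feature_maps(image):
    """Split an image (rows of (r, g, b) tuples) into four named feature maps."""
    names = ("red", "green", "blue", "yellow")
    maps = {name: [] for name in names}
    for row in image:
        responses = [color_channels(*pixel) for pixel in row]
        for index, name in enumerate(names):
            maps[name].append([response[index] for response in responses])
    return maps
```

Because each map is kept separate, a blue circle on a green background shows up as a region in the blue map surrounded by activity in the green map, and a search for one specific color can simply raise the gain on that map.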
                         Currently, nothing modifies the decay rate of the habituation feature map. The
                       habituation contribution implements a primitive attention span for the robot. It would be an
                       interesting extension to have motivational factors, such as fatigue or arousal, influence the
                       habituation decay rate. Caregivers continually adjust the arousal level of their infant so that
                       the infant remains alert but not too excited (Bullowa, 1979). For Kismet, it would be interesting
                       if the human could adjust the robot’s attention span by keeping it at a moderate arousal
                       level. This could benefit the robot’s learning rate by maintaining a longer attention span
                       when people are around and the robot is engaged in interactions with high learning potential.
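
To make the proposed coupling concrete, the sketch below lets an arousal value in [0, 1] set the habituation decay rate, with the slowest decay (and so the longest attention span) at moderate arousal. The quadratic form, all parameter values, and the function names are illustrative assumptions. The habituation contribution is modeled here simply as a value that starts positive and decays toward a negative floor.

```python
def habituation_decay_rate(arousal, base_rate=0.5, optimal=0.5, penalty=4.0):
    """Return a decay rate that is lowest when arousal equals `optimal`."""
    deviation = arousal - optimal
    return base_rate * (1.0 + penalty * deviation * deviation)


def step_habituation(value, arousal, dt, floor=-1.0):
    """Advance the habituation contribution by one time step of length dt.

    The value relaxes exponentially toward `floor`; the step is an explicit
    Euler update with the step fraction clamped to at most 1.
    """
    fraction = min(1.0, habituation_decay_rate(arousal) * dt)
    return value + (floor - value) * fraction


def attention_span(arousal, dt=0.05, start=1.0):
    """Return the time until the contribution first drops to zero or below."""
    value, elapsed = start, 0.0
    while value > 0.0:
        value = step_habituation(value, arousal, dt)
        elapsed += dt
    return elapsed
```

Under these assumptions, a caregiver who keeps the robot near the optimal arousal level lengthens the time before the robot's interest in the current target runs out.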
                         Kismet’s visual perceptual world consists only of what is in view of the cameras. Ultimately,
                       the robot should be able to construct an ego-centered saliency map of interaction
                       space. In this representation, the robot could keep track of where interesting things are
                       located, even if they are not currently in view. This will prove to be a very important representation
                       for social referencing (Siegel, 1999). If Kismet could engage in social referencing,
                       then it could look to the human for the affective assessment and then back to the event that
                       it queried the caregiver about. Chances are, the event in question and the human’s face will
                       not be in view at the same time. Hence, a representation of where interesting things are,
                       even when out of view, is an important resource.
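
A minimal sketch of such an ego-centered representation follows. It stores each interesting item by its pan and tilt angle relative to the robot's body, so an item can be looked back at after it leaves the field of view. The class and method names, the square field of view, and the small-angle shortcut of adding gaze angles to image offsets are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Landmark:
    label: str
    pan_deg: float    # bearing relative to the body's forward axis
    tilt_deg: float   # elevation relative to the body's horizontal
    salience: float


class EgocentricMap:
    """Remembers where salient things are, whether or not they are in view."""

    def __init__(self, half_fov_deg=30.0):
        self.half_fov_deg = half_fov_deg
        self._items = {}

    def observe(self, label, gaze_pan, gaze_tilt, offset_pan, offset_tilt, salience):
        """Store an item seen at an angular offset from the current gaze."""
        self._items[label] = Landmark(
            label, gaze_pan + offset_pan, gaze_tilt + offset_tilt, salience)

    def in_view(self, label, gaze_pan, gaze_tilt):
        """Report whether a remembered item lies inside the current view."""
        item = self._items[label]
        return (abs(item.pan_deg - gaze_pan) <= self.half_fov_deg
                and abs(item.tilt_deg - gaze_tilt) <= self.half_fov_deg)

    def gaze_target(self, label):
        """Return the (pan, tilt) needed to look at a remembered item."""
        item = self._items[label]
        return item.pan_deg, item.tilt_deg
```

For social referencing, the robot could record the event, turn to the caregiver's face for the affective assessment, and then use `gaze_target` to return to the event even though the two were never in view together.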