activation level of the location currently being attended to, strengthening bias toward other locations of lesser activation.

The habituation function can be viewed as a feature map that initially maintains eye fixation by increasing the saliency of the center of the field of view and then slowly decays the saliency values of central objects until a salient off-center object causes the neck to move. The habituation function is a Gaussian field G(x, y) centered in the field of view with peak amplitude of 255 (to remain consistent with the other 8-bit values) and σ = 50 pixels. It is combined linearly with the other feature maps using the weight

w = W · max(−1, 1 − Δt/τ)    (6.7)

where w is the weight, Δt is the time since the last habituation reset, τ is a time constant, and W is the maximum habituation gain. Whenever the neck moves, the habituation function is reset, forcing w to W and amplifying the saliency of central objects until a time τ when w = 0 and there is no influence from the habituation map. As time progresses, w decays to a minimum value of −W, which suppresses the saliency of central objects. In the current implementation, a value of W = 10 and a time constant τ = 5 seconds are used. When the robot's neck shifts, the habituation map is reset, allowing that region to be revisited after some period of time.
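As a concrete illustration of how the habituation map and the weight of equation (6.7) interact, the following sketch computes both. It is a minimal illustration, not Kismet's actual implementation; the map size and function names are assumptions, while the constants (peak amplitude 255, σ = 50 pixels, W = 10, τ = 5 seconds) come from the text above.

    import numpy as np

    W = 10.0    # maximum habituation gain (from the text)
    TAU = 5.0   # time constant in seconds (from the text)

    def habituation_map(height=128, width=128, sigma=50.0):
        # Gaussian field G(x, y) centered in the field of view with a peak
        # amplitude of 255, consistent with the other 8-bit feature maps.
        # The 128x128 size is an assumption for illustration.
        y, x = np.mgrid[0:height, 0:width]
        cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
        return 255.0 * np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))

    def habituation_weight(dt):
        # Equation (6.7): w = W * max(-1, 1 - dt/tau).
        # A neck movement resets dt to 0, so w starts at +W (boosting the
        # central region), falls to 0 at dt = tau, and saturates at -W
        # (suppressing the central region) for dt >= 2 * tau.
        return W * max(-1.0, 1.0 - dt / TAU)

    # The weighted map enters the linear combination of feature maps:
    #     saliency += habituation_weight(dt) * habituation_map()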

6.2 Post-Attentive Processing


Once the attention system has selected regions of the visual field that are potentially behaviorally relevant, more intensive computation can be applied to these regions than could be applied across the whole field. Searching for eyes is one such task. Locating eyes is important for engaging in eye contact. Eyes are searched for after the robot directs its gaze to a locus of attention. By doing so, a relatively high-resolution image of the area being searched is available from the narrow FoV cameras (see figure 6.5).
Once the target of interest has been selected, its proximity to the robot is estimated using a stereo match between the two central wide FoV cameras. Proximity is an important factor for interaction: things closer to the robot should be of greater interest. It is also useful for interaction at a distance. For instance, a person standing too far from Kismet for face-to-face interaction may be close enough to be beckoned closer. Clearly, the relevant behavior (beckoning or playing) depends on the proximity of the human to the robot.
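The text does not spell out the stereo computation, but the standard pinhole-stereo relation, distance = (focal length × baseline) / disparity, conveys the idea. The sketch below applies that relation and then gates behavior on distance; the focal length, baseline, and range thresholds are invented for illustration and are not Kismet's actual parameters.

    FOCAL_PX = 320.0     # focal length in pixels (assumed, not from the text)
    BASELINE_M = 0.10    # separation of the two wide FoV cameras (assumed)

    def depth_from_disparity(disparity_px):
        # Pinhole stereo: depth = f * B / d. Zero or negative disparity
        # means no reliable match, treated here as "very far away."
        if disparity_px <= 0:
            return float("inf")
        return FOCAL_PX * BASELINE_M / disparity_px

    def select_behavior(distance_m, play_range=1.0, beckon_range=3.0):
        # Proximity gates the relevant behavior, as the text describes:
        # close enough for face-to-face play, farther but still close
        # enough to beckon, or out of range. Thresholds are illustrative.
        if distance_m <= play_range:
            return "play"
        if distance_m <= beckon_range:
            return "beckon"
        return "ignore"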
Eye detection   Detecting people's eyes in a real-time robotic domain is computationally expensive and prone to error due to the large variance in head posture, lighting conditions, and feature scales. Aaron Edsinger developed an approach based on successive feature extraction, combined with some inherent domain constraints, to achieve a robust and fast