Page 83 - Designing Sociable Robots
                       is first normalized by the luminance l (a weighted average of the three input color channels):

                           r_n = 255·r / (3·l)     g_n = 255·g / (3·l)     b_n = 255·b / (3·l)     (6.1)
                         These normalized color channels are then used to produce four opponent-color channels:

                           r = r_n − (g_n + b_n)/2                                            (6.2)

                           g = g_n − (r_n + b_n)/2                                            (6.3)

                           b = b_n − (r_n + g_n)/2                                            (6.4)

                           y = (r_n + g_n)/2 − b_n − |r_n − g_n|                              (6.5)
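The normalization and opponent-channel stages above can be sketched as a short NumPy function. This is an illustrative reconstruction, not the original implementation: the function name, the equal-weight choice for the luminance average (the text says only "a weighted average"), and the small epsilon guarding against zero luminance are all assumptions.

```python
import numpy as np

def opponent_color_channels(rgb, eps=1e-6):
    """Compute the four opponent-color channels of equations 6.1-6.5.

    `rgb` is an (H, W, 3) uint8 image.  The equal channel weights for
    luminance and the `eps` divide-by-zero guard are assumptions.
    """
    r, g, b = [rgb[..., i].astype(np.float64) for i in range(3)]

    # Luminance: assumed here to be the unweighted mean of the channels.
    l = (r + g + b) / 3.0 + eps

    # Equation 6.1: normalize each channel by luminance.
    rn = (255.0 / 3.0) * r / l
    gn = (255.0 / 3.0) * g / l
    bn = (255.0 / 3.0) * b / l

    # Equations 6.2-6.5: red, green, blue, and yellow opponent channels.
    r_op = rn - (gn + bn) / 2.0
    g_op = gn - (rn + bn) / 2.0
    b_op = bn - (rn + gn) / 2.0
    y_op = (rn + gn) / 2.0 - bn - np.abs(rn - gn)

    # Clamp each channel to 8-bit values by thresholding.
    return [np.clip(c, 0, 255).astype(np.uint8)
            for c in (r_op, g_op, b_op, y_op)]
```

A saturated red pixel drives the red opponent channel toward 255 while the green and yellow channels clamp to zero, which is what makes bright, saturated regions stand out in the combined feature map.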
                         The four opponent-color channels are clamped to 8-bit values by thresholding. While
                       some research seems to indicate that each color channel should be considered individually
                       (Nothdurft, 1993), Scassellati chose to maintain all of the color information in a single fea-
                       ture map to simplify the processing requirements (as does Wolfe [1994] for more theoretical
                       reasons). The result is a two-dimensional map where pixels containing a bright, saturated
                       color component (red, green, blue, and yellow) have a greater intensity value. Kismet is
                       particularly sensitive to bright red, green, yellow, blue, and even orange. Figure 6.1 gives
                       an example of the color feature map when the robot looks at a brightly colored block.
                       Motion saliency feature maps In parallel with the color saliency computations, a second
                       processor receives input images from the frame grabber and computes temporal differences
                       to detect motion. Motion detection is performed on the wide FoV camera, which is often at
                       rest since it does not move with the eyes. The incoming image is converted to grayscale and
                       placed into a ring of frame buffers. A raw motion map is computed by passing the absolute
                       difference between consecutive images through a threshold function T:

                           M_raw = T(|I_t − I_{t−1}|)                                         (6.6)
                         This raw motion map is then smoothed with a uniform 7 × 8 field. The result is a
                       binary 2-D map where regions corresponding to motion have a high intensity value. The
                       motion saliency feature map is computed at 25-30 Hz by a single 400 MHz processor node.
                       Figure 6.1 gives an example of the motion feature map when the robot looks at a toy block
                       that is being shaken.
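The frame-differencing step of equation 6.6 can be sketched as follows. This is a minimal illustration, not the original code: the threshold value, the ring-buffer depth, and the use of a `deque` for the ring of frame buffers are all assumptions, and the uniform smoothing pass described in the text is omitted for brevity.

```python
from collections import deque

import numpy as np

def raw_motion_map(frame, buffers, threshold=20, history=4):
    """Frame-differencing motion detector (equation 6.6).

    `frame` is a grayscale (H, W) uint8 image; `buffers` is a deque
    serving as the ring of frame buffers.  The threshold and ring depth
    are illustrative values.  Returns a binary map where regions
    corresponding to motion have value 255.
    """
    if not buffers:
        # No previous frame yet: no motion can be detected.
        buffers.append(frame)
        return np.zeros_like(frame)

    # Absolute difference between consecutive images, passed through
    # the threshold function T.
    diff = np.abs(frame.astype(np.int16) - buffers[-1].astype(np.int16))
    m_raw = np.where(diff > threshold, 255, 0).astype(np.uint8)

    # Advance the ring of frame buffers.
    buffers.append(frame)
    if len(buffers) > history:
        buffers.popleft()
    return m_raw
```

Because only a subtraction, a threshold, and a smoothing pass are involved, the map is cheap enough to recompute at frame rate, consistent with the 25-30 Hz figure quoted above.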
                       Skin tone feature map  The system also filters for colors consistent with skin. This is a
                       computationally inexpensive means of ruling out regions that are unlikely to contain faces or