
                             and suggested theories (reviews of about 60 years of research can be found in
                             [2, 11]). On the other hand, AI researchers have made contributions in the fol-
                             lowing areas: emotional speech synthesis [3, 9], recognition of emotions [5],
                             and using agents for decoding and expressing emotions [12].

                             2.     Motivation
The project is motivated by the question of how recognition of emotions
in speech could be used in business. A potential application is detecting
the emotional state in telephone call center conversations and providing
feedback to an operator or a supervisor for monitoring purposes. Another
application is sorting voice mail messages according to the emotions
expressed by the caller.
Given this orientation, we solicited data for this study from people who are
not professional actors or actresses. We focused on negative emotions such as
anger, sadness, and fear. We targeted telephone-quality speech (less than
3.4 kHz) and relied on the voice signal alone, which means that we excluded
modern speech recognition techniques. There are several reasons for this.
First, in speech recognition emotions are treated as noise that decreases
recognition accuracy. Second, although some words and phrases are correlated
with particular emotions, the situation is usually much more complex, and the
same word or phrase can express the whole spectrum of emotions. Third, speech
recognition techniques require much higher signal quality and more
computational power.
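For illustration, the telephone-quality constraint can be approximated on wideband recordings by low-pass filtering and resampling. The sketch below is a minimal example only; the function name, the 8 kHz target rate, and the use of scipy are assumptions made for this sketch rather than details from the study.

# Hypothetical sketch: band-limiting a wideband recording to telephone quality.
# Assumes a mono floating-point waveform `signal` sampled at `orig_rate` Hz;
# the 3.4 kHz cutoff and 8 kHz target rate follow standard telephony practice.
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly

def to_telephone_quality(signal, orig_rate, target_rate=8000, cutoff_hz=3400):
    # Low-pass at 3.4 kHz to discard spectral content a telephone channel
    # would not carry (and to avoid aliasing before resampling).
    sos = butter(8, cutoff_hz, btype="low", fs=orig_rate, output="sos")
    filtered = sosfiltfilt(sos, signal)
    # Resample to the 8 kHz rate typical of telephone speech.
    return resample_poly(filtered, target_rate, orig_rate)

# Example: simulate telephone-quality speech from a 44.1 kHz studio recording.
studio = np.random.randn(44100)          # stand-in for one second of audio
phone = to_telephone_quality(studio, 44100)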
To achieve our objectives we decided to proceed in two stages: research and
development. The objectives of the first stage are to learn how well people
recognize emotions in speech, to find out which features of the speech signal
could be useful for emotion recognition, and to explore different mathematical
models for creating reliable recognizers. The objective of the second stage is
to create a real-time recognizer for call center applications.
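As a rough illustration of what the first-stage experiments involve, the sketch below summarizes each utterance with simple prosodic statistics (frame energy and a crude autocorrelation pitch estimate) and trains an off-the-shelf classifier. The particular features, frame sizes, and the use of scikit-learn are assumptions made for this sketch, not the study's actual recipe.

# Hypothetical sketch of the research-stage pipeline: utterance-level prosodic
# statistics fed to a standard classifier.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def frame(signal, size=400, hop=160):
    """Split a waveform into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(signal) - size) // hop)
    return np.stack([signal[i * hop:i * hop + size] for i in range(n)])

def pitch_estimate(fr, rate=16000, fmin=60, fmax=400):
    """Very rough per-frame pitch via the autocorrelation peak."""
    ac = np.correlate(fr, fr, mode="full")[len(fr) - 1:]
    lo, hi = rate // fmax, rate // fmin
    lag = lo + np.argmax(ac[lo:hi])
    return rate / lag

def features(signal, rate=16000):
    """Summarize one utterance as a small fixed-length feature vector."""
    frames = frame(signal)
    energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    pitch = np.array([pitch_estimate(f, rate) for f in frames])
    stats = lambda x: [x.mean(), x.std(), x.min(), x.max()]
    return np.array(stats(energy) + stats(pitch))

# Example: train and test a nearest-neighbour recognizer on labelled
# utterances (random placeholder data stands in for the recorded corpus).
rng = np.random.default_rng(0)
utterances = [rng.standard_normal(16000) for _ in range(20)]
labels = rng.integers(0, 5, size=20)          # e.g. five emotion categories
X = np.array([features(u) for u in utterances])
clf = KNeighborsClassifier(n_neighbors=3).fit(X[:15], labels[:15])
print(clf.score(X[15:], labels[15:]))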

                             3.     Research

For the first stage we had to create and evaluate a corpus of emotional data,
evaluate how well people recognize the emotions in it, and select data for
machine learning. We decided to use high-quality speech data for this stage.

                             3.1     Corpus of Emotional Data

                               We asked thirty of our colleagues to record the following four short sen-
                             tences: “This is not what I expected”, “I’ll be right there”, “Tomorrow is my
                             birthday”, and “I’m getting married next week.” Each sentence was recorded
                             by every subject five times; each time, the subject portrayed one of the follow-