Page 202 - Biomimetics : Biologically Inspired Technologies

Bar-Cohen : Biomimetics: Biologically Inspired Technologies DK3163_c006 Final Proof page 188 21.9.2005 2:56am





                    6.2.6 Motion Conclusion Summary

                    Physically actuated embodiment is certainly essential for robots to interface with the real world.
                    Additionally, many roboticists argue that interaction with the complex, nonlinear real world is
                    important for the formation of intelligence (Brooks, 1991). Whether or not that is true, intelligent
                    and perceptive systems are of great importance for robots to be effective in the real world.


                                            6.3  BEHAVIOR, EXPRESSIVITY

                    6.3.1 Intelligence and Perception

                    The emulation of human and animal central nervous systems (CNS) stands as the most challenging
                    domain of bio-inspired robotics. While neuroscience is deciphering the mysteries of mind at
                    unprecedented rates, thanks largely to novel imaging techniques such as fMRI, many components
                    of machine perception and intelligence are coming into functional maturity. Though not nearly
                    as capable as humans, many “human-emulation” technologies have advanced substantially in the last
                    decade, showing remarkable gains in functionality, including face tracking, feature tracking, visual
                    biometric identification, bipedal locomotion, and semantically rich natural language processing
                    (NLP) (Kurzweil, 1999; Menzel, 2001; Bar-Cohen and Breazeal, 2003). With these tools, we can
                    sketch crude models of simulated mind in technological media. The emphasis, however, is on the
                    word “crude”: it must be acknowledged that most of the mysteries of the CNS are well beyond
                    science at this time.
                       Accordingly, machine intelligence is decidedly below that of most animals and certainly
                    below that of humans. But our machines must be judged by their own standards. After all, a
                    machine can understand speech better than a dog can, and what’s more, the machine can speak
                    back. Many of the intelligent and perceptive systems available today have yet to be integrated
                    into a functional whole. This section first considers intelligent systems as parts, and then
                    discusses their integration into a systematically emulated animal intelligence, with a focus on
                    social intelligence.

                    6.3.1.1 Language, Ontologies, Top-Down

                    At the foundation of human–machine language interaction lie automatic speech recognition (ASR),
                    automated speech synthesis (ASS), and various approaches to NLP. Although still capable of only
                    rudimentary language interactions, machine language technology has recently shown a remarkable
                    rate of progress, both in academic research and in deployed speech solutions.
                       For many years, basic speech recognition and synthesis were major obstacles even to the most
                    elementary human–computer language interactions. However, progress in the late 1980s and 1990s
                    led to a large number of deployed speech applications, ranging from dictation software such as
                    IBM’s ViaVoice to natural language ticketing and customer service agents, such as those offered by
                    AT&T. Commercial speech applications now on the market include those from SpeechWorks, Sensory,
                    and Nuance, as well as Dragon NaturallySpeaking. Another highly effective system is available to
                    researchers as open source: Carnegie Mellon’s Sphinx is highly functional, robust,
                    user-independent ASR software (Carnegie Mellon, website, 2002).
                       Several common features make these systems behave rather naturally. The “barge-in” capability
                    allows users to interrupt the system and still have their speech recognized. “Rejection and
                    keyword spotting” recognizes a speaker’s keywords without prompts. Using Bayesian analysis,
                    “N-best” sorts through the possibilities of what a speaker might have said to locate the correct
                    word, while “N-gram” statistical language modeling supports a sizable vocabulary and natural
                    language recognition.
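                       As an illustration only, the interplay of N-best selection and N-gram language modeling can
                    be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any
                    product named above: the toy corpus, the acoustic log-scores, the add-one smoothing, and the
                    function names (train_bigram, log_prob, rerank) are all hypothetical. Each candidate
                    transcription is scored by its acoustic log-score plus a bigram log-probability, which mirrors
                    the Bayesian decomposition of likelihood times prior.

```python
import math
from collections import Counter

# Toy training text (hypothetical); a real system would use a large corpus.
CORPUS = [
    "book a ticket to boston",
    "book a flight to boston",
    "i want a ticket to austin",
]

def train_bigram(sentences):
    """Count unigrams and bigrams, using <s> and </s> as sentence markers."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a word sequence."""
    vocab = len(unigrams)
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(words, words[1:])
    )

def rerank(n_best, unigrams, bigrams, lm_weight=1.0):
    """Sort hypotheses by acoustic log-score plus weighted LM log-probability."""
    scored = [
        (acoustic + lm_weight * log_prob(hyp, unigrams, bigrams), hyp)
        for hyp, acoustic in n_best
    ]
    return [hyp for _, hyp in sorted(scored, reverse=True)]

UNIGRAMS, BIGRAMS = train_bigram(CORPUS)

# Hypothetical N-best list: (transcription, acoustic log-score).
# The acoustically best candidate contains a homophone error ("two").
N_BEST = [
    ("book a ticket two boston", -4.0),
    ("book a ticket to boston", -4.2),
    ("look a ticket to boston", -4.1),
]

BEST = rerank(N_BEST, UNIGRAMS, BIGRAMS)[0]
print(BEST)
```

                       In this toy case the language model outweighs the small acoustic advantage of the homophone
                    error, so the candidate containing “to boston” is ranked first. The lm_weight parameter is an
                    assumed tuning knob for balancing the two scores.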
                       Word recognition and synthesis are only the first step toward endowing machines with humanlike
                    language intelligence. Text-to-speech (TTS) software outputs increasingly natural-sounding speech,
                    with off-the-shelf solutions including Rhetorical, Elan, Nuance, and the open-source Festival.