Page 122 - The Master Handbook Of Acoustics
                                                                              SPEECH, MUSIC, AND NOISE




                      FIGURE 5-6
                      Sound spectrogram of the sentence "Should we chase those young outlaw
                      cowboys," spoken by a male voice. Vertical axis: frequency, 0 to 6000 Hz.
                      Horizontal axis: time, 0 to 2.0 seconds. AT&T Bell Laboratories.


                      the analog Voder from Bell Laboratories, which was demonstrated at the
                      World's Fairs in New York (1939) and San Francisco (1940). It took a
                      year to train operators to play the machine well enough to produce
                      simple but recognizable speech.

                      Digital Speech Synthesis
                      Techniques for storing human speech in computer memory and playing
                      it back under specified, fixed conditions are widely used. Electrical
                      machines of this type now talk to us as language translators, talking
                      calculators, spelling machines, and telephone information services.
                      We will be seeing (rather, hearing) a stream of other answer-back
                      applications in the days ahead, some based on storage and recall and
                      others on true speech synthesis.
                         To program a computer to talk, a model of speech production is
                      necessary, and the models of Figs. 5-3, 5-4, and 5-5 have been applied
                      in just this way. Figure 5-7 shows a diagram of a digital synthesis
                      system. A random-number generator produces the digital equivalent of
                      the s-like sounds for the unvoiced components. A counter produces
                      pulses simulating the pulses of sound from the vocal cords for the
                      voiced components. Both sources are shaped by time-varying digital
                      filters that simulate the ever-changing resonances of the vocal tract.
                      Separate control signals govern each of these elements to form
                      digitized speech, which is then changed to analog form in the
                      digital-to-analog converter.
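
                         The signal flow just described can be sketched in a few lines of
                      Python. This is only an illustration under stated assumptions, not
                      the system of Fig. 5-7: the 8000-Hz sample rate, the single two-pole
                      resonator, the 100-Hz bandwidth, the 16-bit output scaling, and every
                      function and parameter name are hypothetical choices made for the
                      example. The integer samples it returns are what would be handed to
                      a digital-to-analog converter.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed for this sketch, not taken from the text


def pulse_source(n_samples, pitch_hz, sample_rate=SAMPLE_RATE):
    """Counter-driven pulse train standing in for vocal-cord pulses."""
    period = max(1, round(sample_rate / pitch_hz))  # samples per pitch period
    out = []
    counter = 0
    for _ in range(n_samples):
        out.append(1.0 if counter == 0 else 0.0)  # one pulse each time the counter wraps
        counter = (counter + 1) % period
    return out


def noise_source(n_samples, rng):
    """Random-number generator standing in for unvoiced, s-like sounds."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(signal, center_hz_track, bandwidth_hz=100.0, sample_rate=SAMPLE_RATE):
    """Two-pole digital filter whose resonant frequency can change every sample."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    y1 = y2 = 0.0
    out = []
    for x, center_hz in zip(signal, center_hz_track):
        theta = 2.0 * math.pi * center_hz / sample_rate
        # y[n] = g*x[n] + 2r*cos(theta)*y[n-1] - r^2*y[n-2]; g is a rough scale factor
        y = (1.0 - r) * x + 2.0 * r * math.cos(theta) * y1 - r * r * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(segments, seed=0, sample_rate=SAMPLE_RATE):
    """Turn control tuples into 16-bit integer samples.

    Each segment is (voiced, pitch_hz, resonance_start_hz, resonance_end_hz, seconds).
    The filter state restarts at each segment, a simplification of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the noise is repeatable
    samples = []
    for voiced, pitch_hz, f_start, f_end, seconds in segments:
        n = round(seconds * sample_rate)
        if voiced:
            source = pulse_source(n, pitch_hz, sample_rate)
        else:
            source = noise_source(n, rng)
        # Linear glide of the resonance across the segment
        track = [f_start + (f_end - f_start) * i / max(1, n - 1) for i in range(n)]
        samples.extend(resonator(source, track, sample_rate=sample_rate))
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    return [round(32767 * s / peak) for s in samples]  # scale to the 16-bit range


# Control signals: a voiced sound with a rising resonance, then an unvoiced hiss.
controls = [
    (True, 120.0, 500.0, 900.0, 0.20),
    (False, 0.0, 3500.0, 3500.0, 0.10),
]
pcm = synthesize(controls)
```

                         A practical synthesizer would use several resonators to model
                      the several resonances of the vocal tract, and would update the
                      control signals many times per second. The sketch uses one resonator
                      only to keep the source-and-filter structure easy to follow.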