Page 122 - The Master Handbook Of Acoustics
SPEECH, MUSIC, AND NOISE
[Spectrogram: frequency (0 to 6000 Hz) versus time (0 to 2.0 seconds) for the spoken sentence "Should we chase those young outlaw cowboys"]
FIGURE 5-6
Sound spectrogram of a sentence spoken by a male voice. AT&T Bell Laboratories.
the analog Voder from Bell Laboratories that was demonstrated at the
World's Fairs in New York (1939) and in San Francisco (1940). It took a
year to train operators to play the machine to produce simple but
recognizable speech.
Digital Speech Synthesis
Techniques for storing human speech in computer memory and playing
it back under specified, fixed conditions are widely used. Electronic
machines of this type now talk to us in the form of language translators,
talking calculators, spelling machines, and telephone-information
services. We will be seeing (rather, hearing) a stream of other answer-back
applications of this technique in the days ahead, some based on
storage and recall and others on true speech synthesis.
To program a computer to talk, a model of speech production is
necessary, and the models of Figs. 5-3, 5-4, and 5-5 have been applied
in just this way. Figure 5-7 shows a diagram of a digital synthesis
system. A random-number generator produces the digital equivalent of
the s-like sounds for the unvoiced components. A counter produces
pulses that simulate the sound pulses of the vocal cords for the voiced
components. Both sources are shaped by time-varying digital filters
that simulate the ever-changing resonances of the vocal tract. Special
signals control each of these elements to form digitized speech, which
is then changed to analog form in the digital-to-analog converter.
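The signal flow just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the system of Fig. 5-7. The sample rate, the frame format, the formant frequencies and bandwidths, and all function names are assumptions chosen for the example. The vocal-tract filter is approximated by a cascade of standard two-pole digital resonators whose coefficients change from one control frame to the next, and the final step only quantizes the samples to integer codes of the kind a digital-to-analog converter would accept.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed value for illustration


def pulse_source(n_samples, pitch_hz, sample_rate=SAMPLE_RATE):
    """Counter-driven pulse train standing in for vocal-cord pulses."""
    period = max(1, round(sample_rate / pitch_hz))
    out = []
    counter = 0
    for _ in range(n_samples):
        out.append(1.0 if counter == 0 else 0.0)  # one pulse per period
        counter = (counter + 1) % period
    return out


def noise_source(n_samples, rng):
    """Random numbers standing in for the s-like unvoiced sounds."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator_coeffs(center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator y[n] = g*x[n] + b1*y[n-1] + b2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius < 1
    theta = 2.0 * math.pi * center_hz / sample_rate      # pole angle
    b1 = 2.0 * r * math.cos(theta)
    b2 = -r * r
    g = 1.0 - b1 - b2  # normalizes the gain to 1 at 0 Hz
    return g, b1, b2


def synthesize(frames, sample_rate=SAMPLE_RATE, seed=0):
    """Run each control frame's excitation through the resonator cascade."""
    rng = random.Random(seed)
    # One [y[n-1], y[n-2]] memory per resonator, kept across frames so the
    # filter varies in time without restarting at each frame boundary.
    state = [[0.0, 0.0] for _ in frames[0]["formants"]]
    out = []
    for frame in frames:
        n = frame["duration_ms"] * sample_rate // 1000
        if frame["voiced"]:
            excitation = pulse_source(n, frame["pitch_hz"], sample_rate)
        else:
            excitation = noise_source(n, rng)
        coeffs = [resonator_coeffs(f, bw, sample_rate)
                  for f, bw in frame["formants"]]
        for x in excitation:
            for (g, b1, b2), s in zip(coeffs, state):
                y = g * x + b1 * s[0] + b2 * s[1]
                s[1], s[0] = s[0], y
                x = y  # output of one resonator feeds the next
            out.append(x)
    return out


def quantize(samples, bits=8):
    """Scale to signed integer codes, as would be sent to a DAC."""
    peak = max(abs(s) for s in samples) or 1.0
    full_scale = 2 ** (bits - 1) - 1
    return [round(s / peak * full_scale) for s in samples]


# Hypothetical control frames: an unvoiced burst, then a voiced segment.
# The (frequency, bandwidth) pairs are illustrative, not measured values.
control_frames = [
    {"voiced": False, "duration_ms": 50,
     "formants": [(2500.0, 300.0), (3500.0, 400.0)]},
    {"voiced": True, "pitch_hz": 120.0, "duration_ms": 100,
     "formants": [(700.0, 130.0), (1200.0, 70.0)]},
]

dac_codes = quantize(synthesize(control_frames))
print(len(dac_codes), "samples ready for the converter")
```

The control frames play the role of the "special signals": each one selects the source (noise or pulses), sets the pulse rate, and retunes the resonators. A practical synthesizer would update these parameters far more often and more smoothly than this two-frame example does.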