IRTG Annual Meeting 2007

Making Synthetic Speech Output as Natural, Flexible and Efficient as Human Speech

Speaker: Alan W. Black

Institution: Carnegie Mellon University

Abstract:

As speech technology matures to the point where it is practical for human-machine communication, much greater demands are placed on the quality of the voice output. It is no longer sufficient simply to provide an understandable voice: communication demands an appropriate voice, in the appropriate language of course, but also in the right style and even with a particular identity.

This talk will present a body of work that describes the basic processes involved in building synthetic voices. Over the past 10 years we have developed core synthesis techniques, engines and tools to make the building process better defined and more successful. Using data-driven techniques, we have refined and optimized the processes of corpus-based synthesis itself, prompt selection, automatic labeling, lexicon construction, articulatory voice conversion and evaluation techniques. Our synthesizers, Festival and Flite, and the voices constructed with the FestVox tools have been used in a large number of speech applications, including spoken dialog systems, speech-to-speech translation, and talking heads.


Last modified: Thu, Mar 15, 2007 11:48:06 by