International Research Training Group
Language Technology & Cognitive Systems
Saarland University, University of Edinburgh

Applications of Statistics in Distant Speech Recognition: Four Simple Examples

Speaker: John McDonough

Institution: Saarland University

Abstract:

Distant speech recognition (DSR) is the automatic transcription of speech without the use of a close-talking microphone. Rather, the microphone or array of microphones is typically located at a distance of several meters from the mouth of the speaker or speakers. The current state of the art for such DSR systems involves the use of statistics at all levels. In this talk, we will consider four examples of how statistics are applied to DSR. We will begin with the simplest, namely the language model. Next, we will consider how the likelihoods of acoustic features are evaluated in the recognizer during the search for the best word hypothesis. These first two examples also occur in conventional speech recognition. The last two examples are more specific to DSR. We will describe how a statistical model can be used to determine a speaker's position based on the speech signals captured by an array of far-field microphones. In the final example, we will demonstrate how a statistical model of human speech itself can be used to remove the corruption caused by the twin banes of far-field speech capture, namely noise and reverberation.

Last modified: Sat, Aug 09, 2008 01:48:20 by
