International Post-Graduate College
Language Technology & Cognitive Systems
Saarland University, University of Edinburgh

Maximum Likelihood Beamforming for Robust Automatic Speech Recognition

Speaker: Barbara Rauch

Institution: Saarland University

Abstract:

Automatic Speech Recognition (ASR) systems now achieve good results under favourable conditions: when the user wears a headset, recognition takes place in a noise-free environment, and the user has trained the system on his or her voice, word error rates (WER) below 5% are possible.
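For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the recogniser output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that computation; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration and are not taken from any system discussed in the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch: tokenisation is plain whitespace splitting,
    which is an assumption and not how any particular toolkit scores.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 (that is, 100%) when the hypothesis is much longer than the reference.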

However, many real-life applications, such as ASR in cars or meeting transcription, do not offer such ideal conditions. In particular, reverberation and background noise often cannot be avoided, especially with distant microphones, and the resulting loss in audio quality degrades the performance of ASR systems significantly. For example, in the Aurora-4 task (2002), where various types of noise are added to speech data, the overall baseline WER is as high as 50.3%.

This problem is a major research issue in the ASR community, and many different approaches have been proposed to tackle it. One idea is to use microphone arrays, a method we are currently investigating in the Spoken Language Systems group at Saarland University. A recent technique for exploiting such arrays is maximum likelihood beamforming, which optimises beamforming, a classical speech enhancement method, for speech recognition. In this talk I will give a short overview of our activities in this area and present some recent results and ideas for future work.
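As background, the simplest classical beamformer is delay-and-sum: each microphone channel is shifted to compensate for the extra time the speech needs to reach it, and the aligned channels are averaged so that the speech adds coherently while uncorrelated noise partly cancels. The sketch below illustrates only this classical baseline, not the maximum likelihood variant presented in the talk. The function name `delay_and_sum`, the restriction to known, non-negative integer sample delays, and the toy signals are all assumptions made for illustration.

```python
def delay_and_sum(channels, delays):
    """Average microphone channels after undoing per-channel delays.

    channels: list of equal-rate sample sequences, one per microphone.
    delays:   delays[i] is the (assumed known, non-negative, integer)
              number of samples by which the source arrives late at
              microphone i.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")

    # Only keep the stretch for which every shifted channel has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    if length <= 0:
        return []

    # Output sample t averages the samples that correspond to the same
    # instant of the source signal in every channel.
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(length)
    ]


# Toy usage: the same ramp reaches the second microphone two samples later.
mic_near = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
mic_far = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
aligned = delay_and_sum([mic_near, mic_far], delays=[0, 2])
```

In practice the delays are not given and must be estimated, and they are rarely whole numbers of samples, so this sketch leaves out the parts that make real array processing difficult.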


Last modified: Thu, Jul 13, 2006 11:39:40