Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes

Computational Linguistics Colloquium

Thursday, 26 November, 16:15
Conference Room, Building C7 4

Bayesian language models for meetings

Steve Renals
University of Edinburgh

Traditional n-gram language models are widely used in state-of-the-art large vocabulary speech recognition systems. These simple models are extremely powerful, but they suffer from limitations such as overfitting when maximum likelihood estimation is used and a lack of rich contextual knowledge sources. In this talk I shall look at the use of hierarchical non-parametric priors for language modelling, including the hierarchical Pitman-Yor process and an efficient approximation to it that we call power-law discounting. I shall also talk about a related model, the hierarchical Dirichlet process, and how it may be used to enrich language models with information related to topic and social role. I will discuss all these approaches in the context of our speech recognition system for multiparty meetings.

Joint work with Songfang Huang.

If you would like to meet with the speaker, please contact Dietrich Klakow.