Computational Linguistics Colloquium
Thursday, 17 November 2011, 16:15
Conference Room, Building C7 4
Latent Feature Models for the Structure and Meaning of Text
James Henderson
Computational Learning and Computational Linguistics Group, University of Geneva
Much of the meaning of text is reflected in individual words or phrases, but its full information content requires structured analyses of the syntax and semantics of natural language. Our work on methods for extracting such structured meaning representations from natural language has focused on the joint modelling of syntactic and semantic dependency structures. As is increasingly the case as research moves to more complex, deeper levels of semantic analysis, neither our domain knowledge nor the annotations in the data are sufficient to fully characterise the statistical regularities in this joint task. We have addressed this problem by developing latent variable models of structures, which allow us to postulate features without their being annotated in the data, and to incorporate prior knowledge without making overly strong assumptions about the nature of the statistical regularities. We have used these models to achieve state-of-the-art results in both syntactic parsing and semantic role labelling across several languages, to improve semantic dependencies automatically transferred from translations, and to induce latent semantic features that are useful in other tasks. These robust, efficient latent variable models should, in future, allow us to incorporate increasingly sophisticated prior knowledge, learn from data with increasingly little annotation, and model increasingly complex tasks.
If you would like to meet with the speaker, please contact Ivan Titov.