Computational Linguistics Colloquium

Thursday, 2 May, 16:15, Seminar Room, Building 17

Stochastic Head-Driven Phrase Structure Grammar

Robert Malouf
Humanities Computing
Faculty of Arts
University of Groningen

Unification-based attribute-value grammar formalisms such as Head-Driven Phrase Structure Grammar have proven to be highly successful both for detailed linguistic descriptions and for practical large-scale grammar development. However, realistic applications of attribute-value grammars for natural language parsing or generation require the use of sophisticated statistical techniques for resolving ambiguities. As Abney (1997) shows, the simple rule frequency methods applied to disambiguating context-free parses cannot be used for disambiguating constraint-based grammar parses, since they rely crucially on the independence of context-free rule applications.

One solution to the dependency problem is provided by maximum entropy models (Jaynes 1957, Berger et al. 1996, Della Pietra et al. 1997), a class of log-linear models. These models have proven to be very successful in general for integrating information from disparate and possibly overlapping sources, and in particular have been fruitfully applied to the problem of representing structural preferences in constraint-based grammar formalisms (Johnson et al. 1999, Collins 2000, Riezler et al. 2000, Osborne 2000).
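
As a rough illustration of how a log-linear model can rank the candidate parses of a single sentence, consider the minimal sketch below. It is not the model used in the talk: the function name, the feature counts, and the weights are all hypothetical, and features are assumed to be simple numeric counts extracted from each parse.

```python
import math

def parse_probabilities(feature_vectors, weights):
    """Conditional log-linear model: p(parse | sentence) for each candidate.

    feature_vectors: one list of feature values per candidate parse
    weights: one weight per feature (hypothetical values, not estimated here)
    """
    # Score of a parse is the weighted sum of its (possibly overlapping) features.
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feature_vectors]
    # Subtract the maximum score before exponentiating, for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    # Normalize over the candidate parses of this sentence only.
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical feature counts for three candidate parses of one sentence.
candidates = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]
weights = [0.5, -0.2, 0.1]

probs = parse_probabilities(candidates, weights)
best = max(range(len(probs)), key=probs.__getitem__)
```

Because each parse is scored as a whole and normalized against its competitors, the features need not correspond to independent rule applications.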

This talk will describe ongoing work on the development of a stochastic disambiguation model for Alpino, a large-scale, wide-coverage HPSG grammar of Dutch (Bouma et al. 2001). Since maximum entropy models impose no unwarranted independence assumptions, they are well suited for capturing the kinds of interactions found in constraint-based grammars.

A further benefit of maximum entropy models is that they allow stochastic rule systems to be augmented with additional syntactic, semantic, and pragmatic features. However, the richness of the representations is not without cost: even modest maximum entropy models can require considerable computational resources and very large quantities of annotated training data to estimate the model's parameters accurately. Thus, highly efficient, scalable methods are required for estimating the parameters of practical models.

If you would like to meet with the speaker, please contact Detlef Prescher.