Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes

Computational Linguistics Colloquium

Monday, 9 July, 13:00
Conference Room, Building C7 4

Note: Unusual date and time!

Mapping between linguistic descriptions: CCG, PropBank and SFG

Matthew Honnibal
Language Technologies & Knowledge Management Lab,
School of Information Technologies,
University of Sydney

Although every adult human is an expert speaker of at least one language, grammar, semantics and pragmatics remain difficult areas of study. There are many theories, and often little obvious reason to choose between two conflicting schools of thought. Computational linguistics has therefore taken a somewhat agnostic approach to these theoretical disputes, attempting to remain noncommittal by producing resources that are more or less theory-neutral. In order for computational linguists to make use of a linguistic framework, or for linguists to use large-scale annotated corpora to refine their ideas, these theory-neutral resources must be mapped into different annotation schemes. In this talk, I will give an overview of my doctoral research on performing such adaptations, with particular reference to two examples.

The first example involves updating CCGbank to include some of the semantic information in PropBank and NomBank, in order to improve its complement/adjunct distinction. This highlights the fact that adapted corpora inherit any weaknesses in the annotation of the original corpus: in this case, the inconsistency of the function labels in the Penn Treebank. I also discuss the challenge that PropBank- and NomBank-style predicate-argument structure poses to the transparent interface between grammar and semantics that CCG proposes. I suggest that this information cannot be effectively incorporated into CCG derivations as the formalism stands, although extensions to the formalism may make this possible.

The second example illustrates a novel adaptation of the Penn Treebank according to a linguistic theory, systemic functional grammar (SFG). This example shows how even a theory from a very different linguistic tradition is amenable to the strategy of adapting theory-neutral resources, although once again compromises must be made in the target annotation. I will also discuss my attempts to automate SFG annotation by post-processing the output of a CCG parser, in lieu of a more theoretically motivated way to enable the SFG community to annotate text automatically.

If you would like to meet with the speaker, please contact Stella Neumann.