International Post-Graduate College
Language Technology & Cognitive Systems
Saarland University, University of Edinburgh
 

Cross-lingual Bootstrapping of Resources for Role-Semantic Analysis

Speaker: Sebastian Pado

Institution: Saarland University

Abstract:

The difficulty and high cost of manually creating data with lexical semantic annotation, the so-called "lexical bottleneck problem", have led to a notable absence of broad-coverage lexical semantic resources for virtually all languages except English. This presentation introduces the task of automatically inducing semantic class and role information for new languages. Given that unsupervised methods are still in their infancy, but that hand-crafted resources exist for English, we propose the use of cross-lingual annotation projection, i.e., the transfer of linguistic information between corresponding sentences in parallel corpora.

We present the first application of annotation projection to the semantic domain and show how the task of semantic role projection can be phrased elegantly within an optimisation framework. The models we develop are adaptive in the sense that they do not require much linguistic knowledge, which may be unavailable for resource-poor target languages, but can incorporate it when present. Our evaluation indicates that semantic information can be induced across languages with a high degree of accuracy, at least for languages closely related to English, such as German.
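To make the general idea of annotation projection concrete, the following is a minimal sketch, not the optimisation-based models presented in the talk. It assumes a hand-written word alignment between one English sentence and its German translation, and uses a simple heuristic: each role is projected onto the smallest contiguous target span covering all aligned tokens. The function name `project_roles`, the example sentences, the alignment, and the role labels are all illustrative assumptions.

```python
from typing import Dict, Set, Tuple

Span = Tuple[int, int]  # inclusive token index range (start, end)


def project_roles(
    source_roles: Dict[str, Span],
    alignment: Set[Tuple[int, int]],
) -> Dict[str, Span]:
    """Project role spans from source tokens onto target tokens.

    Heuristic (an assumption of this sketch): a role's target span is the
    smallest contiguous range covering every target token aligned to a
    source token inside the role's source span. Roles without any aligned
    token are dropped.
    """
    projected: Dict[str, Span] = {}
    for role, (start, end) in source_roles.items():
        targets = [t for s, t in alignment if start <= s <= end]
        if targets:
            projected[role] = (min(targets), max(targets))
    return projected


# Hypothetical sentence pair with a hand-written word alignment.
english = ["Peter", "has", "given", "Mary", "the", "book"]
german = ["Peter", "hat", "Maria", "das", "Buch", "gegeben"]
alignment = {(0, 0), (1, 1), (2, 5), (3, 2), (4, 3), (5, 4)}

# Illustrative role labels over the English tokens.
roles = {"Donor": (0, 0), "Recipient": (3, 3), "Theme": (4, 5)}

projected = project_roles(roles, alignment)
for role, (start, end) in projected.items():
    print(role, german[start:end + 1])
```

In practice the alignment would come from an automatic word aligner and would be noisy, which is one reason a more principled formulation than this heuristic may be preferable.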


Last modified: Thu, Jul 13, 2006 11:39:40