Computational Linguistics & Phonetics
Geoff Pullum's Six Golden Rules for giving an academic presentation

Indicative areas for Bachelor / Master / PhD projects

Language technologies for contrastive linguistic studies, e.g., comparing collocations across languages; properties of collocations in translated language; negative polar questions in parallel corpora; syntagmatic fixedness of multi-word expressions crosslinguistically.

Information structure and grammatical interfaces, e.g., topicalisation strategies in contrast; word order phenomena.

Computational modelling of Slavic languages, in particular: grammar engineering, treebanking and resource development; interface between treebanking and practical grammar engineering; cross-linguistic and cross-formalism generalisations and comparisons.

Slavic translational equivalents of German composita: implications for grammar induction and grammar engineering, e.g., a corpus-based study Russian - German; a corpus-based study Bulgarian - German; a corpus-based study Polish - German; a corpus-based study Czech - German, etc.

Linguistically informed NLP, e.g., for the purposes of information retrieval and question answering; machine translation; dialogue systems; computer-assisted language learning; processing of structured proper nouns, named entities with internal structure, multiword expressions, collocations.


Sample Topics

    1. Developing a (Slavic) component for the DFKI IE system SProUT

  • Russian version of the information extraction system SProUT
  • Polish version of the information extraction system SProUT
  • Bulgarian version of the information extraction system SProUT

    2. Developing a (Slavic) resource grammar for DELPH-IN

  • Construction of Russian HPSG in the DELPH-IN environment combining the Grammar Matrix and a Slavic Core Grammar with corpus-based grammar elaboration; exploiting the Russian National Corpus (RNC) in Russian Resource Grammar (RRG) engineering.
  • Construction of Bulgarian HPSG in the DELPH-IN environment combining the Grammar Matrix and a Slavic Core Grammar with corpus-based grammar elaboration; exploiting the BulTreeBank in Bulgarian Resource Grammar (BRG) engineering.
  • Construction of Polish HPSG in the DELPH-IN environment combining the Grammar Matrix and a Slavic Core Grammar with corpus-based grammar elaboration; exploiting the IPI PAN Corpus in Polish Resource Grammar (PRG) engineering.
  • Construction of Czech HPSG in the DELPH-IN environment combining the Grammar Matrix and a Slavic Core Grammar with corpus-based grammar elaboration; exploiting the Prague Dependency Treebank (PDT) in Czech Resource Grammar (CzRG) engineering.

    3. Corpora and Grammatical Theory

  • Translational equivalents of German composita: implications for grammar induction and grammar engineering.
  • Linguistically informed processing of named entities with internal structure.
  • Corpus-aided building of a multi-purpose valence dictionary.

Supervised theses

Name | Affiliation | (Working) Title | Status

Alahverdzhieva, Katya | LCT, Saarland Univ. | XTAG using XMG or Toward a Core Tree-Adjoining Grammar for English | completed 2008
Benjamin, Trevor | LST, Saarland Univ. | Patterns in Usage, Patterns in Action: 'Left Dislocation' in English Conversation | completed 2008
Biehl, Beáta | Germanistik, Saarland Univ. | Topikalisierungsstrategien in der deutsch-ungarischen Übersetzung | completed 2008
Borisova, Irina | LCT, Saarland Univ. | Implementing Georgian polypersonal agreement through the LinGO Grammar Matrix | completed 2010
Davidescu, Adriana | Computerlinguistik, Saarland Univ. | A corpus-based study on grammatically and lexically motivated translation shifts | completed 2008
Feit, Joo-Eun | Computerlinguistik, Saarland Univ. | Named Entity Recognition in the Movie Domain | completed 2009
Fromkorth, Bettina | Computerlinguistik, Saarland Univ. | Diathesis Alternation Phenomena in sign-based construction grammar | completed 2008
Hristova, Tsvetelina | PhD Programme, Sofia Univ. | Structure and hierarchy of the lexical expressions of moral and ethical categories in medieval Slavic translated texts | in progress
Nikolova, Sonya | ERASMUS, Sofia Univ. | Inventory of morphosyntactic features for a contrastive computational grammar of Balkan languages | in progress
Sukhareva, Maria | LST, Saarland Univ. | Toward Implementation of Russian Agreement Phenomena in HPSG | completed 2011
Valeva, Diyana | ERASMUS, Sofia Univ. | A sample cross-linguistic phraseological database including usage examples from corpora | in progress
Yampolska, Nadiya | LCT, Saarland Univ. | Acoustic properties of focus in English interrogatives: comparison native and non-native realization | completed 2008