Computational Linguistics & Phonetics, Fachrichtung 4.7, Universität des Saarlandes
Merel Scholman

Merel Scholman - Projects



    SFB 1102 Information Density and Linguistic Encoding (funded by the DFG)
    Project B2: Cognitive modelling of information density for discourse relations
    Principal Investigator: Vera Demberg

    This project investigates whether the uniform information density hypothesis is supported at the level of discourse relations. Specifically, it tests whether the presence of optional discourse relation markers can be explained by information density. A second hypothesis holds that discourse relation surprisal can account for processing difficulty. Both hypotheses will be tested on corpora and in online experimental studies. The project also proposes to build a broad-coverage model of discourse relation surprisal, in which each word in a text is treated as a potential cue that conveys some information about the discourse relations in the text. An important aspect of building such a computational model is the cognitive plausibility of the discourse relation inventory, which the project will also address.
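
    To make the notion of discourse relation surprisal concrete, the sketch below computes the surprisal (in bits) of a relation given a single cue word, using the standard definition -log2 P(relation | cue). It is only an illustration and not the project's model: the names `surprisal`, `relation_surprisal` and `TOY_COUNTS` are hypothetical, and the counts are invented toy numbers rather than corpus data.

```python
import math


def surprisal(probability: float) -> float:
    """Return the surprisal in bits, -log2(P), of an event with probability P."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def relation_surprisal(counts: dict, cue: str, relation: str) -> float:
    """Surprisal of `relation` given `cue`, estimated from raw co-occurrence counts.

    P(relation | cue) is the relative frequency of the relation among all
    relations observed with that cue. No smoothing is applied, so a relation
    never seen with the cue raises ValueError (or KeyError if it is absent).
    """
    cue_counts = counts[cue]
    total = sum(cue_counts.values())
    return surprisal(cue_counts[relation] / total)


# Invented toy counts of (cue word -> relation -> frequency); NOT real data.
TOY_COUNTS = {
    "but": {"contrast": 3, "cause": 1},
    "and": {"contrast": 1, "cause": 1, "temporal": 2},
}

if __name__ == "__main__":
    for cue, relations in TOY_COUNTS.items():
        for relation in relations:
            bits = relation_surprisal(TOY_COUNTS, cue, relation)
            print(f"{relation!r} after {cue!r}: {bits:.3f} bits")
```

    In this toy setting, a relation that a cue makes likely receives low surprisal and an unlikely one receives high surprisal. A broad-coverage model would presumably have to combine evidence from many words rather than a single cue, which this sketch does not attempt.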


    CLARIN-NL DiscAn: Towards a discourse annotation system for Dutch language corpora (funded by CLARIN-NL)
    Project leader: Ted Sanders

    The DiscAn corpus is a collection of Dutch-language subcorpora that have been annotated at the level of discourse. These subcorpora form a set of Dutch corpus analyses of coherence relations and discourse connectives, compiled and annotated by researchers at several universities in the Netherlands and Belgium. In the DiscAn project, funded by CLARIN-NL, this set of corpus analyses has been standardized (both the raw data, i.e. the texts, and the analyses) and made available for further scientific research.