Computational Linguistics & Phonetics, Fachrichtung 4.7, Universität des Saarlandes
Lilian Wanzare

Data


DeScript
[data | paper]

A corpus of event sequence descriptions (ESDs) for different scenarios, crowdsourced via Amazon Mechanical Turk. It covers 40 scenarios with approximately 100 ESDs each. The corpus also includes partial alignments of event descriptions that are semantically similar with respect to the given scenario.


Detecting Everyday Scenarios in Narrative Texts
[data | paper]

This resource contains sentence-level annotations, with sentences (segments) labeled according to the scripts they instantiate. Each text was independently annotated by two annotators. For each text, the annotators had to identify segments referring to a scenario from a scenario list and assign scenario labels. If a segment referred to more than one script, they were allowed to assign multiple labels. A scenario label could be either one of 200 scenarios or "None", which captures sentences that do not refer to any of the scenarios. The resource contains 504 documents, consisting of a total of 10,754 sentences. On average, each document is 35.74 sentences long.