Computational Linguistics & Phonetics, Fachrichtung 4.7, Universität des Saarlandes

SemEval 2010 Task 10

Info for participants

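The data sets are available in two different formats: FrameNet-style annotation (in SalsaTiger-XML) and PropBank-style annotation (in a modified CoNLL format).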

Below, we provide some information about the two different formats. For more information (including the README files for the two data distributions) see the download page.

FrameNet-style Annotation

The data sets were originally annotated in the FrameNet framework and then semi-automatically converted to a PropBank-style annotation. The FrameNet annotations are released in SalsaTiger-XML. More information about this XML format can be found in the following paper:

Katrin Erk and Sebastian Padó: A powerful and versatile XML Format for representing role-semantic annotation. Proceedings of LREC-2004, Lisbon.

Data files in SalsaTiger-XML format can be viewed with the SALTO annotation tool (see the SALTO documentation and download pages). An example of how our annotation looks when displayed by SALTO can be found here. An example of the SalsaTiger-XML format can be found here.

Further information on SALTO can be found in the following paper:

A. Burchardt, K. Erk, A. Frank, A. Kowalski and S. Padó: SALTO -- A Versatile Multi-Level Annotation Tool. Proceedings of LREC 2006, Genoa, Italy.

SALTO Screenshot of FrameNet-style annotation

Below are two screenshots from our annotation tool, showing two consecutive sentences from a sample text. They illustrate the density of the annotation and the cross-sentence links that are present.

[Screenshots: Sentence_1, Sentence_2]

SalsaTiger-XML Format

Below is the sentence from the first of the two screenshots above in our SalsaTiger-XML format, as used in the FrameNet (FN) version of the task.

<s id="tiger_7">
<graph root="tiger_7_517">
<terminals>
<t id="tiger_7_0" word="Through" pos="IN" lemma="through"/>
<t id="tiger_7_1" word="the" pos="DT" lemma="the"/>
<t id="tiger_7_2" word="fogged" pos="JJ" lemma="fogged"/>
<t id="tiger_7_3" word="glass" pos="NN" lemma="glass"/>
<t id="tiger_7_4" word="I" pos="PRP" lemma="I"/>
<t id="tiger_7_5" word="dimly" pos="RB" lemma="dimly"/>
<t id="tiger_7_6" word="saw" pos="VBD" lemma="see"/>
<t id="tiger_7_7" word="a" pos="DT" lemma="a"/>
<t id="tiger_7_8" word="man" pos="NN" lemma="man"/>
<t id="tiger_7_9" word="spring" pos="VB" lemma="spring"/>
<t id="tiger_7_10" word="up" pos="RP" lemma="up"/>
<t id="tiger_7_11" word="from" pos="IN" lemma="from"/>
<t id="tiger_7_12" word="a" pos="DT" lemma="a"/>
<t id="tiger_7_13" word="chair" pos="NN" lemma="chair"/>
<t id="tiger_7_14" word="beside" pos="IN" lemma="beside"/>
<t id="tiger_7_15" word="the" pos="DT" lemma="the"/>
<t id="tiger_7_16" word="fire" pos="NN" lemma="fire"/>
<t id="tiger_7_17" word="," pos="PUNC," lemma=","/>
<t id="tiger_7_18" word="and" pos="CC" lemma="and"/>
<t id="tiger_7_19" word="heard" pos="VBD" lemma="hear"/>
<t id="tiger_7_20" word="a" pos="DT" lemma="a"/>
<t id="tiger_7_21" word="sharp" pos="JJ" lemma="sharp"/>
<t id="tiger_7_22" word="cry" pos="NN" lemma="cry"/>
<t id="tiger_7_23" word="from" pos="IN" lemma="from"/>
<t id="tiger_7_24" word="within" pos="IN" lemma="within"/>
<t id="tiger_7_25" word="the" pos="DT" lemma="the"/>
<t id="tiger_7_26" word="room" pos="NN" lemma="room"/>
<t id="tiger_7_27" word="." pos="PUNC." lemma="."/>
</terminals>
<nonterminals>
<nt id="tiger_7_505" cat="NPB">
<edge label="-" idref="tiger_7_15"/>
<edge label="-" idref="tiger_7_16"/>
<edge label="-" idref="tiger_7_17"/>
</nt>
<nt id="tiger_7_506" cat="PP">
<edge label="-" idref="tiger_7_14"/>
<edge label="-" idref="tiger_7_505"/>
</nt>
<nt id="tiger_7_503" cat="NPB">
<edge label="-" idref="tiger_7_12"/>
<edge label="-" idref="tiger_7_13"/>
</nt>
<nt id="tiger_7_504" cat="PP">
<edge label="-" idref="tiger_7_11"/>
<edge label="-" idref="tiger_7_503"/>
</nt>
<nt id="tiger_7_507" cat="VP">
<edge label="-" idref="tiger_7_9"/>
<edge label="-" idref="tiger_7_10"/>
<edge label="-" idref="tiger_7_506"/>
<edge label="-" idref="tiger_7_504"/>
</nt>
<nt id="tiger_7_502" cat="NPB">
<edge label="-" idref="tiger_7_7"/>
<edge label="-" idref="tiger_7_8"/>
</nt>
<nt id="tiger_7_508" cat="S">
<edge label="-" idref="tiger_7_507"/>
<edge label="-" idref="tiger_7_502"/>
</nt>
<nt id="tiger_7_509" cat="VP">
<edge label="-" idref="tiger_7_5"/>
<edge label="-" idref="tiger_7_6"/>
<edge label="-" idref="tiger_7_508"/>
</nt>
<nt id="tiger_7_510" cat="NPB">
<edge label="-" idref="tiger_7_20"/>
<edge label="-" idref="tiger_7_21"/>
<edge label="-" idref="tiger_7_22"/>
</nt>
<nt id="tiger_7_511" cat="NPB">
<edge label="-" idref="tiger_7_25"/>
<edge label="-" idref="tiger_7_26"/>
<edge label="-" idref="tiger_7_27"/>
</nt>
<nt id="tiger_7_512" cat="PP">
<edge label="-" idref="tiger_7_24"/>
<edge label="-" idref="tiger_7_511"/>
</nt>
<nt id="tiger_7_513" cat="PP">
<edge label="-" idref="tiger_7_23"/>
<edge label="-" idref="tiger_7_512"/>
</nt>
<nt id="tiger_7_514" cat="NP">
<edge label="-" idref="tiger_7_510"/>
<edge label="-" idref="tiger_7_513"/>
</nt>
<nt id="tiger_7_515" cat="VP">
<edge label="-" idref="tiger_7_19"/>
<edge label="-" idref="tiger_7_514"/>
</nt>
<nt id="tiger_7_516" cat="VP">
<edge label="-" idref="tiger_7_18"/>
<edge label="-" idref="tiger_7_509"/>
<edge label="-" idref="tiger_7_515"/>
</nt>
<nt id="tiger_7_500" cat="NPB">
<edge label="-" idref="tiger_7_1"/>
<edge label="-" idref="tiger_7_2"/>
<edge label="-" idref="tiger_7_3"/>
</nt>
<nt id="tiger_7_501" cat="PP">
<edge label="-" idref="tiger_7_0"/>
<edge label="-" idref="tiger_7_500"/>
</nt>
<nt id="tiger_7_517" cat="S">
<edge label="-" idref="tiger_7_4"/>
<edge label="-" idref="tiger_7_516"/>
<edge label="-" idref="tiger_7_501"/>
</nt>
</nonterminals>
</graph>
<matches>
</matches>
<sem>
<globals>
</globals>
<frames>
<frame name="Perception_experience" id="tiger_7_f1229595851.0867">
<target>
<fenode idref="tiger_7_19"/>
</target>
<fe name="Perceiver_passive" id="tiger_7_f1229595851.0867_e1">
<fenode idref="tiger_7_4"/>
</fe>
<fe name="Phenomenon" id="tiger_7_f1229595851.0867_e2">
<fenode idref="tiger_7_510"/>
</fe>
<fe name="Ground" id="tiger_7_f1229595851.0867_e3">
<fenode idref="tiger_7_513"/>
</fe>
</frame>
<frame name="People" id="tiger_7_f1229595851.08691">
<target>
<fenode idref="tiger_7_8"/>
</target>
<fe name="Person" id="tiger_7_f1229595851.08691_e1">
<fenode idref="tiger_7_8"/>
</fe>
</frame>
<frame name="Building_subparts" id="tiger_7_f1229595851.08711">
<target>
<fenode idref="tiger_7_26"/>
</target>
<fe name="Building_part" id="tiger_7_f1229595851.08711_e1">
<fenode idref="tiger_7_26"/>
</fe>
</frame>
<frame name="Perception_experience" id="tiger_7_f1229595851.08732">
<target>
<fenode idref="tiger_7_6"/>
</target>
<fe name="Perceiver_passive" id="tiger_7_f1229595851.08732_e1">
<fenode idref="tiger_7_4"/>
</fe>
<fe name="Phenomenon" id="tiger_7_f1229595851.08732_e2">
<fenode idref="tiger_7_502"/>
</fe>
<fe name="Direction" id="tiger_7_f1229595851.08732_e3">
<fenode idref="tiger_7_501"/>
</fe>
<fe name="Phenomenon" id="tiger_7_f1229595851.08732_e4">
<fenode idref="tiger_7_507"/>
</fe>
</frame>
<frame name="Self_motion" id="tiger_7_f1229595851.08774">
<target>
<fenode idref="tiger_7_9"/>
</target>
<fe name="Self_mover" id="tiger_7_f1229595851.08774_e1">
<fenode idref="tiger_7_502"/>
</fe>
<fe name="Source" id="tiger_7_f1229595851.08774_e2">
<fenode idref="tiger_7_504"/>
<fenode idref="tiger_7_506"/>
</fe>
</frame>
<frame name="Vocalizations" id="tiger_7_f1">
<target>
<fenode idref="tiger_7_22"/>
</target>
<fe name="Location_of_sound_source" id="tiger_7_f1_e1">
<fenode idref="tiger_7_513"/>
</fe>
<fe name="Degree" id="tiger_7_f1_e2">
<fenode idref="tiger_7_21"/>
</fe>
</frame>
<frame name="Coreference" id="tiger_7_f2">
<target>
<fenode idref="tiger_7_3"/>
</target>
<fe name="Coreferent" id="tiger_7_f2_e1">
<fenode idref="tiger_6_505"/>
</fe>
<fe name="Current" id="tiger_7_f2_e2">
<fenode idref="tiger_7_500"/>
</fe>
</frame>
</frames>
<usp>
<uspframes>
</uspframes>
<uspfes>
</uspfes>
</usp>
<wordtags>
</wordtags>
</sem>
</s>
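
The frame annotation in the example above can be read with a few lines of code. The following is a minimal sketch in Python, using only the standard library, and is not an official reader for the task data. It relies only on the element and attribute names visible in the example (t, nt, edge, frame, target, fe, fenode, idref); the function names load_sentence, yield_of and read_frames are hypothetical. The embedded sample is a trimmed copy of the Self_motion frame, with the second fenode of the Source role left out to keep it short. An idref that points into another sentence, as in the Coreference frame above, is assumed to resolve to an empty string when only one sentence is loaded.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sentence above: only the tokens and nodes needed
# for the Self_motion frame are kept (an assumption made for brevity).
SAMPLE = """
<s id="tiger_7">
<graph root="tiger_7_517">
<terminals>
<t id="tiger_7_7" word="a" pos="DT" lemma="a"/>
<t id="tiger_7_8" word="man" pos="NN" lemma="man"/>
<t id="tiger_7_9" word="spring" pos="VB" lemma="spring"/>
<t id="tiger_7_11" word="from" pos="IN" lemma="from"/>
<t id="tiger_7_12" word="a" pos="DT" lemma="a"/>
<t id="tiger_7_13" word="chair" pos="NN" lemma="chair"/>
</terminals>
<nonterminals>
<nt id="tiger_7_503" cat="NPB">
<edge label="-" idref="tiger_7_12"/>
<edge label="-" idref="tiger_7_13"/>
</nt>
<nt id="tiger_7_504" cat="PP">
<edge label="-" idref="tiger_7_11"/>
<edge label="-" idref="tiger_7_503"/>
</nt>
<nt id="tiger_7_502" cat="NPB">
<edge label="-" idref="tiger_7_7"/>
<edge label="-" idref="tiger_7_8"/>
</nt>
</nonterminals>
</graph>
<sem>
<frames>
<frame name="Self_motion" id="tiger_7_f1229595851.08774">
<target>
<fenode idref="tiger_7_9"/>
</target>
<fe name="Self_mover" id="tiger_7_f1229595851.08774_e1">
<fenode idref="tiger_7_502"/>
</fe>
<fe name="Source" id="tiger_7_f1229595851.08774_e2">
<fenode idref="tiger_7_504"/>
</fe>
</frame>
</frames>
</sem>
</s>
"""


def load_sentence(xml_text):
    """Parse one <s> element; index terminals and nonterminals by id."""
    root = ET.fromstring(xml_text.strip())
    # terminal id -> (position in sentence, word form)
    words = {t.get("id"): (pos, t.get("word"))
             for pos, t in enumerate(root.iter("t"))}
    # nonterminal id -> ids of its children (terminals or nonterminals)
    children = {nt.get("id"): [e.get("idref") for e in nt.findall("edge")]
                for nt in root.iter("nt")}
    return root, words, children


def yield_of(node_id, words, children):
    """Return sorted (position, word) pairs dominated by node_id.

    Ids that are not part of this sentence yield an empty list.
    """
    if node_id in words:
        return [words[node_id]]
    found = []
    for child in children.get(node_id, []):
        found.extend(yield_of(child, words, children))
    return sorted(found)


def read_frames(xml_text):
    """Return (frame name, target text, [(role name, role text), ...])."""
    root, words, children = load_sentence(xml_text)

    def text(parent):
        tokens = []
        for fenode in parent.findall("fenode"):
            tokens.extend(yield_of(fenode.get("idref"), words, children))
        return " ".join(word for _, word in sorted(tokens))

    result = []
    for frame in root.iter("frame"):
        target = frame.find("target")
        # A list of pairs, not a dict: a role name may occur twice in a frame.
        roles = [(fe.get("name"), text(fe)) for fe in frame.findall("fe")]
        result.append((frame.get("name"),
                       text(target) if target is not None else "",
                       roles))
    return result


if __name__ == "__main__":
    for name, target, roles in read_frames(SAMPLE):
        print(name, "| target:", target, "| roles:", roles)
```

Roles are kept as a list of pairs because a role name can occur more than once within a frame, as Phenomenon does in the second Perception_experience frame of the full example.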

PropBank-style Annotation

The PropBank-style annotation was semi-automatically derived from the FrameNet-style annotation. It is distributed in a modified CoNLL format. An example can be found here. The README for the PropBank-style annotation provides more details regarding the format.

Community & support

We have created a Google group for the participants of Task 10.

The organizers are also on the list and will follow the discussion there. If you want to contact us about something off the list, you can write to us directly. For now, the main contact will be: contactaddress@coli.uni-sb.de