Computational Linguistics & Phonetics, Fachrichtung 4.7, Universität des Saarlandes

SALSA - The SAarbrücken Lexical Semantics Annotation and Analysis Project



The aim of SALSA is to create a semantically annotated corpus resource and to investigate methods for its utilisation.


The construction of the corpus resource involves the following tasks:
  • Creating a semantic annotation schema which allows for the integration of a wide range of phenomena
  • Manually annotating the 1.5-million-word German TIGER corpus with FrameNet semantic roles
  • Developing a German "FrameNet light" in parallel
  • Developing machine learning tools for supervised and unsupervised annotation of larger corpora
The semantically annotated corpus can be exploited at a range of linguistic levels. Potential tasks are:
  • Lexical semantics: the automatic acquisition of approximate or preferential meaning information for very large lexica
  • Syntax: the improvement of statistical parsing techniques by training analysers on a combination of syntactic and semantic role information
  • Intelligent search: the enhancement of linguistically guided techniques for information access and extraction

More details about SALSA I can be found on the Research page.