Data annotation through crowdsourcing (MSc)

We (Dr. Vera Demberg and Dr. Asad Sayeed) have a thesis project opportunity for a Master's student in computational linguistics. We are interested in numerical measures of sentence processing complexity (e.g. surprisal) calculated from formal representations of semantics and syntax, and particularly in how to apply these measures to real-time user interfaces based on spoken dialogue systems. To do this, we need to carry out the following steps in order: find and annotate transcribed speech data, construct formal representations, improve and/or implement parsers, and test the processing-complexity predictions of these parsers against actual user behaviour. There are ample opportunities for projects within this programme, particularly in the annotation of data through recently developed crowdsourcing techniques. At minimum, the student will gain experience in annotation design, a skill that is increasingly relevant for employment in the growing industrial field of "data science". There is a fair amount of flexibility in selecting and developing the project. Reasonable programming skills are required. If you are interested, please email asayeed@coli.uni-saarland.de and vera@coli.uni-saarland.de to make an appointment to talk to us about this opportunity.