Pattern recognition in linguistic data

We are researchers in the Computational Linguistics department (Dr. Vera Demberg and Dr. Asad Sayeed), and we have an opportunity for Bachelor's theses in statistical data processing for an application of language science. The student should be in either the Computational Linguistics or the Computer Science department, or have an equivalent background. Experience with a scripting language such as Perl, Python, or Ruby is required, as is experience with an object-oriented language like Java. You should have a reasonably good grasp of regular expressions. Web design skills and JavaScript are a plus, as is experience working with XML. You should be available to meet with us at least once a week. We work mostly in English, so you should have at least basic English communication skills, although we may work with data in German and other languages.

Our overall project concerns applications of linguistic representations to real-world spoken dialogue systems. The work involves automated linguistic data collection and analysis, particularly the collection of linguistic patterns across transcribed data. Some of the work will involve data format design, data pipeline automation, and user interface design. If you are interested, please email asayeed@coli.uni-saarland.de and vera@coli.uni-saarland.de to make an appointment and talk to us about this opportunity.