Pattern recognition in linguistic data
We are researchers in the Computational Linguistics department
(Dr. Vera Demberg and Dr. Asad Sayeed), and we have an opportunity for
Bachelor's theses in statistical data processing for an application of language
science. The student should be in either the Computational Linguistics
or the Computer Science department, or have an equivalent background.
Experience with a scripting language such as Perl, Python, or Ruby is
required, as is experience with an object-oriented language such as
Java. You should have a reasonably good grasp of regular expressions.
Web design skills, JavaScript, and experience working with XML are a
plus. You should be available to meet with us at least once a week. We
work mostly in English, so you should have at least basic English
communication skills, although we may work with data in German and
other languages.
Our overall project is in applications of linguistic representations
to real-world spoken dialogue systems. The work involves automated
linguistic data collection and analysis, particularly the collection
of linguistic patterns across transcribed data. Some of the work will
involve data format design, data pipeline automation, and user
interface design. If you are interested, please email
asayeed@coli.uni-saarland.de and vera@coli.uni-saarland.de to make an
appointment to talk with us about this opportunity.