Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes

Computational Linguistics Colloquium

Thursday, April 26, 16:15, Seminar Room, Building 17

Information Extraction from Real-world German Text

Günter Neumann
Language Technology Lab
DFKI

At DFKI's LT lab we have developed a robust and efficient shallow natural language (NL) core system for information extraction from real-world German text. The system, called SMES, consists of domain-independent shallow core components that are realized by means of cascaded weighted finite-state machines and generic dynamic tries. German text processing includes, among other things, compound processing, high-performance named entity recognition, chunk parsing based on a divide-and-conquer strategy, and shallow reference resolution. SMES is fast (more than 5000 words per second on standard PC environments) and has high linguistic coverage.

If you would like to meet with the speaker, please contact Matt Crocker.