Computational Linguistics Colloquium
Thursday April 26, 16:15, Seminar Room, Building 17
Information Extraction from Real-world German Text
Günter Neumann
Language Technology Lab
DFKI
At DFKI's Language Technology Lab we have developed a robust and efficient shallow natural-language core system for information extraction from real-world German text. The system, called SMES, consists of domain-independent shallow core components realized as cascaded weighted finite-state machines and generic dynamic tries. German text processing includes, among other things, compound processing, high-performance named entity recognition, chunk parsing based on a divide-and-conquer strategy, and shallow reference resolution. SMES is fast (more than 5000 words per second on standard PC environments) and has high linguistic coverage.
If you would like to meet with the speaker, please contact Matt Crocker.