Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes

Computational Linguistics Colloquium

Thursday, January 31, 16:15, Building 17, Seminar Room

The Linguistic Relevance of an Annotated Corpus for French

Anne Abeillé
Université Paris 7

This talk presents the syntactically annotated corpus for French developed at Paris 7. The corpus comprises 1 million words, fully annotated and disambiguated for parts of speech, inflectional morphology, compounds and lemmas, and syntactic constituents. It is representative of contemporary normalized written French and covers a variety of authors and subjects (economy, literature, politics, etc.), with extracts from newspapers ranging from 1989 to 1993. Our goal is to provide a theory-neutral, surface-oriented, error-free treebank for French. So far we have used the corpus to study lexical or syntactic preferences, and we explain why we think some of these results are relevant for both theoretical linguistics and psycholinguistics.

If you would like to meet with the speaker, please contact Valia Kordoni.