TnT -- A Statistical Part-of-Speech Tagger
Author: Thorsten Brants
Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the
literature, we argue that a tagger based on Markov models performs at least as well as other current approaches,
including the Maximum Entropy framework. A recent comparison has even shown that TnT performs
significantly better on the tested corpora. We describe the basic model of TnT and the techniques used for smoothing
and for handling unknown words. Furthermore, we present evaluations on two corpora.