Constituent-Based Active Learning
for Statistical Grammars
Marcus Becker
Active learning has been used to reduce the amount of manually annotated
training material needed for statistical grammar induction. Prevalent
methods such as uncertainty sampling or Query-by-Committee consider
parse trees to be atomic from the sample selection perspective and so
disregard their internal structure. We investigate methods that are
based on syntactic constituents. Evaluation shows that these methods
improve on uncertainty sampling and Query-by-Committee.
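
The contrast between tree-level and constituent-level selection can be illustrated with a minimal sketch. It is not the method evaluated in the talk: the abstract does not specify the constituent-based scores, so `constituent_uncertainty` below is a hypothetical stand-in (mean binary entropy of constituent marginals over an n-best list). The function names, the `select` helper, and the data layout (each sentence mapped to a list of `(probability, constituents)` pairs) are all assumptions made for illustration. `tree_entropy` shows the usual uncertainty-sampling view, where each candidate parse is treated as an atomic outcome.

```python
import math


def tree_entropy(parses):
    """Tree-level uncertainty: entropy over whole candidate parses.

    parses: list of (probability, constituents) pairs for one sentence.
    Each parse is an atomic outcome; its internal structure is ignored.
    """
    total = sum(p for p, _ in parses)
    return -sum((p / total) * math.log2(p / total) for p, _ in parses if p > 0)


def _binary_entropy(q):
    """Entropy of a yes/no event with probability q (0 when q is 0 or 1)."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))


def constituent_uncertainty(parses):
    """Hypothetical constituent-level score (an assumption, not the talk's method).

    Each constituent is a (label, start, end) tuple. Its marginal probability
    is the normalized mass of the candidate parses that contain it; the score
    is the mean binary entropy of those marginals.
    """
    total = sum(p for p, _ in parses)
    marginals = {}
    for p, constituents in parses:
        for c in constituents:
            marginals[c] = marginals.get(c, 0.0) + p / total
    return sum(_binary_entropy(q) for q in marginals.values()) / len(marginals)


def select(pool, score, k):
    """Return the k sentence ids from the pool with the highest score."""
    return sorted(pool, key=lambda sid: score(pool[sid]), reverse=True)[:k]


# Toy pool: sentence id -> n-best list of (probability, constituent set).
pool = {
    "s1": [(0.5, {("S", 0, 3), ("NP", 0, 1)}),
           (0.5, {("S", 0, 3), ("VP", 1, 3)})],
    "s2": [(1.0, {("S", 0, 2), ("NP", 0, 1)})],
}
print(select(pool, tree_entropy, 1))
print(select(pool, constituent_uncertainty, 1))
```

In this toy pool both scores pick the ambiguous sentence, but the constituent-level score also records which spans the candidate parses disagree on, which is the information a tree-level score discards.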