Constituent-Based Active Learning for Statistical Grammars
Marcus Becker
 
Active learning has been used to reduce the amount of manually annotated training material needed for statistical grammar induction. Prevalent methods such as uncertainty sampling and Query-by-Committee treat parse trees as atomic units for sample selection and therefore disregard their internal structure. We investigate methods that instead operate on syntactic constituents. Our evaluation shows that these methods improve on uncertainty sampling and Query-by-Committee.
