Language Processing for Different Domains and Genres (WS 2009/10)
Presentation Topics & Papers
(Every bullet point is one topic!)
Genre Distinctions
- General
Biber, Douglas. 1993.
Using register-diversified corpora for general language studies.
Computational Linguistics 19:2, pp. 219
http://aclweb.org/anthology-new/J/J93/J93-2001.pdf
- Discourse
Bonnie Webber. 2009. Genre distinctions for discourse in the Penn TreeBank. Proc. of ACL-09
http://aclweb.org/anthology-new/P/P09/P09-1076.pdf
- Verb subcategorisation frequencies
Roland, D., & Jurafsky, D. (1998):
How verb subcategorization frequencies are affected by corpus choice.
Proceedings of COLING-ACL 1998 (pp. 1117-1121), Montreal, Canada.
http://aclweb.org/anthology-new/P/P98/P98-2184.pdf
And:
Roland, D., Jurafsky, D., Menn, L., Gahl, S., Elder, E., & Riddoch, C. (2000):
Verb subcategorization frequency differences between business-news and balanced corpora: The role of verb sense.
Proceedings of the Workshop on Comparing Corpora (pp. 28-34), Hong Kong, October 2000.
http://portal.acm.org/ft_gateway.cfm?id=979622&type=pdf&coll=GUIDE&dl=GUIDE&CFID=58874365&CFTOKEN=53240364
Domain Adaptation for Machine Learning
Parsing
- Re-ranking (and self-training)
David McClosky, Eugene Charniak, and Mark Johnson. 2006.
Reranking and self-training for parser adaptation. Proc. of ACL
http://aclweb.org/anthology-new/P/P06/P06-1043.pdf
And:
Jennifer Foster, Joachim Wagner, Djame Seddah and Josef van Genabith. 2007.
Adapting WSJ-Trained Parsers to the British National Corpus using In-domain Self-training.
Proceedings of IWPT 2007, pp. 33-35, Prague, Czech Republic.
http://www.computing.dcu.ie/~jfoster/publications/foster_iwpt2007.pdf
- Self-Training
David McClosky, Eugene Charniak, and Mark Johnson. 2008.
When is Self-Training Effective for Parsing?
Proceedings of the International Conference on Computational Linguistics (COLING 2008).
http://aclweb.org/anthology/C/C08/C08-1071.pdf
- Detection of non-generalising rules
Markus Dickinson and Jennifer Foster. 2009.
Similarity Rules! Exploring Methods for Ad-Hoc Rule Detection. Proceedings of the Seventh International Workshop on Treebanks and Linguistic Theories (TLT-7 2009).
Groningen, The Netherlands.
http://www.computing.dcu.ie/~jfoster/publications/foster_tlt2009.pdf
And possibly as background reading:
Markus Dickinson (2008).
Ad Hoc Treebank Structures.
The 46th Annual Meeting of the Association for Computational Linguistics (ACL) with the Human Language Technology Conference (HLT) (ACL-08). Columbus, OH.
http://aclweb.org/anthology-new/P/P08/P08-1042.pdf
- Detecting parse reliability
Daisuke Kawahara and Kiyotaka Uchimoto. 2008.
Learning Reliability of Parses for Domain Adaptation of Dependency Parsing.
IJCNLP 2008.
http://www.aclweb.org/anthology-new/I/I08/I08-2097.pdf
- Lexicalised parsing
Laura Rimell, Stephen Clark. 2008.
Adapting a Lexicalized-Grammar Parser to Contrasting Domains.
EMNLP 2008.
http://www.cl.cam.ac.uk/~lr346/pubs/emnlp08.pdf
Word-Sense Disambiguation