International Research Training Group
Language Technology
Cognitive Systems
Saarland University, University of Edinburgh

Beyond keywords: finding information more accurately and easily using natural language

Speaker: Matt Lease


Keywords are both intuitive and effective for simple navigational and informational web searches (e.g. alaska airlines, american revolution), but not all information needs are so simple. For example, on community question answering sites we find posted queries like “How have dramatic shifts in terrorists resulted in an equally dramatic shift in terrorist organizations?” or “Are concerns raised by the media justified about global warming and stem cell research?” Natural language (NL) lets users easily express such arbitrarily complicated queries, yet search engines generally perform poorly on them. A more effective keyword query usually exists, but users often have difficulty finding it. Consequently, supporting complex search remains an open challenge.

I take the approach of letting people express their questions naturally, and I investigate how automatic retrieval for such questions can be improved. To this end, I describe a learning framework for better estimating traditional term-based retrieval models by building on recent ideas from “learning to rank”. Given examples of queries and their relevant documents, the model learns to predict effective term weights for NL queries, and the feature space can be incrementally extended from individual terms to term interactions or other latent representations. Empirical evaluation shows that this improved estimation increases retrieval accuracy across several datasets.
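
To make the idea of learned term weights concrete, here is a minimal Python sketch. It is not the framework presented in the talk. The two per-term features (a bias and an inverse document frequency), the logistic link, the smoothed scoring function, the pairwise logistic loss, and the four-document toy collection are all assumptions chosen for illustration. Each query term gets a weight predicted from its features, a document is scored by the weighted sum of smoothed log term frequencies, and the feature parameters are fit so that a relevant document outscores a non-relevant one.

```python
import math
from collections import Counter

def features(term, df, n_docs):
    # Hypothetical per-term features: a bias and a smoothed inverse document frequency.
    idf = math.log((n_docs + 1) / (df.get(term, 0) + 1))
    return [1.0, idf]

def term_weight(theta, feats):
    # A logistic link keeps every predicted term weight in (0, 1).
    z = sum(t * f for t, f in zip(theta, feats))
    return 1.0 / (1.0 + math.exp(-z))

def score_and_grad(query, doc, theta, df, n_docs):
    # Score: weighted sum of smoothed log term frequencies.
    # Also returns the gradient of the score with respect to theta.
    counts = Counter(doc)
    score, grad = 0.0, [0.0] * len(theta)
    for term in query:
        feats = features(term, df, n_docs)
        w = term_weight(theta, feats)
        logp = math.log((counts[term] + 0.5) / (len(doc) + 1.0))
        score += w * logp
        for i, f in enumerate(feats):
            grad[i] += w * (1.0 - w) * f * logp
    return score, grad

def train(examples, df, n_docs, steps=100, lr=0.5):
    # Gradient descent on the pairwise logistic loss log(1 + exp(-(s_rel - s_non))).
    theta = [0.0, 0.0]
    for _ in range(steps):
        for query, rel_doc, non_doc in examples:
            s_rel, g_rel = score_and_grad(query, rel_doc, theta, df, n_docs)
            s_non, g_non = score_and_grad(query, non_doc, theta, df, n_docs)
            coeff = -1.0 / (1.0 + math.exp(s_rel - s_non))
            for i in range(len(theta)):
                theta[i] -= lr * coeff * (g_rel[i] - g_non[i])
    return theta

# Toy collection (invented for this sketch).
docs = [
    "the media raised concerns about global warming".split(),
    "the media is about the news of the day".split(),
    "stem cell research is in the news".split(),
    "the day is about the weather".split(),
]
n_docs = len(docs)
df = Counter(term for doc in docs for term in set(doc))

# One training triple: (query terms, relevant document, non-relevant document).
query = ["the", "about", "global", "warming"]
examples = [(query, docs[0], docs[1])]

theta = train(examples, df, n_docs)
weights = {t: term_weight(theta, features(t, df, n_docs)) for t in query}
s_rel, _ = score_and_grad(query, docs[0], theta, df, n_docs)
s_non, _ = score_and_grad(query, docs[1], theta, df, n_docs)
```

On this toy data the learned weight for a rare content term such as "global" ends up above the weight for "the", which is the kind of behavior a term-weighting model is meant to capture. Extending `features` with further columns would be one simple way to grow the feature space incrementally.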

Last modified: Fri, May 29, 2009 10:57:04 by
