Compositionality and grounding for distributional semantics
The framework of distributional semantics provides unsupervised, efficient, and accurate methods for modeling meaning and meaning-context relations. We have worked on distributional methods to model word meaning in context and evaluated them on the tasks of paraphrase ranking, word-sense disambiguation, and question answering.
There are well-known limitations to distributional models that exclusively use word co-occurrence data as input: because antonyms such as love and hate occur in very similar contexts, such models cannot reliably distinguish them from synonyms, and they contribute nothing at all to the modeling of operators like negation or modality.
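To make the antonym problem concrete, the following minimal sketch compares toy co-occurrence vectors for love and hate; the context words and counts are invented for illustration, but real corpus statistics show the same effect.

    import numpy as np

    # Toy co-occurrence counts over a shared set of context words.
    # The numbers are invented; in real corpora, antonyms such as
    # "love" and "hate" likewise share most of their contexts.
    contexts = ["I", "you", "really", "movie", "deeply", "passionately"]
    love = np.array([120.0, 95.0, 40.0, 30.0, 12.0, 8.0])
    hate = np.array([115.0, 90.0, 45.0, 28.0, 10.0, 6.0])

    def cosine(u, v):
        """Cosine similarity between two co-occurrence vectors."""
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(love, hate))  # close to 1.0: the antonyms look like near-synonyms

Because the similarity is nearly maximal, a purely co-occurrence-based model treats the pair as it would a pair of synonyms.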
To enable the broader exploitation of the distributional approach, we will further extend it in two directions:
First, we will add information about the "world" in different ways. In an ongoing collaboration with the MPI for Informatics, we use a large structured knowledge base (YAGO) as prior information for disambiguation.
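As an illustration of the general idea, the sketch below combines a knowledge-base-derived sense prior with distributional context similarity. The sense inventory, prior weights, vectors, and scoring function are all hypothetical; they do not describe the actual YAGO integration, only one generic way such prior information could inform disambiguation.

    import numpy as np

    # Hypothetical sense inventory for the ambiguous word "Java": each sense
    # carries a prior weight (e.g., derived from entity frequencies in a
    # knowledge base such as YAGO) and a distributional vector for its
    # typical contexts. All numbers are invented for illustration.
    senses = {
        "Java_island":   {"prior": 0.2, "vector": np.array([0.9, 0.1, 0.0])},
        "Java_language": {"prior": 0.7, "vector": np.array([0.1, 0.9, 0.2])},
        "Java_coffee":   {"prior": 0.1, "vector": np.array([0.2, 0.1, 0.9])},
    }

    def disambiguate(context_vector):
        """Rank senses by prior-weighted distributional similarity."""
        def score(sense):
            v = sense["vector"]
            sim = context_vector @ v / (
                np.linalg.norm(context_vector) * np.linalg.norm(v))
            return sense["prior"] * sim  # knowledge-base prior times context fit
        return max(senses, key=lambda name: score(senses[name]))

    # A context vector dominated by programming-related dimensions
    # should select the language sense.
    print(disambiguate(np.array([0.0, 1.0, 0.1])))  # -> "Java_language"

The point of the combination is that the knowledge base supplies information the corpus distribution alone cannot, while the distributional component keeps the decision sensitive to the actual context.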
Second, we will provide access to complex semantic structure (predicate-argument structure plus negation and other basic operators) by employing richly structured generative models of semantic interpretation and informing these models with prior linguistic knowledge. This is part of a collaboration with the Cluster's Independent Research Groups led by Caroline Sporleder and Ivan Titov.