Junior Research Group
Computational Modelling of Discourse and Semantics

Research

Seaside: Semantic Argument Structure In DiscoursE

One research focus of our group is to bring together two active areas which both deal with 'computing meaning' but currently stand more or less independently of each other: discourse processing and the computation of semantic argument structure. While discourse processing deals with modelling the meaning of multi-sentence units, theories of semantic argument structure, such as Frame Semantics, model relations within individual sentences, namely the relation between a lexical item and its semantic arguments such as agent or patient. We aim to bridge the gap between these two research areas in both directions: by enriching state-of-the-art discourse processing models, which are typically fairly shallow, with the deeper linguistic information encoded in the semantic argument structure of lexical items, and by giving shallow semantic parsers (semantic role labellers) access to discourse information. We hope that both areas will benefit from this. Semantic argument information will allow for a more sophisticated representation of discourse meaning, which will be useful for applications such as text summarisation, information extraction, or question answering. Modelling discourse context can in turn benefit systems which compute semantic argument structure, for example by providing prior probabilities of a particular role being realised in a given discourse context.



Detecting Non-Literal Language in Discourse

Non-literal language poses a major problem for NLP systems because figurative expressions (such as idioms or metaphors) are not only abundant in natural language but also often behave idiosyncratically. The first step to interpreting such expressions correctly is to identify them reliably. However, traditional approaches which rely on lists of non-literal, idiomatic expressions only solve part of the problem, since many such expressions (e.g., 'break the ice' or 'drop the ball') can be used both literally and idiomatically. Hence systems need to be able to distinguish literal from non-literal meaning in a given context. We are working on models that exploit lexical cohesion to make this distinction.