Junior Research Group
Computational Modelling of Discourse and Semantics

IDioms In conteXt: The IDIX Corpus

Description

The IDIX corpus is a corpus of English expressions that can be used both literally and idiomatically, depending on the context (e.g. "rock the boat"). IDIX contains annotations for about 100 such expressions. For each expression, all occurrences were extracted from the British National Corpus (BNC) and annotated as 'literal' or 'non-literal'. The corpus can thus serve as a testbed for methods that aim to disambiguate such expressions in context. It might also be useful for (small-scale) corpus-linguistic research. The annotations will be made available as an add-on to the BNC, i.e., a BNC licence is required to make use of them.
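
As a rough illustration of the testbed use mentioned above, the following Python sketch computes a per-expression majority-class baseline, a common reference point when evaluating literal vs. non-literal disambiguation. The record layout (expression, label), the function name majority_baseline, and the example rows are assumptions made for this sketch. They do not reflect the actual IDIX release format or its label counts, and "break the ice" is used only as a placeholder expression.

```python
from collections import Counter, defaultdict

# Hypothetical records of the form (expression, label).
# These rows are invented for illustration; they are NOT taken from IDIX.
annotations = [
    ("rock the boat", "non-literal"),
    ("rock the boat", "non-literal"),
    ("rock the boat", "literal"),
    ("break the ice", "non-literal"),
    ("break the ice", "literal"),
    ("break the ice", "literal"),
    ("break the ice", "literal"),
]


def majority_baseline(records):
    """Return {expression: (majority_label, accuracy_of_always_guessing_it)}."""
    counts_by_expression = defaultdict(Counter)
    for expression, label in records:
        counts_by_expression[expression][label] += 1

    baseline = {}
    for expression, counts in counts_by_expression.items():
        label, count = counts.most_common(1)[0]
        baseline[expression] = (label, count / sum(counts.values()))
    return baseline


baseline = majority_baseline(annotations)
for expression, (label, accuracy) in sorted(baseline.items()):
    print(f"{expression}: majority={label}, accuracy={accuracy:.2f}")
```

A disambiguation method evaluated on such annotations would typically be expected to beat this baseline for each expression, since the baseline ignores the context entirely.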

Availability

The corpus will be made available soon.