Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes
Nordic Graduate School Course: Modeling Information Structure in Discourse and Dialogue Processing

Modeling Information Structure in Discourse and Dialogue Processing

Ivana Kruijff-Korbayová

Nordic Graduate School, April 3-8 2005

This course is based on its predecessor, a course I taught at ESSLLI 2004 in Nancy. I have slightly modified the material.


Introduction

The goal of the course is to help participants orient themselves in the existing work on Information Structure (IS) and to inspire them to apply it. We will explain basic notions, give an overview of the main approaches to modelling IS, survey work employing IS in NLP systems, and sketch future challenges. Hands-on experience will be included in the form of corpus analysis or system implementation exercises.

Information Structure (IS) concerns structural and semantic properties of utterances reflecting their communicative intentions and their relation to the discourse context. Among the means of realizing IS in many languages are word order, intonation and marked syntactic constructions, although languages differ in how they employ them. Modelling these phenomena and their interaction in the grammar requires understanding IS and its role in discourse. IS is therefore an important aspect of meaning at the interface between utterance and discourse, one that computational models of discourse processing should take into account.

Various dichotomies are used to describe IS, e.g. Theme-Rheme, Topic-Comment/Focus, Background-Focus/Kontrast, Given-New and Contextually Bound-Nonbound. The proliferating and often under-formalized terminologies are one reason why it is difficult to find one's way in the existing formal and computational work on IS and to advance the study and modeling of the interaction of IS and discourse. What is needed is further systematization of the diverse terminologies, formalization, and empirical and corpus-based studies.


Course Overview and Literature

  • Lecture 1: "Information Structure as an Inherent Aspect of Sentence Meaning".
    Motivation for IS-sensitivity in discourse and dialog processing. Introduction of basic notions of IS-partitioning. The question test for IS. IS realization means. IS semantics. Meaning differences due to IS. IS-sensitive context update.
    Slides: [PDF]
    Literature:
    Kruijff-Korbayova and Steedman: Discourse and information structure. JOLLI. 2003. [PDF (prepublication version)]
    Hajicova: Issues of sentence structure and discourse patterns. Chapter 2. 1993.
    Krifka: Focus and presupposition in dynamic semantics. 1993.
    Kruijff-Korbayova and Webber: Information Structure and the Semantics of "otherwise". ESSLLI workshop 2001. [PDF]
    Vallduvi and Vilkuna: On rheme and kontrast. Syntax and Semantics, Vol. 29. 1998.
    Vallduvi and Engdahl: The linguistic realization of information packaging. Linguistics. 1996.

  • Lecture 2: "The Praguian Topic-Focus Articulation. Givenness/Familiarity/Salience. Application of IS to Salience Modeling in Analysis and Generation."
    IS partitioning based on the Prague School approach: Topic-Focus Articulation (TFA). Basic notions: dependency-based linguistic meaning, systemic ordering vs. communicative dynamism, contextual boundness/nonboundness, stock of shared knowledge, salience/activation of entities in the SSK. Modeling salience w.r.t. TFA. Applications: IS-sensitive salience modeling for anaphora resolution/generation.
    Slides: [PDF]
    Literature:
    Hajicova: Issues of sentence structure and discourse patterns. Chapters 2 and 3. 1993.
    Sgall et al.: The meaning of the sentence in its semantic and pragmatic aspects. 1986.
    Firbas: Functional Sentence Perspective in Written and Spoken Communication. 1992.
    Danes: On Prague School Functionalism in Linguistics. (extract). 1995.
    Hajicova et al.: An automatic procedure for topic-focus identification. CL Journal 1995. [PDF]
    Hajicova et al.: Hierarchy of salience and discourse analysis and production. COLING 1990. [PDF]
    Hajicova et al.: Stock of shared knowledge - a tool for solving pronominal anaphora. COLING 1992. [PDF]
    Strube and Hahn: Functional centering: Grounding referential coherence in information structure. Jo. of CL. 1999. [PDF]
    Krahmer and Theune: Efficient Context-Sensitive Generation of Referring Expressions. 2002. [PS]
    Stys and Zemke: Incorporating Discourse Aspects in English-Polish MT: Towards Robust Implementation. 1995. [Zipped PS]
    Prince: Toward a taxonomy of given-new information. 1981. [PDF]
    Prince: The ZPG Letter: subjects, definiteness and information status. 1992. [PS]
    Gundel et al.: Cognitive status and the form of referring expressions in discourse. Language. 1993.
    Grosz et al.: Centering Theory. CL Journal 1995.
    Walker et al. (eds.): Centering Theory in Discourse. 1998.
    Buranova et al.: Tagging of very large corpora: Topic-focus articulation. COLING 2000. [PDF]
    Hajicova and Sgall: Topic-Focus and Salience. ACL 2001. [PDF]

  • Lecture 3: "Steedman's Approach: Two Dimensions of IS. Application of IS to Intonation and to Multimodal Realization."
    Theme/Rheme and Background/Focus according to Steedman. Interpretation of IS based on alternative sets. Prevost's applications: IS use to control intonation of answers to questions; IS use in monologue generation and to control its spoken realization; IS use in text-to-speech synthesis. Cassell et al.'s applications of IS to control multimodal realization, i.e., IS correlation with gesture and with gaze.
    Slides: [PDF]
    Literature:
    Steedman: Information structure and the syntax-phonology interface. Linguistic Inquiry. 2000. [GZipped PS (prepublication version)]
    Hirschberg and Pierrehumbert: Intonational structuring of discourse. ACL 1986. [PDF]
    Hirschberg: Pitch accent in context: Predicting intonational prominence from text. AI. 1993.
    Prevost and Steedman: Generating Contextually Appropriate Intonation. EACL 1993. [PDF]
    Prevost and Steedman: Specifying intonation from context for speech synthesis. Speech Communication. 1994.
    Prevost: An Information Structural Approach to Spoken Language Generation. ACL. 1996. [PDF]
    Hiyakumoto et al.: Semantic and Discourse Information for Text-to-Speech Intonation. ACL Workshop. 1997. [PDF]
    Kruijff-Korbayova et al.: Producing contextually appropriate intonation in an information-state based dialogue system. EACL 2003. [PDF] [Siridus project experiment website]
    Baker, Clark and White: Synthesising Contextually Appropriate Intonation in Limited Domains. ISCA Speech Synthesis Workshop. 2004. [PDF]
    Moore, Foster, Lemon and White: Generating Tailored, Comparative Descriptions in Spoken Dialogue. FLAIRS. 2004. [PDF]
    Pelachaud et al.: Synthesizing cooperative conversation. 1998. [Soft link]
    Cassell et al.: Turn taking vs. discourse structure: How best to model multimodal conversation. 1999. [PDF]
    Cassell et al.: Coordination and context-dependence in the generation of embodied conversation. INLG 2000. [PDF]

  • Lecture 4: "Vallduví's Information Packaging. File-Change Semantics of IP. Halliday's Thematic Structure vs. Information Structure. Danes' Thematic Sequences."
    Information packaging according to Vallduví: Ground: Link+Tail vs. Focus. Interpretation of IS in terms of file-change instructions. Theme/Rheme vs. Given/New partitioning according to Halliday. Theme-first principle and the thematic structure of texts. Thematic Sequences.
    Slides: [PDF]
    Literature:
    Vallduvi: The dynamics of information packaging. 1994. [PDF]
    Hendriks and Dekker: Links without Locations. Amsterdam Colloquium. 1995.
    Hoffman: Integrating "free" word order syntax and information structure. EACL 1995. [PDF]
    Hoffman: Translating into free word order languages. COLING 1996. [PDF]
    Stys and Zemke: Incorporating Discourse Aspects in English-Polish MT: Towards Robust Implementation. 1995. [Zipped PS]
    Grosz et al.: Centering Theory. CL Journal 1995. [PDF]
    Halliday: Notes on transitivity and theme in English -- Part 2. Jo. of Linguistics. 1967.
    Halliday: Introduction to Functional Grammar. 1985.
    Kruijff-Korbayova et al.: Generation of contextually appropriate word order. 2002. [GZipped PS (prepublication version)]
    Danes: Functional sentence perspective and the organization of the text. 1974.

  • Lecture 5: "Wrapping Up and Looking Out"
    Comparison of theories, aligning the different terminologies. How to test claims about IS? Empirical and corpus-based studies. Evaluation of practical applications in systems. Example evaluation of intonation assignment in the Godis system. Corpus annotation with IS and/or IS-relevant notions, e.g., annotation of TFA in the Prague Dependency Treebank project. Annotation of IS-relevant features at multiple levels in the MULI project. Annotation of familiarity status.
    Slides: [PDF]
    Literature:
    Kruijff-Korbayova and Steedman: Discourse and information structure. JOLLI. 2003. [PDF (prepublication version)]
    Kruijff-Korbayova et al.: Producing contextually appropriate intonation in an information-state based dialogue system. EACL 2003. [PDF] [Siridus project experiment website]
    Baker, Clark and White: Synthesising Contextually Appropriate Intonation in Limited Domains. ISCA Speech Synthesis Workshop. 2004. [PDF]
    Buranova et al.: Tagging of very large corpora: Topic-focus articulation. COLING 2000. [PDF]
    Baumann et al.: Multi-dimensional annotation of linguistic corpora for investigating information structure. NAACL/HLT Workshop 2004. [Gzipped PS]
    Nissim et al.: An Annotation Scheme for Information Status in Dialog. LREC. 2004.
    Poesio: The MATE/GNOME Scheme for Anaphoric Annotation, Revisited. SIGDIAL. 2004. [PDF] See also GNOME project.

Exercises

  • Discussion of information structure realization in Finnish, Latvian and Lithuanian.
  • IS analysis of a short text (Letter from Massie). Salience assignment based on IS.
  • Intonation control in TTS using the MARY System.
  • Parsing and generating with intonation marking in OpenCCG.
  • IS assignment in sentence planning.