DiaBruck 2003 Tutorial:
Best Practice in Empirically-based Dialogue Research

David Traum, Institute for Creative Technologies, University of Southern California, Marina del Rey, California, USA
Laurent Romary, LORIA, Nancy, France
Michael Strube, EML Research gGmbH, Heidelberg, Germany


  1. Introduction (81 KB)
    • Why do we (semanticists, dialogue researchers, dialogue system developers) need empirical data?
    • Why do we need to collect data (instead of making them up)?
    • Why should we use proper methods for collecting and annotating data?
    • The museum of annotation (1.75 MB): An illustrated history of annotation
  2. Corpus development and use life-cycle (51 KB)
  3. Corpus analysis and annotation
  4. Representation, data format, standards (661 KB)
    • Stand-off annotation
    • Multi-level annotation
    • XML
    • ISO standardization
  5. Annotation tool (MMAX) (180 KB)
  6. What are the annotated data good for? (31 KB)
    • Data generation
    • Machine learning
    • Evaluation
  7. Discussion (36 KB)
  8. Literature (15 KB)