Grammar Engineering for Deep Linguistic Processing (SS 2010)

Dan Flickinger

Yi Zhang

Valia Kordoni


Description

This course provides a hands-on introduction to the techniques and tools needed for building the precise, extensible grammars required both in research and in applications. Through a combination of lectures and in-class exercises, students will investigate the implementation of constraints in morphology, syntax, and semantics, working within the unification-based lexicalist framework of Head-driven Phrase Structure Grammar.

The course is heavily based on the open source DELPH-IN grammar engineering software tool-chain, including LKB, PET, [incr tsdb()], etc.


Course Info

Time: 16-19h
Office hours: Thursday 11-13h (after email contact)
Location: CIP Room
Type: Project Seminar
Credits: 5 LP M.Sc.

Requirements

Participants should have studied grammar formalisms or syntactic theory and have a basic understanding of constraint-based grammar. Participants should also be familiar with basic *nix commands, text editing (Emacs recommended), and CVS. Short tutorials will be given during the course if necessary.

The course combines lectures and lab sessions. Since this is a highly interactive, hands-on course, active class participation and on-time submission of assignments will be viewed favorably in grading. In addition, participants will give an oral presentation of their grammar implementation in order to fulfill the requirements of a Hauptseminar.

Schedule

# Date Topic Download
1 03.05.2010 General Introduction Slides   Exercise
2 04.05.2010 Typed Feature Structures & LKB Slides   Exercise
3 05.05.2010 Type Description Language (TDL) Slides   Exercise
4 06.05.2010 Grammar Matrix Slides   Exercise
5 07.05.2010 Minimal Recursion Semantics (MRS) Slides   Exercise
6 10.05.2010 Core Phenomena (Agreement, Modification, Argument Optionality) Slides   Exercise
7 11.05.2010 Long-Distance Dependencies Slides   Exercise
8 12.05.2010 Test Suites & Treebanks  
9 13.05.2010 (Christi Himmelfahrt)  
10 14.05.2010 Final Project  

Software Setup

LKB

Before starting LKB, make sure you have a tmp directory under your home directory. LKB uses it to cache large grammar lexicons:

 $ mkdir -p ~/tmp

To run LKB locally on a COLI machine:

 $ export DELPHINHOME=/proj/delphin
 $ emacs -l /proj/delphin/lkb/etc/dot.emacs

Then, in Emacs:

 M-x lkb
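
The local startup steps above can be wrapped in a small script. This is only a sketch: the function names prepare_lkb_env and start_lkb_emacs are made up for illustration and are not part of LKB, and the paths are the COLI ones given above.

```shell
#!/bin/sh
# Sketch of a launcher for LKB on a COLI machine.
# prepare_lkb_env and start_lkb_emacs are hypothetical helper names.

prepare_lkb_env() {
    # LKB caches large grammar lexicons in ~/tmp, so make sure it exists.
    mkdir -p "$HOME/tmp" || return 1
    # Use the COLI installation unless DELPHINHOME is already set.
    DELPHINHOME="${DELPHINHOME:-/proj/delphin}"
    export DELPHINHOME
}

start_lkb_emacs() {
    prepare_lkb_env || return 1
    init="$DELPHINHOME/lkb/etc/dot.emacs"
    # Fail early with a readable message if the settings file is missing.
    if [ ! -f "$init" ]; then
        echo "cannot find $init; is DELPHINHOME correct?" >&2
        return 1
    fi
    emacs -l "$init"
}
```

After sourcing the file, calling start_lkb_emacs opens Emacs with the LKB settings loaded; M-x lkb is still needed inside Emacs.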

To run LKB remotely on a COLI machine, you need a working X server on your own machine and SSH access to the COLI network:

 $ ssh -Y login.coli.uni-saarland.de
 $ ssh -Y cluster-1.coli.uni-saarland.de
 $ export DELPHINHOME=/proj/delphin
 $ emacs -l /proj/delphin/lkb/etc/dot.emacs

Of course, you can substitute any other available cluster node for cluster-1. Then, in Emacs:

 M-x lkb

Be sure to use the -Y or -X option to enable X11 forwarding if you are running LKB remotely.
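
With a reasonably recent OpenSSH client (7.3 or later), the two ssh hops can be combined into one command with the -J (jump host) option. The sketch below assumes that your COLI login is valid on both hosts and that the cluster nodes accept connections from the login host; the function name lkb_remote is made up for illustration.

```shell
#!/bin/sh
# Sketch: reach a cluster node through the login host in a single step.
# lkb_remote is a hypothetical helper name; -J requires OpenSSH 7.3+.

lkb_remote() {
    if [ -z "$1" ]; then
        echo "usage: lkb_remote USERNAME [NODE]" >&2
        return 1
    fi
    user="$1"
    # Default to cluster-1; any other available node name can be passed instead.
    node="${2:-cluster-1}"
    # -Y enables trusted X11 forwarding to the final host.
    ssh -Y -J "$user@login.coli.uni-saarland.de" "$user@$node.coli.uni-saarland.de"
}
```

For example, lkb_remote yourlogin cluster-1 should open a shell on the cluster node, after which DELPHINHOME is set and Emacs is started in the same way as in the two-step login.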

Alternatively, you may install a binary build of LKB on your local machine. Currently, Linux x86_32 and x86_64 platforms are supported by an automatic installation script. Binary builds for Solaris, Windows, and Mac OS (PPC-based) are also available for older versions of LKB. To install on Linux:

 $ mkdir ~/delphin
 $ export DELPHINHOME=~/delphin
 $ wget http://lingo.stanford.edu/ftp/etc/install
 $ chmod +x install
 $ ./install

Then follow the instructions given by the install script. If you run into any problems, please contact the lecturer.
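
Note that export only affects the current shell, so DELPHINHOME has to be set again in each new session. The sketch below records the setting in a shell startup file. It assumes that you use bash and that the install script has not already added such a line; the function name persist_delphinhome is made up for illustration.

```shell
#!/bin/sh
# Sketch: record DELPHINHOME in a shell startup file, without duplicating the line.
# persist_delphinhome is a hypothetical helper name.

persist_delphinhome() {
    rcfile="${1:-$HOME/.bashrc}"
    dir="${2:-$HOME/delphin}"
    line="export DELPHINHOME=\"$dir\""
    touch "$rcfile" || return 1
    # Append only if exactly this line is not in the file yet.
    grep -qxF "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
}
```

Calling persist_delphinhome without arguments writes to ~/.bashrc and uses ~/delphin, the directory created above.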

CVS Repository

There is a CVS repository at COLI hosting the grammars used in the course. The typical commands for checking it out are as follows:

Accessing from a COLI machine

  $ cvs -d /proj/delphin/CVS co ge-ss10

Accessing from a non-COLI machine

  $ cvs -d :ext:[username]@login.coli.uni-saarland.de:/proj/delphin/CVS co ge-ss10

Remember to replace [username] with your COLI login. If this does not work (as on some older *nix systems), run this before the CVS commands:

  $ export CVS_RSH=ssh
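
To avoid retyping the -d argument, the repository location can be placed in the CVSROOT environment variable, which cvs reads when -d is not given. The sketch below covers both access modes; the function names coli_cvsroot and checkout_course_grammars are made up for illustration.

```shell
#!/bin/sh
# Sketch: choose the CVS root for local or remote access, then check out ge-ss10.
# coli_cvsroot and checkout_course_grammars are hypothetical helper names.

coli_cvsroot() {
    if [ -n "$1" ]; then
        # Remote access over SSH with the given COLI login.
        printf ':ext:%s@login.coli.uni-saarland.de:/proj/delphin/CVS\n' "$1"
    else
        # Direct access on a COLI machine.
        printf '/proj/delphin/CVS\n'
    fi
}

checkout_course_grammars() {
    # CVS_RSH=ssh makes older cvs clients use ssh instead of rsh for :ext: access.
    CVS_RSH=ssh
    CVSROOT=$(coli_cvsroot "$1")
    export CVS_RSH CVSROOT
    cvs co ge-ss10
}
```

On a COLI machine, call checkout_course_grammars with no argument; from elsewhere, pass your COLI login as the argument. Inside the checked-out ge-ss10 directory, cvs update -d then fetches later changes, including new directories.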

Exercise Grammar Download

here

Grammar Matrix Type Hierarchy

here

References

[Copestake et al., 2005]
Ann Copestake, Dan Flickinger, Carl J. Pollard, and Ivan A. Sag. Minimal recursion semantics: an introduction. Research on Language and Computation, 3(4):281–332, 2005.

[Sag et al., 2003]
Ivan Sag, Thomas Wasow, and Emily Bender. Syntactic Theory: A Formal Introduction, Second Edition. CSLI, Stanford, USA, 2003.

[Copestake, 2002]
Ann Copestake. Implementing Typed Feature Structure Grammars. CSLI, Stanford, USA, 2002.

[Bender et al., 2002]
Emily Bender, Dan Flickinger, and Stephan Oepen. The Grammar Matrix: an open-source starter-kit for the rapid development of cross-linguistically consistent broad-coverage precision grammars. In Proceedings of the Workshop on Grammar Engineering and Evaluation at the 19th International Conference on Computational Linguistics, pages 8–14, Taipei, Taiwan, 2002.

[Krieger and Schäfer, 1994]
Hans-Ulrich Krieger and Ulrich Schäfer. TDL - a Type Description Language for HPSG. Technical Report RR-94-37, Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, 1994.

[Carpenter, 1992]
Bob Carpenter. The Logic of Typed Feature Structures. Cambridge University Press, Cambridge, UK, 1992.


Last modified: May 11 2010.
