

Data formats

The tool expects the corpora to be annotated in TIGER XML [5]. TIGER XML describes trees with node and edge labels. Secondary edges additionally allow a node to have more than one parent, which gives the format the full expressive power of directed acyclic graphs (DAGs). The base format for the corpora to be annotated is therefore highly flexible. Corpora in other formats can be imported with the TIGERRegistry administration tool, which is available at http://www.ims.uni-stuttgart.de/projekte/TIGER/.
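To make the tree-plus-secondary-edge idea concrete, the following sketch reads one hand-written sentence graph and lists its primary and secondary edges. The element and attribute names (s, graph, terminals, t, nonterminals, nt, edge, secedge, idref) follow the TIGER XML format as commonly documented, but they should be checked against the TIGER XML schema. The sentence, its IDs and labels, and the function name read_graph are made up for illustration and are not part of the tool. The secondary edge is simply reported as a (containing node, label, idref) triple, without interpreting its direction.

```python
import xml.etree.ElementTree as ET

# Hypothetical sentence graph for "Peter kommt und lacht" (illustration only).
# "Peter" is the subject (SB) of the first clause via a primary edge and is
# linked to the second clause via a secondary edge.
SAMPLE = """\
<s id="s1">
  <graph root="s1_502">
    <terminals>
      <t id="s1_1" word="Peter" pos="NE">
        <secedge label="SB" idref="s1_501"/>
      </t>
      <t id="s1_2" word="kommt" pos="VVFIN"/>
      <t id="s1_3" word="und" pos="KON"/>
      <t id="s1_4" word="lacht" pos="VVFIN"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="S">
        <edge label="SB" idref="s1_1"/>
        <edge label="HD" idref="s1_2"/>
      </nt>
      <nt id="s1_501" cat="S">
        <edge label="HD" idref="s1_4"/>
      </nt>
      <nt id="s1_502" cat="CS">
        <edge label="CJ" idref="s1_500"/>
        <edge label="CD" idref="s1_3"/>
        <edge label="CJ" idref="s1_501"/>
      </nt>
    </nonterminals>
  </graph>
</s>
"""


def read_graph(sentence_xml):
    """Return (words, edges, secedges) for one <s> element.

    words    maps terminal id -> word form
    edges    lists (parent id, label, child id) for primary edges
    secedges lists (containing node id, label, idref) for secondary edges
    """
    graph = ET.fromstring(sentence_xml).find("graph")
    words = {t.get("id"): t.get("word") for t in graph.iter("t")}
    edges, secedges = [], []
    # Visit terminals first, then nonterminals, in document order.
    for node in list(graph.iter("t")) + list(graph.iter("nt")):
        for e in node.findall("edge"):
            edges.append((node.get("id"), e.get("label"), e.get("idref")))
        for e in node.findall("secedge"):
            secedges.append((node.get("id"), e.get("label"), e.get("idref")))
    return words, edges, secedges


words, edges, secedges = read_graph(SAMPLE)
print(len(edges), "primary edges;", len(secedges), "secondary edge(s)")
```

In this sample the primary edges alone form a tree rooted in s1_502. The single secondary edge connects s1_1 to a second clause node, which is what turns the tree into a DAG.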

The structures that can be annotated on top of this syntactic input structure are either flat tree structures or embedded structures; see Section 6.5.



Aljoscha Burchardt 2007-09-04