The complete annotation process
The standard annotation process includes three steps:
- Corpus creation:
- Admin extracts one or more files (sub-corpora) to be annotated
from a given corpus.
- The standard means for this task is TIGERSearch, which is incorporated into the SALSA tool and
supports queries on the corpus. The query results are written to new files (i.e. sub-corpora).
- The sub-corpora are then distributed to the annotators'
in directories.
- Annotation:
- Users find new sub-corpora in their in
directories and move them to their work directories for
annotation. Annotation proceeds one file (sub-corpus) at a
time.
- During annotation, the User selects the words/constituents to be
annotated with the mouse.
- A file-specific list of structures to be annotated (e.g. frames
in SALSA)
- has either been defined by Admin when the file was generated, or
- is specified/modified by the User during the annotation process.
- After the annotation is finished, the User moves the annotated
sub-corpus to the out directory.
- Adjudication:
- Admin collects the annotated sub-corpora from the
Users' out directories. These are then adjudicated, i.e.
checked for correctness, within the tool. Currently, two annotations
of the same sub-corpus at a time can be merged into one final
version.
More details on these steps can be found in the subsequent Sections 5 (corpus creation), 6 (annotation), and 7 (adjudication).
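
The movement of a sub-corpus through the in, work, and out directories described above can be sketched in a few lines of Python. This is only an illustration of the file flow, not part of the SALSA tool: the directory names in, work, and out come from the text, while the per-user layout <root>/<user>/..., the function names, and the choice to copy (rather than move) files during distribution are assumptions made for the example.

```python
import shutil
from pathlib import Path

STAGES = ("in", "work", "out")  # directory names taken from the text


def make_user_dirs(root: Path, user: str) -> Path:
    """Create the hypothetical <root>/<user>/{in,work,out} layout."""
    user_dir = root / user
    for stage in STAGES:
        (user_dir / stage).mkdir(parents=True, exist_ok=True)
    return user_dir


def distribute(sub_corpus: Path, user_dirs: list[Path]) -> None:
    """Admin: copy one sub-corpus file into each annotator's in directory."""
    for user_dir in user_dirs:
        shutil.copy(sub_corpus, user_dir / "in" / sub_corpus.name)


def advance(user_dir: Path, name: str, src: str, dst: str) -> Path:
    """User: move a sub-corpus from one stage directory to the next."""
    target = user_dir / dst / name
    shutil.move(str(user_dir / src / name), str(target))
    return target


def collect(user_dirs: list[Path], name: str) -> list[Path]:
    """Admin: list all finished copies of a sub-corpus for adjudication."""
    return [d / "out" / name for d in user_dirs if (d / "out" / name).exists()]
```

In this sketch, a User would call advance(user_dir, name, "in", "work") before annotating and advance(user_dir, name, "work", "out") afterwards. Admin would then call collect to find the copies that are ready to be adjudicated, two at a time, within the tool.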
Aljoscha Burchardt
2007-09-04