This package contains a set of shell scripts which simplify tagging
with the TreeTagger. The scripts are located in the cmd subdirectory.
To call these scripts from other directories, replace the relative
paths in the scripts with absolute paths and add the path of the cmd
subdirectory to the command search path.

----------------------------------------------------

cmd/tree-tagger-english *
    This is a script for tagging English text. It does tokenization
    and tagging. The names of the files to be tagged are expected as
    arguments. If no files are specified, input is read from stdin.

cmd/tree-tagger-french
    This is a script for tagging French text. It does tokenization,
    tagging and some error correction. It has been provided by
    Dr. Achim Stein from the Institut fuer Romanistik, Universitaet
    Stuttgart. Start the script with the -h option to get a
    description.

cmd/tree-tagger-italian
    This is a script for tagging Italian text. It does tokenization
    and tagging. It has been provided by Dr. Achim Stein from the
    Institut fuer Romanistik, Universitaet Stuttgart. Start the
    script with the -h option to get a description.

cmd/tree-tagger-german *
    This is a script for tagging German text. It does tokenization,
    tagging and some error correction. The names of the files to be
    tagged are expected as arguments. If no files are specified,
    input is read from stdin.

cmd/tagger-chunker-german *
    This is a script for tagging and chunking German text. It does
    tokenization, tagging and annotation with nominal and verbal
    chunks. The names of the files to be tagged are expected as
    arguments. If no files are specified, input is read from stdin.

cmd/tagger-chunker-english *
    A similar script for tagging and chunking English text.

cmd/lookup.perl *
    You can use this pretagging script to extend the tagger lexicon
    without generating a new parameter file. See the script itself
    for more information.
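As a rough illustration of the setup and calling conventions described
above, the following sketch adds the cmd subdirectory to the search path
and then tags text from a file and from stdin. The installation directory
$HOME/treetagger, the variable TREETAGGER_HOME and the file names
input.txt and input.tagged are assumptions made for this example, not
names used by the package.

```shell
# Assumption: the package was unpacked into $HOME/treetagger (hypothetical).
TREETAGGER_HOME="${TREETAGGER_HOME:-$HOME/treetagger}"

# Add the cmd subdirectory to the command search path.
PATH="$TREETAGGER_HOME/cmd:$PATH"
export PATH

# Run the tagger only if the script can actually be found.
if command -v tree-tagger-english >/dev/null 2>&1; then
    # File names given as arguments (input.txt is a hypothetical file) ...
    [ -f input.txt ] && tree-tagger-english input.txt > input.tagged
    # ... or, with no file arguments, input read from stdin.
    echo "This is a short test sentence." | tree-tagger-english
fi
```

Changing PATH this way does not replace the step of editing the relative
paths inside the scripts themselves.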
----------------------------------------------------

These files are needed by the shell scripts and are also contained
in this tar-file:

cmd/filter-german-tags            error correction script
lib/english-abbreviations         list of English abbreviations
lib/german-abbreviations          list of German abbreviations
cmd/filter-chunker-output.perl    reformatting of the chunker output

----------------------------------------------------

These files are needed by the shell scripts and are *not* contained
in this tar-file. They can be downloaded either as Linux programs or
as SunOS programs from the following URL:

http://www.ims.uni-stuttgart.de/Tools/DecisionTreeTagger.html

The tar files should be unpacked in the same directory as the
scripts. The parameter files should be moved to the lib subdirectory
and uncompressed.

bin/tree-tagger             the tagger program proper
bin/separate-punctuation    tokenizer program
lib/english.par             English parameter file for the tagger
lib/german.par              German parameter file for the tagger
lib/french.par              French parameter file for the tagger
lib/italian.par             Italian parameter file for the tagger
lib/german-chunker.par      German parameter file for the chunker
lib/english-chunker.par     English parameter file for the chunker
lib/english-ctagger.par     English parameter file for the tagger chunker
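
The step of moving the parameter files to the lib subdirectory and
uncompressing them could be sketched as follows. This is only an
illustration: it assumes that the downloaded parameter files are
gzip-compressed, carry a .par.gz suffix and sit in the top-level
directory of the package. The function name install_params is
hypothetical.

```shell
# Sketch: move compressed parameter files into lib/ and uncompress them.
# Assumption: the files are gzip-compressed and named *.par.gz.
install_params() {
    dir="$1"                       # directory containing the scripts
    mkdir -p "$dir/lib"
    for f in "$dir"/*.par.gz; do
        [ -e "$f" ] || continue    # no matching files: nothing to do
        mv "$f" "$dir/lib/"
        # gunzip replaces lib/NAME.par.gz with lib/NAME.par
        gunzip -f "$dir/lib/$(basename "$f")"
    done
}
```

The function would be called with the package directory as its only
argument, for example: install_params "$HOME/treetagger"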