== Text corpora ==
||'''Corpus name''' ||'''Description''' ||'''CQP''' ||'''Source''' ||'''Contact''' ||
||[[TextCorpora/Reuters|Reuters]] ||Distributed on two CDs; contains about 810,000 English-language Reuters news stories. The uncompressed files require about 2.5 GB of storage. || ||[[http://trec.nist.gov/data/reuters/reuters.html|Reuters Corpora @ NIST|class=white]] || ||
||[[TextCorpora/GermanWikipedia|German Wikipedia]] ||German Wikipedia articles || (./) ||[[http://www.de.wikipedia.org|http://www.de.wikipedia.org|class=white]] ||LukasMichelbacher ||
||[[TextCorpora/EnglishWikipedia|English Wikipedia]] ||English Wikipedia articles || (./) ||[[http://www.en.wikipedia.org|http://www.en.wikipedia.org|class=white]] ||LukasMichelbacher ||
Back to [[extern/StatNLPGroup|StatNLP Group]]
Resources of the StatNLP group
Tools
The TreeTagger is a tool for automatic annotation of text corpora with part-of-speech and lemma information.
The RFTagger is a POS tagger for fine-grained POS tagsets.
SFST is a toolbox for the implementation of morphological analysers and other programs which are based on finite state transducers.
SMOR
SMOR is a German finite-state morphology implemented in the SFST programming language. An older version of SMOR with a few sample lexicon entries comes with the SFST tools (see above).
LoPar is a parser for head-lexicalized probabilistic context-free grammars.
BitPar is an efficient parser for Treebank grammars.
Trace Parser
A BitPar-based English parser that generates analyses with traces.
YAP
YAP is a fast parser for feature-based grammars.
VPF
VPF is a parse forest browser for feature-structure-based grammars.