Differences between revisions 2 and 3
Revision 2 as of 2010-08-15 07:06:56
Size: 2167
Revision 3 as of 2011-11-18 15:58:17
Size: 1386

== Text corpora ==
||'''Corpus name''' ||'''Description''' ||'''CQP''' ||'''Source''' ||'''Contact''' ||
||[[TextCorpora/Reuters|Reuters]] ||Distributed on two CDs; contains about 810,000 English-language Reuters news stories. The uncompressed files require about 2.5 GB of storage. || ||[[http://trec.nist.gov/data/reuters/reuters.html|Reuters Corpora @ NIST|class=white]] || ||
||[[TextCorpora/GermanWikipedia|German Wikipedia]] ||German Wikipedia articles || (./) ||[[http://www.de.wikipedia.org|http://www.de.wikipedia.org|class=white]] ||LukasMichelbacher ||
||[[TextCorpora/EnglishWikipedia|English Wikipedia]] ||English Wikipedia articles || (./) ||[[http://www.en.wikipedia.org|http://www.en.wikipedia.org|class=white]] ||LukasMichelbacher ||

Resources of the StatNLP group


Tools

TreeTagger

The TreeTagger is a tool for automatic annotation of text corpora with part-of-speech and lemma information.

RFTagger

The RFTagger is a part-of-speech (POS) tagger for fine-grained tagsets.

SFST

SFST is a toolbox for implementing morphological analysers and other programs based on finite-state transducers.

SMOR

SMOR is a German finite-state morphology implemented in the SFST programming language. An older version of SMOR with a few sample lexicon entries comes with the SFST tools (see above).

LoPar

LoPar is a parser for head-lexicalized probabilistic context-free grammars.

BitPar

BitPar is an efficient parser for Treebank grammars.

Trace Parser

The Trace Parser is a BitPar-based English parser that generates analyses with traces.

YAP

YAP is a fast parser for feature-based grammars.

VPF

VPF is a parse forest browser for feature-structure-based grammars.


extern/StatNLPResources (last edited 2013-03-01 08:45:27 by HinrichSchuetze)