Published September 20, 2004 | Version v1
Dataset
Restricted
Tübinger Partiell Geparstes Korpus des Deutschen/Schriftsprache
Creators
Description
TüPP-D/Z is a collection of articles from the taz newspaper ("die tageszeitung") that have been automatically annotated with clause structure, topological fields, and chunks, in addition to lower-level annotation such as parts of speech and morphological ambiguity classes. All texts have been processed automatically, starting with paragraph, sentence, and token segmentation. Word forms carry information about some regular types of named entities, including dates, telephone numbers, and number/unit combinations.
Files
Additional details
Additional titles
- Alternative title (English): Tübingen Partially Parsed Corpus of Written German