Published March 14, 2017 | Version v1
Dataset Restricted

German compound splitting dataset

  • 1. University of Tübingen

Description

This dataset contains the compounds used in Ma et al. (2016), "Letter Sequence Labeling for Compound Splitting". It includes both two-constituent and multi-constituent compounds. Because standard evaluation also involves non-compounds, the data also include the non-compounds that we used. The data are organized into exactly the same training/test/development split as in the paper.

Other (English)

Research carried out in work package A03 of the SFB 833.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Related works

Is described by
Text: http://anthology.aclweb.org/W16-2012 (URL)

Funding

Deutsche Forschungsgemeinschaft
SFB 833: Bedeutungskonstitution - Dynamik und Adaptivität sprachlicher Strukturen 75650358

Data quality

Accuracy

Not specified.

Completeness

Not specified.

Conformity

Not specified.

Consistency

Not specified.

Credibility

Not specified.

Processability

Not specified.

Relevance

Not specified.

Timeliness

Not specified.

Understandability

Not specified.