Published March 14, 2017 | Version v1
Dataset Embargoed

de-nncom-sem - Dataset of German noun-noun compounds annotated with semantic relations (properties) and prepositions

Description

Contains 8005 compounds, each annotated with a semantic relation (a property) and a preposition. Each line of the file contains the following columns:

                 col1 - the data split (train, test, dev or test-iaa); these data splits were used to produce the results reported in chapter 7 of Dima (2019).

                 col2 - the compound, e.g. Dreiecktuch

                 col3 - the modifier - the first constituent of the word, e.g. Dreieck (note that, as in the case of Dreieck, the modifier can be a compound itself)

                 col4 - the head - the second constituent of the word, e.g. Tuch

                 col5 - the collapsed property (German name) - e.g. Aussehen/*; the collapsed property does not take into account the direction of the semantic relation

                 col6 - the individual property (German name) - e.g. Aussehen; the individual property does take into account the direction of the semantic relation

                 col7 - the direction - can be 1 (read as modifier relation head, e.g. Träger Teil Rock for Trägerrock) or 2 (read as head relation modifier, e.g. Mitte Teil Brücke or Brücke Teil* Mitte for Brückenmitte).

                 col8 - the English name of the collapsed property, e.g. appearance/*

                 col9 - the English name of the individual property, e.g. appearance

                 col10 - the German preposition associated with the compound
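
The column layout above can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: it assumes the file is tab-separated and has no header row, neither of which is stated in this record. The names CompoundRow, read_rows and reading are hypothetical, and the direction and preposition values in the sample line are placeholders rather than values taken from the data.

```python
import csv
import io
from typing import NamedTuple


class CompoundRow(NamedTuple):
    """One line of the file; field names are illustrative, not official."""
    split: str                  # col1: train, test, dev or test-iaa
    compound: str               # col2: e.g. Dreiecktuch
    modifier: str               # col3: first constituent
    head: str                   # col4: second constituent
    collapsed_property_de: str  # col5: e.g. Aussehen/*
    property_de: str            # col6: e.g. Aussehen
    direction: int              # col7: 1 or 2
    collapsed_property_en: str  # col8: e.g. appearance/*
    property_en: str            # col9: e.g. appearance
    preposition: str            # col10: German preposition


def read_rows(lines, delimiter="\t"):
    """Yield one CompoundRow per non-empty line.

    The tab delimiter is an assumption; pass another one if the file differs.
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) != 10:
            raise ValueError(f"expected 10 columns, got {len(fields)}")
        row = CompoundRow(*fields)
        yield row._replace(direction=int(row.direction))


def reading(modifier, head, relation, direction):
    """Spell out a relation in the order implied by the direction column."""
    if direction == 1:
        return f"{modifier} {relation} {head}"  # modifier relation head
    if direction == 2:
        return f"{head} {relation} {modifier}"  # head relation modifier
    raise ValueError(f"direction must be 1 or 2, got {direction}")


if __name__ == "__main__":
    # Placeholder line: direction "1" and "PREP" are NOT taken from the dataset.
    sample = ("train\tDreiecktuch\tDreieck\tTuch\tAussehen/*\tAussehen\t1\t"
              "appearance/*\tappearance\tPREP\n")
    for row in read_rows(io.StringIO(sample)):
        print(row.split, row.compound, row.modifier, row.head)
    print(reading("Träger", "Rock", "Teil", 1))
    print(reading("Brücke", "Mitte", "Teil", 2))
```

The two reading calls reproduce the examples given for col7: Träger Teil Rock for Trägerrock (direction 1) and Mitte Teil Brücke for Brückenmitte (direction 2).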

The annotations were created in the A3 project of the SFB 833 at the University of Tübingen. See Telljohann et al. (2017) for a description of the annotation guidelines.

Files

Embargoed

The files will be made publicly available on September 30, 2026.

Reason: Researcher has not attributed a public licence to his/her research data.

Additional details

Funding

Deutsche Forschungsgemeinschaft
SFB 833: Emergence of Meaning: The Dynamics and Adaptivity of Linguistic Structures (award no. 75650358)

Data quality

Accuracy

Not applicable.

Completeness

Not applicable.

Conformity

Not applicable.

Consistency

Not applicable.

Credibility

Not applicable.

Processability

Not applicable.

Relevance

Not applicable.

Timeliness

Not applicable.

Understandability

Not applicable.