Published July 14, 2020 | Version v1
Dataset Open

Prepositional phrase attachment candidates (AmbiPP)

  • University of Tübingen

Description

This dataset contains prepositions together with their candidate heads, with an average of 3.15 candidates per preposition. It was extracted from TüBa-D/Z using the topological field distributions of prepositions and their heads. Each line begins with the following columns:

  • TüBa-D/Z sentence number
  • Preposition
  • Preposition tag
  • Preposition topological field
  • Preposition object
  • Preposition object tag
  • Preposition object topological field

These columns are followed by the candidate heads. Each candidate head is described by the following columns:

  • Head token
  • Head tag
  • Head topological field
  • Absolute distance from the preposition (a negative value means the candidate precedes the preposition)
  • Rank distance from the preposition
  • Head/non-head (1/0)
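The column layout above can be read with a few lines of Python. The following is a minimal sketch, not the authors' tooling. It assumes that columns are tab-separated, that both distance columns hold integers, and that the head flag is the literal string 1 or 0; none of this is stated in the description, so adjust `sep` and the conversions after inspecting the actual file. The names `Candidate`, `Instance`, `parse_line`, and `mean_candidates` are hypothetical, and the sample line is invented for illustration rather than taken from the data.

```python
from dataclasses import dataclass
from typing import List

N_PREP_FIELDS = 7  # columns describing the preposition and its object
N_HEAD_FIELDS = 6  # columns describing each candidate head


@dataclass
class Candidate:
    token: str
    tag: str
    field: str
    abs_distance: int   # negative: candidate precedes the preposition
    rank_distance: int
    is_head: bool


@dataclass
class Instance:
    sentence: str
    prep: str
    prep_tag: str
    prep_field: str
    obj: str
    obj_tag: str
    obj_field: str
    candidates: List[Candidate]


def parse_line(line: str, sep: str = "\t") -> Instance:
    """Split one line into the fixed columns plus groups of six per candidate."""
    cols = line.rstrip("\n").split(sep)
    rest = cols[N_PREP_FIELDS:]
    if len(cols) < N_PREP_FIELDS or len(rest) % N_HEAD_FIELDS != 0:
        raise ValueError(f"unexpected number of columns: {len(cols)}")
    candidates = []
    for i in range(0, len(rest), N_HEAD_FIELDS):
        tok, tag, field, dist, rank, flag = rest[i:i + N_HEAD_FIELDS]
        candidates.append(
            Candidate(tok, tag, field, int(dist), int(rank), flag == "1")
        )
    return Instance(*cols[:N_PREP_FIELDS], candidates)


def mean_candidates(instances: List[Instance]) -> float:
    """Average number of candidate heads per preposition."""
    return sum(len(inst.candidates) for inst in instances) / len(instances)


# Invented sample line (assumption: tab-separated), two candidate heads.
sample = "\t".join([
    "1", "mit", "APPR", "MF", "Fernglas", "NN", "MF",
    "sieht", "VVFIN", "LK", "-3", "2", "1",
    "Mann", "NN", "MF", "-1", "1", "0",
])
instance = parse_line(sample)
```

Running `mean_candidates` over all parsed lines of the real file is one way to check the parse against the reported average of 3.15 candidates per preposition.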

Other (English)

Research carried out in work package A03 of the SFB 833.

Files

CMDI.xml

Files (2.2 MB)

  • md5:d26d81bd8c050fc9b7fe091ce4762621 (18.1 kB)
  • md5:eb045767314cf0114eae69f8513aac5f (2.1 MB)
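
The MD5 checksums listed for the files can be used to confirm that a download is intact. The sketch below uses Python's standard `hashlib` module. The function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the path passed in is whatever local name the downloaded file was saved under, since the file names are not given in the listing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:eb045767314cf0114eae69f8513aac5f")` should return `True` for an uncorrupted copy of the 2.1 MB file, where `local_path` is a placeholder for the saved file's location.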

Additional details

Related works

Is described by
Data paper: https://ceur-ws.org/Vol-1779/07dekok.pdf

Funding

Deutsche Forschungsgemeinschaft
SFB 833: Bedeutungskonstitution - Dynamik und Adaptivität sprachlicher Strukturen, 75650358
