Published July 14, 2020 | Version v1
Dataset
Open
Prepositional phrase attachment candidates (AmbiPP)
Description
This dataset contains prepositions together with their candidate heads, with an average of 3.15 candidates per preposition. It was extracted from TüBa-D/Z using the topological field distributions of prepositions and their heads. Each line starts with the following columns:
- TüBa-D/Z sentence number
- Preposition
- Preposition tag
- Preposition topological field
- Preposition object
- Preposition object tag
- Preposition object topological field
These columns are followed by the candidate heads. Each candidate head has the following columns:
- Head token
- Head tag
- Head topological field
- Absolute distance from the preposition (a negative value means the head precedes the preposition)
- Rank distance from the preposition
- Head/non-head (1/0)
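The column layout above can be read with a short parser. The following is a minimal sketch, not the authors' tooling: it assumes that columns are tab-separated (the record does not state the delimiter), and the names `Candidate`, `Instance`, `parse_line`, and `mean_candidates` are hypothetical. The sample line is invented for illustration and is not taken from the dataset.

```python
from dataclasses import dataclass
from typing import List

N_PREP_FIELDS = 7  # fixed columns describing the preposition and its object
N_HEAD_FIELDS = 6  # columns repeated for every candidate head


@dataclass
class Candidate:
    token: str
    tag: str
    field: str      # topological field of the candidate head
    distance: int   # negative: the candidate precedes the preposition
    rank: int       # rank distance from the preposition
    is_head: bool   # True for the 1/0 head column set to 1


@dataclass
class Instance:
    sentence: str
    prep: str
    prep_tag: str
    prep_field: str
    obj: str
    obj_tag: str
    obj_field: str
    candidates: List[Candidate]


def parse_line(line: str, sep: str = "\t") -> Instance:
    """Split one line into the fixed columns plus candidate groups.

    The tab separator is an assumption; pass another `sep` if needed.
    """
    cols = line.rstrip("\n").split(sep)
    head_cols = cols[N_PREP_FIELDS:]
    if len(cols) < N_PREP_FIELDS or len(head_cols) % N_HEAD_FIELDS != 0:
        raise ValueError(f"unexpected number of columns: {len(cols)}")
    candidates = []
    for i in range(0, len(head_cols), N_HEAD_FIELDS):
        token, tag, field, dist, rank, label = head_cols[i:i + N_HEAD_FIELDS]
        candidates.append(
            Candidate(token, tag, field, int(dist), int(rank), label == "1")
        )
    return Instance(*cols[:N_PREP_FIELDS], candidates)


def mean_candidates(instances: List[Instance]) -> float:
    """Average number of candidate heads per preposition."""
    return sum(len(i.candidates) for i in instances) / len(instances)


# Invented example line (not from the dataset), two candidate heads.
sample = "\t".join([
    "42", "mit", "APPR", "MF", "Fernglas", "NN", "MF",
    "sieht", "VVFIN", "LK", "-3", "2", "1",
    "Mann", "NN", "MF", "-1", "1", "0",
])
instance = parse_line(sample)
```

Applied to every line of the data file, `mean_candidates` should come out close to the 3.15 candidates per preposition stated in the description, provided the delimiter assumption holds.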
Other (English)
Research carried out in work package A03 of the SFB 833.
Files
CMDI.xml
Total size: 2.2 MB
| Name | Size |
|---|---|
| md5:d26d81bd8c050fc9b7fe091ce4762621 | 18.1 kB |
| md5:eb045767314cf0114eae69f8513aac5f | 2.1 MB |
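A downloaded file can be checked against the MD5 checksums listed for this record. The sketch below uses only Python's standard `hashlib`; the function name `file_md5` and the path passed to it are placeholders, since the local file name depends on how the file was saved.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large files need not fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the value after the `md5:` prefix; a mismatch suggests an incomplete or corrupted download.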
Additional details
Related works
- Is described by data paper: https://ceur-ws.org/Vol-1779/07dekok.pdf