CRC 441
The relationship between empiricism and theory in linguistics, a central point of interest in the SFB 441, was investigated by project A1 with regard to electronically available text corpora. Large electronically accessible corpora open up a new source of data, which is an important means for developing linguistic theories. The usefulness of data pooled in corpora depends mainly on the method used to prepare them. For this reason, the aim of this project was twofold: first, theory-neutral modes of representing the data were defined; second, the accessibility of the data was investigated. These two aims were pursued in the three sectors of the project, which cluster around:
- providing complex automatic annotations
- defining a query language for corpora with complex annotation schemes
- providing theory-dependent representations for corpora with complex annotation schemes.