OntoNotes Release 2.0
- Creators
Linguistic Data Consortium
Description
The corpus contains 400k words of Chinese newswire data (from Xinhua News Agency and Sinorama Magazine) and 300k words of English newswire data (from the Wall Street Journal). OntoNotes Release 2.0 adds the following to the corpus: 274k words of Chinese broadcast news data (from China Broadcasting System, China Central TV, China National Radio, China Television System and Voice of America); and 200k words of English broadcast news data (from ABC, CNN, NBC, Public Radio International and Voice of America).
Additional details
- Accuracy
Not specified.
- Completeness
Not specified.
- Conformity
Not specified.
- Consistency
Not specified.
- Credibility
Not specified.
- Processability
Not specified.
- Relevance
Not specified.
- Timeliness
Not specified.
- Understandability
Not specified.