There is a newer version of the record available.

Published November 21, 2012 | Version 2.0
Dataset Restricted

OntoNotes Release 2.0

Description

The corpus contains 400k words of Chinese newswire data (from Xinhua News Agency and Sinorama Magazine) and 300k words of English newswire data (from the Wall Street Journal). OntoNotes Release 2.0 adds the following to the corpus: 274k words of Chinese broadcast news data (from China Broadcasting System, China Central TV, China National Radio, China Television System and Voice of America) and 200k words of English broadcast news data (from ABC, CNN, NBC, Public Radio International and Voice of America).

Files
Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Created:
May 8, 2024
Modified:
January 17, 2025