British National Corpus
Description
The BNC contains about 100 million words: 90% written text and 10% orthographically transcribed spoken text. The written part includes, for example, extracts from regional and national newspapers, specialist periodicals and journals for all ages and interests, academic books and popular fiction, published and unpublished letters and memoranda, and school and university essays, among many other kinds of text. The spoken part consists of orthographic transcriptions of unscripted informal conversations (recorded by volunteers selected from different ages, regions and social classes in a demographically balanced way) and of spoken language collected in different contexts, ranging from formal business or government meetings to radio shows and phone-ins.
Other (English)
The BNC project was carried out and is managed by the BNC Consortium, an industrial/academic consortium led by Oxford University Press. The other members are the major dictionary publishers Longman (now Pearson Education) and Larousse Kingfisher Chambers; academic research centres at Oxford University Computing Services (OUCS, now IT Services), the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University; and the British Library's Research and Innovation Centre.
Additional details
Related works
- Is described by
- Data paper: http://www.natcorp.ox.ac.uk/corpus/index.xml?ID=intro (URL)