Module brown
Read tokens from the Brown Corpus.
Brown Corpus: A Standard Corpus of Present-Day Edited American
English, for use with Digital Computers, by W. N. Francis and H. Kucera
(1964), Department of Linguistics, Brown University, Providence, Rhode
Island, USA. Revised 1971, Revised and Amplified 1979. Distributed with
NLTK with the permission of the copyright holder. Source:
http://www.hit.uib.no/icame/brown/bcm.html
The Brown Corpus is divided into the following files:
a. press: reportage
b. press: editorial
c. press: reviews
d. religion
e. skill and hobbies
f. popular lore
g. belles-lettres
h. miscellaneous: government & house organs
j. learned
k. fiction: general
l. fiction: mystery
m. fiction: science
n. fiction: adventure
p. fiction: romance
r. humor
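The file letters and category names listed above can be collected into a lookup table. The following is a minimal sketch transcribed from that list, not the module's own `item_name` value (which this page shows only in truncated form). The names `BROWN_CATEGORIES` and `describe_files` are hypothetical and chosen for illustration.

```python
# Category letters and names transcribed from the list above.
# BROWN_CATEGORIES and describe_files are hypothetical names for this
# sketch; they are not part of the brown module.
BROWN_CATEGORIES = {
    "a": "press: reportage",
    "b": "press: editorial",
    "c": "press: reviews",
    "d": "religion",
    "e": "skill and hobbies",
    "f": "popular lore",
    "g": "belles-lettres",
    "h": "miscellaneous: government & house organs",
    "j": "learned",
    "k": "fiction: general",
    "l": "fiction: mystery",
    "m": "fiction: science",
    "n": "fiction: adventure",
    "p": "fiction: romance",
    "r": "humor",
}


def describe_files(files=None):
    """Return (letter, category name) pairs for the requested file letters.

    With no argument, every known letter is returned in alphabetical order.
    """
    if files is None:
        files = sorted(BROWN_CATEGORIES)
    unknown = [f for f in files if f not in BROWN_CATEGORIES]
    if unknown:
        raise ValueError("unknown Brown Corpus file letters: " + ", ".join(unknown))
    return [(f, BROWN_CATEGORIES[f]) for f in files]


for letter, name in describe_files(["a", "j", "r"]):
    print(letter, name)
```

Note that the letters i, o, and q do not appear in the list, so a check like the one in `describe_files` can catch a mistyped file letter early.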
raw(files=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'l', 'm', '...)
tagged(files=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'l', 'm', '...)
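This page gives no docstrings for `raw` and `tagged`, so their exact return values are not documented here. As a rough illustration of the difference between untagged and tagged text, the sketch below assumes the `word/tag` token convention used in the Brown Corpus files, with tokens separated by whitespace. The helpers `split_tagged` and `strip_tags` are hypothetical, and the sample line is made up for illustration; none of this is the module's implementation.

```python
def split_tagged(line):
    """Split a whitespace-separated line of word/tag tokens into (word, tag) pairs.

    Assumes the word/tag convention; this is an illustrative helper,
    not a function of the brown module.
    """
    pairs = []
    for token in line.split():
        # Split on the last slash so that words containing "/" stay intact.
        word, sep, tag = token.rpartition("/")
        if not sep:
            # No slash at all: treat the whole token as a word with no tag.
            word, tag = token, None
        pairs.append((word, tag))
    return pairs


def strip_tags(line):
    """Return only the words of a tagged line, discarding the tags."""
    return [word for word, _ in split_tagged(line)]


sample = "The/at jury/nn said/vbd ./."  # illustrative sample, not quoted from the corpus
print(split_tagged(sample))
print(strip_tags(sample))
```

Splitting on the last slash rather than the first is a deliberate choice here: a token such as `1/2/cd` then yields the word `1/2` and the tag `cd`.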
items = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'l'...
item_name = {'a': 'press: reportage', 'b': 'press: editorial', ...
items
- Value:
['a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
...
item_name
- Value:
{'a': 'press: reportage',
 'b': 'press: editorial',
 'c': 'press: reviews',
 'd': 'religion',
 'e': 'skill and hobbies',
 'f': 'popular lore',
 'g': 'belles-lettres',
 'h': 'miscellaneous: government & house organs',
...