Python slow for filter scripts
Bengt Richter
bokr at oz.net
Wed Oct 29 22:36:07 EST 2003
On 29 Oct 2003 06:34:22 GMT, William Park <opengeometry at yahoo.ca> wrote:

>Alex Martelli <aleax at aleax.it> wrote:
>> and I'm specifically reading the King James' Bible (an easily
>> available text so you can reproduct my results!) and writing
>
>Can you post URL for the Bible?
>
Try Project Gutenberg, at
    http://www.gutenberg.net/
or their new host at
    http://www.ibiblio.org/gutenberg/

They have a number of bibles in various languages, and a ton (>10,000 e-texts)
of other stuff, also some audio texts, apparently.

BTW I read somewhere that the BBC is going to make all their archives, video and audio,
freely available on the net, except where there is some legal reason they can't.
I guess they're a kind of FEF -- Free Entertainment Foundation (thank you British telly owners ;-)

Apparently a new King James e-text is at (long URL, or use their search for "bible"
(w/o quotes) and go to entry #16):

    http://www.ibiblio.org/gutenberg/cgi-bin/sdb/t9.cgi?entry=30&full=yes&ftpsite=http://www.ibiblio.org/gutenberg/

They also have the Koran, BTW. It's interesting to compare word frequencies, e.g.,
the 20 most frequent (unless I goofed) in the texts I downloaded:

"C:\Info\Linguistics\Gutenberg\bible\bible11.txt"
  6647: 'LORD'
  6649: 'him'
  6856: 'is'
  6893: 'be'
  6971: 'they'
  7249: 'for'
  7972: 'a'
  8388: 'his'
  8854: 'I'
  8940: 'unto'
  9666: 'he'
  9760: 'shall'
 12353: 'in'
 12592: 'that'
 12846: 'And'
 13429: 'to'
 34472: 'of'
 38891: 'and'
 62135: 'the'

"C:\Info\Linguistics\Gutenberg\koran\koran10.txt"
  1739: 'ye'
  1752: 'with'
  1956: 'And'
  1979: 'for'
  1991: 'who'
  2037: 'be'
  2108: 'not'
  2186: 'that'
  2254: 'shall'
  2366: 'them'
  2575: 'a'
  2644: 'they'
  2799: 'is'
  2900: 'in'
  3320: 'God'
  5144: 'to'
  6855: 'of'
  6896: 'and'
 10982: 'the'

Both start with the-and-of-to ;-)

(I hope this does not offend anyone ;-)

Regards,
Bengt Richter
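
For anyone who wants to try a similar comparison, here is a minimal sketch of a word-frequency count in modern Python. It is not the script used for the numbers above (that script was not posted). The names top_words and format_report, the letters-only tokenizing rule, and the use of collections.Counter are all assumptions for illustration, so the counts it gives on the same e-texts may differ from the ones listed.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, kept case-sensitive so that
# 'And' and 'and' are counted separately, as in the lists above.
WORD_RE = re.compile(r"[A-Za-z]+")

def top_words(text, n=20):
    """Return the n most frequent words as (word, count) pairs,
    least frequent first, so the biggest count ends up at the bottom."""
    counts = Counter(WORD_RE.findall(text))
    return counts.most_common(n)[::-1]

def format_report(pairs):
    """Format (word, count) pairs as right-aligned 'count: word' lines."""
    return "\n".join(f"{count:>6}: {word!r}" for word, count in pairs)

if __name__ == "__main__":
    # A tiny in-memory sample; for a real run, read a downloaded e-text
    # into a string and pass that in instead.
    sample = "the cat and the dog and the bird"
    print(format_report(top_words(sample, 2)))
```

Keeping the tokenizer in one regular expression makes it easy to change what counts as a word (for example, folding case or allowing apostrophes) and see how much the ranking shifts.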