Most efficient method to search text?
Bengt Richter
bokr at oz.net
Fri Oct 18 14:32:20 EDT 2002
More information about the Python-list mailing list
Fri Oct 18 14:32:20 EDT 2002
- Previous message (by thread): Most efficient method to search text?
- Next message (by thread): Most efficient method to search text?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Fri, 18 Oct 2002 10:03:10 GMT, Michael Hudson <mwh at python.net> wrote: >bokr at oz.net (Bengt Richter) writes: > >> On Thu, 17 Oct 2002 11:14:09 GMT, Michael Hudson <mwh at python.net> wrote: >> >> >Tim Peters <tim.one at comcast.net> writes: >> > >> >> Especially for purposes of building lexers, it might be useful if the re >> >> package could recognize when a DFA approach was sufficient and practical, >> >> and switch to a different scheme entirely then. Or it might not. Build >> >> code to try it both ways, and let us know how it turns out ... >> > >> >Indeed, my code canes re when the wordlist gets long. Here's some >> >> If this is easy to add to your test harness, I'd be interested to see what >> this search does in comparison, with the longer word lists (it's probably >> faster than has_word.py that I posted elsewhere, depending on relative lengths >> of word lists and strings, and internal vs external looping and allocation. > >It's quick: > >>>> robin.do_comp(1000) >compile... 3.42289197445 >compile2... 2.49848008156 >compile3... 0.696313977242 >compile4... 4.04265594482 >compile_re... 0.627331018448 >compile_bengt... 0.00175499916077 > >test... 1.39854204655 >test2... 2.93543899059 >test3... 3.2231388092 >test4... 2.15867292881 >test_re... 8.38554108143 >test_bengt... 0.437232971191 > >I'm sure Tim once said something along the lines of "Python doesn't >give much advice for getting good performance, beyond a not-so-subtle >hint to exploit dicts for all they're worth" but I can't find it now. > Thanks. I wonder what would happen if you used word lists and files guaranteed *not* to have any hits, so they'd all be searched to the end. Or what was your mix (no of search words, length of files, hit distribution)? >I'm sure Tim once said something along the lines of "Python doesn't >give much advice for getting good performance, beyond a not-so-subtle >hint to exploit dicts for all they're worth" but I can't find it now. > There's probably a lot to learn from the dictionary implementation code ... one of these days ;-) >Cheers, >M. >hmm, look at that sig... > >-- > Premature optimization is the root of all evil. > -- Donald E. Knuth, Structured Programming with goto Statements Premature celebration is a related problem ;-) Regards, Bengt Richter
- Previous message (by thread): Most efficient method to search text?
- Next message (by thread): Most efficient method to search text?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Python-list mailing list