Most efficient method to search text?
Michael Hudson
mwh at python.net
Thu Oct 17 07:14:09 EDT 2002
More information about the Python-list mailing list
Thu Oct 17 07:14:09 EDT 2002
- Previous message (by thread): Most efficient method to search text?
- Next message (by thread): Most efficient method to search text?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Tim Peters <tim.one at comcast.net> writes: > Especially for purposes of building lexers, it might be useful if the re > package could recognize when a DFA approach was sufficient and practical, > and switch to a different scheme entirely then. Or it might not. Build > code to try it both ways, and let us know how it turns out ... Indeed, my code canes re when the wordlist gets long. Here's some numbers (do_comp(n) builds a list of n words composed of ten random ascii characters and searches the Python standard library for them using first my code, then a re-based approach): >>> import robin >>> robin.do_comp(10) 2.275832057 0.317215919495 >>> robin.do_comp(100) 2.2694439888 0.980538964272 >>> robin.do_comp(1000) 2.36677598953 8.53031408787 >>> robin.do_comp(2000) 2.34253299236 16.9222760201 So roughly linear behaviour from re, n-independent behaviour from my code. Isn't it nice when things do what you expect? On the other hand, the "compiler" code I posted yesterday consumes vast gobs of memory -- trying to compile a 10000 long list of random words got the OOM killer into action. A version using lists seems slightly better in this respect, though slower in execution. Cheers, M. -- Have you considered downgrading your arrogance to a reasonable level? -- Erik Naggum, comp.lang.lisp, to yet another C++-using troll
- Previous message (by thread): Most efficient method to search text?
- Next message (by thread): Most efficient method to search text?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Python-list mailing list