Regular expressions in python
Harvey Thomas
hst at empolis.co.uk
Wed Jul 3 11:53:09 EDT 2002
More information about the Python-list mailing list
Graeme Longman [mailto:glongman at ilangua.com] wrote:
> Hi,
>
> I'm using the python module re to search through strings of html text
> but I have found that it is taking too long using the search method.
>
> I am looping through a list of regular expressions and I find that it
> takes much longer when no match is found for the expression than it does
> when a match is found. Is this normal?
>
> I have fixed the problem for now by using string.find() before searching
> the text but was wondering if anyone had any ideas on a better
> technique.
>
> Is there something else I should be using? I am using '.*' and
> re.DOTALL in my expressions but that doesn't seem to be the problem.
>
> Thanks for any help in advance.
>
> Graeme
>

I've found it is MUCH faster if you combine your list of regular expressions into a single pattern: wrap each expression in parentheses, join them with | (use re.VERBOSE as well, which lets you lay the pattern out one alternative per line) and then use re.findall. That way you get one giant list of tuples, one tuple per match, with the groups for the non-matching expressions returning the empty string.

HTH

Harvey
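A minimal sketch of the combined-pattern approach, for illustration only. The three patterns and the sample HTML are made up, not taken from Graeme's code, and the sketch assumes none of the sub-expressions contains capturing groups of its own (extra groups would add extra slots to each tuple). Because re.VERBOSE ignores unescaped whitespace and treats # as a comment start, patterns containing literal spaces or # would need escaping first.

```python
import re

# Hypothetical patterns, standing in for the real list of expressions.
patterns = [
    r"<title>.*?</title>",
    r"<h1>.*?</h1>",
    r"<blink>.*?</blink>",
]

# Wrap each expression in a capturing group and join the groups with |.
# With re.VERBOSE the newlines and spaces used for layout are ignored.
combined = re.compile(
    "\n| ".join("(%s)" % p for p in patterns),
    re.DOTALL | re.VERBOSE,
)

html = "<html><title>Spam</title><body><h1>Eggs</h1></body></html>"

# One pass over the text. Each match yields a tuple with one slot per
# group; slots for alternatives that did not match are empty strings.
matches = combined.findall(html)

# Sort the hits back into one list per original pattern.
hits = [[] for _ in patterns]
for groups in matches:
    for index, text in enumerate(groups):
        if text:
            hits[index].append(text)

for pattern, found in zip(patterns, hits):
    print(pattern, "->", found)
```

The text is scanned once instead of once per expression, so an expression that never matches no longer costs a separate full pass over the string.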