Reg Exp: Need advice concerning "greediness"
Franz GEIGER
fgeiger at datec.at
Wed Oct 4 09:57:30 EDT 2000
More information about the Python-list mailing list
> I used a negated character class to force an end for the first group before
> a possible COLOR tag. Otherwise, what I think is happening is that your

That did the trick.

> is included into it. BTW, I changed that '*' to '?', which is what you meant,
> if I read correctly.

Yes. As fascinating as regular expressions are, they are not always easy to
understand and use, especially for newbies.

Thanks a lot and best regards
Franz

Calvelo Daniel <dcalvelo at pharion.univ-lille2.fr> wrote in news message
8r9nc3$8re$1 at netserv.univ-lille1.fr...
> Franz GEIGER <fgeiger at datec.at> wrote:
> : Hello all,
>
> : I want to exchange font colors of headings of a certain level in HTML files.
>
> : I have a line containing a heading level 1, e.g.: <h1><font
> : COLOR="#FF0000">Heading Level 1</font></h1>.
>
> : Now I want to split this into 3 groups: Everything before "COLOR=xyz",
> : "COLOR=xyz" itself, and everything after "COLOR=xyz".
>
> : I tried:
> : sRslt = "<h1><font COLOR="#FF0000">Heading Level 1</font></h1>";
> : print re.findall(re.compile(r'(.*?FONT.*?)(COLOR=.*?)*([ |>].*)', re.I |
> : re.S), sRslt);
>
> Beware of quotes in your example:
>
> >>> sRslt = "<h1><font COLOR="#FF0000">Heading Level 1</font></h1>"
> >>> sRslt
> '<h1><font COLOR='
>
> (That explains weird results reported here)
>
> As for your regexp, the following works:
>
> >>> print re.findall(re.compile(r'(.*?FONT[^">]+?)(COLOR=.*?)?([ |>].*)', re.I | re.S), sRslt);
> [('<h1><font ', 'COLOR="#FF0000"', '>Heading Level 1</font></h1>')]
>
> I used a negated character class to force an end for the first group before
> a possible COLOR tag. Otherwise, what I think is happening is that your
> non-greedy search is indeed non-greedy, but the null-match of '(COLOR=.*?)*'
> is included into it. BTW, I changed that '*' to '?', which is what you meant,
> if I read correctly.
>
> HTH, DCA
>
> --
> Daniel Calvelo Aros
> calvelo at lifl.fr
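
For anyone who wants to replay the thread, here is a small sketch that runs both patterns side by side. It assumes a current Python 3 interpreter rather than the 2000-era one used above, and the names line, original and fixed are made up for this illustration. The test string is wrapped in single quotes so that the double quotes inside the HTML do not end the literal early.

```python
import re

# Single quotes outside, so the double quotes (and the '#') inside the HTML
# stay part of the string instead of ending it and starting a comment.
line = '<h1><font COLOR="#FF0000">Heading Level 1</font></h1>'

# Pattern from the original question: the lazy '.*?' after FONT matches
# nothing, '(COLOR=.*?)*' is satisfied by zero repetitions, and the space
# before COLOR is then taken by '[ |>]', so group 2 stays empty.
original = re.compile(r'(.*?FONT.*?)(COLOR=.*?)*([ |>].*)', re.I | re.S)

# Pattern from the reply: '[^">]+?' must consume at least one character
# after FONT (here the space), so the optional COLOR group gets its chance
# to match before '[ |>]' is tried.
fixed = re.compile(r'(.*?FONT[^">]+?)(COLOR=.*?)?([ |>].*)', re.I | re.S)

original_result = original.findall(line)
fixed_result = fixed.findall(line)

print(original_result)
print(fixed_result)
```

Note that inside a character class the '|' is a literal pipe character, not an alternation, so '[ |>]' matches a space, a pipe, or a '>'.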