Regex Question
Frank Koshti
frank.koshti at gmail.com
Sat Aug 18 16:18:52 EDT 2012
More information about the Python-list mailing list
On Aug 18, 12:22 pm, Jussi Piitulainen <jpiit... at ling.helsinki.fi> wrote:
> Frank Koshti writes:
>
> > not always placed in HTML, and even in HTML, they may appear in
> > strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue
> > is I need to match, process and replace $foo(x=3), knowing that
> > (x=3) is optional, and the token might appear simply as $foo.
>
> > To do this, I decided to use:
>
> > re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> > the issue with this is it doesn't match $foo by itself, and requires
> > there to be () at the end.
>
> Adding a ? after the meant-to-be-optional expression would let the
> regex engine know what you want. You can also separate the mandatory
> and the optional part in the regex to receive pairs as matches. The
> test program below prints this:
>
> >$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm
> ('$foo', '')
> ('$foo', '(bar=3)')
> ('$foo', '($)')
> ('$foo', '')
> ('$bar', '(v=0)')
>
> Here is the program:
>
> import re
>
> def grab(text):
>     p = re.compile(r'([$]\w+)([(][^()]+[)])?')
>     return re.findall(p, text)
>
> def test(html):
>     print(html)
>     for hit in grab(html):
>         print(hit)
>
> if __name__ == '__main__':
>     test('>$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm')

You read my mind. I didn't even know that's possible. Thank you-
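
For the "process and replace" part of the original question, the same pattern can probably be handed to re.sub with a function as the replacement. The following is only a minimal sketch: the names TOKEN, expand and replace_tokens are hypothetical, and the bracketed output format is an assumption chosen for illustration, not something from the thread.

```python
import re

# Same pattern as in the quoted reply: a mandatory $name followed by an
# optional, non-nested (...) argument list.
TOKEN = re.compile(r'([$]\w+)([(][^()]+[)])?')

def expand(match):
    # Hypothetical processing step: wrap the token in square brackets.
    name = match.group(1)[1:]    # drop the leading '$'
    args = match.group(2) or ''  # group(2) is None when the (...) part is absent
    return '[{0}{1}]'.format(name, args)

def replace_tokens(text):
    # re.sub calls expand() once per match and splices in its return value.
    return TOKEN.sub(expand, text)

if __name__ == '__main__':
    print(replace_tokens('<h1 $foo(x=3)>Hello</h1> and $foo'))
```

Because the optional group is None when it does not take part in a match, the callback has to handle that case explicitly, which is what the `or ''` does here.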