pattern matching
John S
jstrickler at gmail.com
Thu Feb 24 07:29:46 EST 2011
More information about the Python-list mailing list
On Feb 23, 9:11 pm, monkeys paw <mon... at joemoney.net> wrote:
> if I have a string such as '<td>01/12/2011</td>' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a YYYYDDMM format?
>
> I have:
>
> import re
>
> test = re.compile('\d\d\/')
> f = open('test.html')  # This file contains the html dates
> for line in f:
>     if test.search(line):
>         # I need to pull the date components here

What you need are parentheses, which capture part of the text you're
matching. Each set of parentheses creates a "group". To get to these
groups, you need the match object which is returned by re.search.
Group 0 is the entire match, group 1 is the contents of the first set
of parentheses, and so forth. If the regex does not match, then
re.search returns None.

DATA FILE (test.html):

    <table>
    <tr><td>David</td><td>02/19/1967</td></tr>
    <tr><td>Susan</td><td>05/23/1948</td></tr>
    <tr><td>Clare</td><td>09/22/1952</td></tr>
    <tr><td>BP</td><td>08/27/1990</td></tr>
    <tr><td>Roger</td><td>12/19/1954</td></tr>
    </table>

CODE:

    import re

    rx_test = re.compile(r'<td>(\d{2})/(\d{2})/(\d{4})</td>')

    f = open('test.html')

    for line in f:
        m = rx_test.search(line)
        if m:
            new_date = m.group(3) + m.group(1) + m.group(2)
            print "raw text: ",m.group(0)
            print "new date: ",new_date
            print

OUTPUT:

    raw text: <td>02/19/1967</td>
    new date: 19670219

    raw text: <td>05/23/1948</td>
    new date: 19480523

    raw text: <td>09/22/1952</td>
    new date: 19520922

    raw text: <td>08/27/1990</td>
    new date: 19900827

    raw text: <td>12/19/1954</td>
    new date: 19541219
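The code in this post uses Python 2 print statements. As a rough sketch only, the same capture-group idea might look like this in Python 3. The function name reformat_dates and the in-memory sample list are made up for illustration and are not part of the original post; the regex is the one from the reply, and the input dates are assumed to be MM/DD/YYYY.

```python
import re

# Same pattern as in the reply: three capture groups (month, day, year).
RX_DATE = re.compile(r'<td>(\d{2})/(\d{2})/(\d{4})</td>')


def reformat_dates(lines):
    """Return a YYYYMMDD string for each line holding an MM/DD/YYYY cell.

    Hypothetical helper, assumed for illustration only.
    """
    results = []
    for line in lines:
        m = RX_DATE.search(line)
        if m is None:
            # re.search returns None when the pattern does not match.
            continue
        # m.groups() returns groups 1..3 as a tuple, in pattern order.
        month, day, year = m.groups()
        results.append(year + month + day)
    return results


# Stand-in for the lines of test.html (assumed sample data).
sample = [
    '<table>',
    '<tr><td>David</td><td>02/19/1967</td></tr>',
    '<tr><td>Susan</td><td>05/23/1948</td></tr>',
    '</table>',
]

for new_date in reformat_dates(sample):
    print(new_date)
```

Unpacking m.groups() is equivalent to calling m.group(1), m.group(2) and m.group(3) one at a time; it just gives the pieces names. To read from the real file, passing an open file object in place of sample should work, since the function only iterates over lines.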