Double replace or single re.sub?
Mike Meyer
mwm at mired.org
Wed Oct 26 09:22:21 EDT 2005
More information about the Python-list mailing list
"Iain King" <iainking at gmail.com> writes: > I have some code that converts html into xhtml. For example, convert > all <i> tags into <em>. Right now I need to do to string.replace calls > for every tag: > > html = html.replace('<i>','<em>') > html = html.replace('</i>','</em>') > > I can change this to a single call to re.sub: > > html = re.sub('<([/]*)i>', r'<\1em>', html) > > Would this be a quicker/better way of doing it? Maybe. You could measure it and see. But neither will work in the face of attributes or whitespace in the tag. If you're going to parse [X]HTML, you really should use tools that are designed for the job. If you have well-formed HTML, you can use the htmllib parser in the standard library. If you have the usual crap one finds on the web, I recommend BeautifulSoup. <mike -- Mike Meyer <mwm at mired.org> http://www.mired.org/home/mwm/ Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.