htmllib.py and parsing malformed HTML
KC
nskhcarlso at bellsouth.net
Mon Sep 1 22:15:51 EDT 2003
More information about the Python-list mailing list
I have written a parser using htmllib.HTMLParser, and it works fine unless the HTML is malformed. For example, in some instances the provider of the HTML leaves out the <TR> tags but includes the </TR> tags. Apparently htmllib, or more likely sgmllib underneath it, does not handle an end tag if no corresponding start tag was found. Does anyone know a way to "fool" the parser into handling the end tag when a start tag was not found?

Thanks,
Kevin
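To make the failure case concrete, here is a minimal sketch of one possible workaround. It is not from the original post and makes several assumptions: it uses Python 3's html.parser as a stand-in (htmllib and sgmllib no longer exist in Python 3), the class name RowCollector and its helper methods are hypothetical, and it assumes the goal is to collect table cells row by row. The idea is to treat a <TD> that arrives with no open row as an implicit <TR>, so the stray </TR> still closes a row.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Hypothetical helper: collect table rows, tolerating a missing <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being built, or None
        self._cell = None   # text fragments of the open cell, or None

    def _open_row(self):
        if self._row is None:
            self._row = []

    def _close_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    # html.parser passes tag names in lower case.
    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()   # an unclosed previous row ends here
            self._open_row()
        elif tag in ("td", "th"):
            self._open_row()    # implicit <tr> when the start tag is missing
            self._close_cell()
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


# The first row is missing its <TR>, as described above.
parser = RowCollector()
parser.feed("<TABLE><TD>a</TD><TD>b</TD></TR><TR><TD>c</TD></TR></TABLE>")
parser.close()
print(parser.rows)
```

Tracking the open row in the handler, rather than relying on the parser's own tag matching, is the design choice here: the row boundary is decided by whichever of <TR>, <TD>, or </TR> shows up first. A row that is never closed before the end of the input would be dropped by this sketch.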