DOM with HTML
Alessio Pace
puccio_13 at yahoo.it
Tue Jul 1 04:32:38 EDT 2003
More information about the Python-list mailing list
Hi,

I need to get a sort of DOM from an HTML page that is declared as XHTML but unfortunately is *not* valid XHTML. If I try to parse it with xml.dom.minidom I get an error from expat (as I expected), so I was told to try it this way, with a "forgiving" HTML parser:

    from xml.dom.ext.reader import HtmlLib
    reader = HtmlLib.Reader()
    dom = reader.fromUri(url)  # 'url' is the web page

FIRST ISSUE: From reading the source code in $MY_PYTHON_INSTALLATION_DIR/site-packages/_xmlplus/dom/ext/reader/, it seems to me that these are 4DOM APIs, so from what I know of Python distributions they are extra packages. Is that right? I would like to use *only* libraries that ship with the Python 2.2 suite, not any extras.

SECOND ISSUE: If the above libraries were included in Python (so that I could keep using them), how do I print a string representation of a (sub)tree of the DOM? I tried .toxml() as in the XML tutorial, but that method does not exist for the FtNode objects involved here...

Any ideas? Thanks so much to whoever can help me.

--
bye
Alessio Pace
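
One possible way to get what the message asks for using only standard-library modules is sketched below. It is not taken from the thread and makes several assumptions: it is written for a current Python 3, where the tolerant tokenizer lives in html.parser (Python 2.2 shipped a similar module named HTMLParser); the names ForgivingDomBuilder, parse_html and VOID_TAGS are hypothetical; and the tree is hung under an artificial "fragment" element so that stray top-level text does not upset minidom. Because the nodes are ordinary minidom nodes, .toxml() works on any subtree. The recovery rule is deliberately simple (an end tag closes the nearest open element of the same name), so this is nowhere near a full HTML tree builder.

```python
from html.parser import HTMLParser
from xml.dom import minidom

# Elements that never take a closing tag in HTML (partial list, for illustration).
VOID_TAGS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}


class ForgivingDomBuilder(HTMLParser):
    """Hypothetical helper: builds a minidom tree from possibly invalid HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.document = minidom.Document()
        # Artificial wrapper element (an assumption of this sketch).
        self.root = self.document.createElement("fragment")
        self.document.appendChild(self.root)
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        element = self.document.createElement(tag)
        for name, value in attrs:
            # Valueless attributes (e.g. "checked") arrive as None.
            element.setAttribute(name, value if value is not None else "")
        self.stack[-1].appendChild(element)
        if tag not in VOID_TAGS:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, implicitly closing
        # anything left open inside it; ignore end tags that match nothing.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tagName == tag:
                del self.stack[index:]
                return

    def handle_data(self, data):
        self.stack[-1].appendChild(self.document.createTextNode(data))


def parse_html(text):
    """Return the wrapper element holding the parsed tree."""
    builder = ForgivingDomBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


# Unquoted attribute, bare <br>, and an unclosed <b>: not valid XHTML.
sample = "<html><body><p class=intro>Hello<br>world <b>bold</p><p>Second</p></body></html>"
root = parse_html(sample)
first_p = root.getElementsByTagName("p")[0]
print(first_p.toxml())
```

For the 4DOM objects returned by HtmlLib, the PyXML package is understood to provide Print and PrettyPrint functions in xml.dom.ext for serializing a node; that is worth checking against the installed version, and it does not remove the dependency on the extra package.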