An XML parser is an XML parser. Period.
Uche Ogbuji
uche at ogbuji.net
Thu Feb 12 09:29:19 EST 2004
More information about the Python-list mailing list
Peter Hansen:
> Hmm... makes me want to check their web site, to see what this is really
> about:
>
> '''RXP is a very fast validating XML parser written by Richard Tobin of
> the University of Edinburgh. It complies fully with the W3C test suites
> (although we have compiled it without Unicode support for the time being).
> We would like to thank Richard Tobin and Henry Thompson of the Language
> Technology Group for making this code available to the world.
> '''
>
> Seems pretty self-explanatory to me. Might even be why, when I downloaded
> and tried to use it (and got good results) a year or two ago, I had no
> qualms about using it. Clearly stated, and to the point, except that one
> is left to make the small connection between "compiled without Unicode
> support" and "doesn't handle character entities". (Or is it that it
> handles character entities, but not those beyond 127? Probably moot.)
>
> Doesn't this imply that anyone, at any time, could choose to recompile
> *with* Unicode support, which is presumably _in place_ but just optionally
> left out of the standard distribution?
>
> So it's neither a bug, nor a design decision, but a packaging choice.
>
> I think I'm back to saying that "not an XML parser!!!!" is a bit of an
> unfair reaction, given how open they are about the situation.

*sigh*. I don't know how many more times and ways I can say this. One more
time and I'm done unless a new, salient point comes up.

There *is* a packaging of PyRXP that is XML compliant. It's called PyRXPU.
It is precisely a build of PyRXP with Unicode support, plus output of
Unicode objects in the resulting data structure (which is my recommendation
for XML processing).

So once more: AFAICT PyRXPU is an XML parser. PyRXP is certainly not an XML
parser.
The substrate RXP is not an XML parser either when compiled without Unicode
support. Although I respect Thompson and Tobin as much as I do the PyRXP
developers, they were really confusing themselves and others when they said
"It complies fully with the W3C test suites (although we have compiled it
without Unicode support for the time being)."

Several times early on, when this issue was brought up, the PyRXP developers
in effect said approximately: We need it to be fast, so we won't be doing
anything to make it conformant because we know doing so would slow it down.

This is a pretty poisonous attitude when claiming to support a standard, and
what makes this even worse is that the PyRXP Web page starts out saying:

"...PyRXP...the fastest validating XML parser available for Python, and
quite possibly anywhere :-)."

It then goes on to justify that statement with a "benchmark" of PyRXP
against other XML parsers, without mentioning the inconvenient fact that
PyRXP is *not* an XML parser, and that building it so that it is would drop
it in the benchmarks somewhat. (Not that I know who should really care,
because unless you're using 4DOM or minidom all the options are in the same
order of magnitude: if you want to wring out the last odd drop of CPU--and
you probably don't need to--then you should be using neither XML nor Python.)

Are you seriously telling me that in the face of all this, my criticism,
strongly worded as it is, is unfair?

My main aim here is to make it well known that PyRXP is not an XML parser.
It won't trouble me if people continue to use it as currently packaged. I
just want to make sure they know they are not using what they may think they
are.

Once again: PyRXPU (contributed, tellingly, by someone outside the PyRXP core
team) is the right build of PyRXP if you need an XML parser. The bad news is
that it's only available from ReportLab CVS.
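
A minimal sketch of the kind of check the Unicode point above comes down to: feed a parser a document containing a character reference above 127 and a literal non-ASCII character, and see whether both come back intact as Unicode. The sample document, `handles_unicode`, and `etree_parse` are hypothetical names made up for illustration, and the standard library's `xml.etree.ElementTree` is used only as a stand-in parser; PyRXP and PyRXPU themselves are not exercised here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one numeric character reference above 127 (U+20AC)
# and one literal non-ASCII character (U+00E9), encoded as UTF-8 bytes.
SAMPLE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<price note="caf\u00e9">&#x20AC;10</price>'
).encode("utf-8")


def handles_unicode(parse):
    """Return True if parse(bytes) -> (text, attribute) keeps both characters."""
    text, note = parse(SAMPLE)
    return text == "\u20ac10" and note == "caf\u00e9"


def etree_parse(data):
    # Stand-in adapter; a different parser would need its own adapter
    # that returns the element text and the "note" attribute value.
    root = ET.fromstring(data)
    return root.text, root.get("note")


print(handles_unicode(etree_parse))
```

Any other parser could be dropped in by writing a small adapter with the same return shape. This is only a spot check on two characters, not a substitute for running a conformance test suite.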
My article is now out and includes details for obtaining PyRXPU:

http://www.xml.com/pub/a/2004/02/11/py-xml.html

--Uche
http://uche.ogbuji.net