SGMLParser problem
Gillou
nospam at bigfoot.com
Fri Nov 8 14:10:41 EST 2002
"sanjay" <sanjay2kind at yahoo.com> wrote in message news:63170f57.0211080754.4d398296 at posting.google.com...
> Hi,
>
> Any one has suggestion for following problem. Some word documents
> have been converted to HTML page in Ms-Word. Want to filter html tags
> like..
> <o:p></o:p>,
> <![if !supportEmptyParas]> <![endif]>, etc. I couldn't solve
> using SGMLParser. Shows error like..

I'm not sure that XML namespace notation is compliant with strict SGML. That is most likely the reason for your exception.

As Martin V. Loewis writes, Tidy does a pretty good job of cleaning up the strange MS-Word HTML and removes everything that is not standard HTML 4. You can find it via www.w3.org.

--Gilles
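If running Tidy is not an option, a rough pre-filter can remove the two constructs quoted above before the page is handed to a parser. The following is only a minimal sketch, not what Tidy does and not part of SGMLParser: the function name `strip_word_markup` and both regular expressions are assumptions made up for illustration, and they cover only the `<o:p>`-style namespaced tags and the `<![if ...]>` / `<![endif]>` markers, not every oddity Word emits.

```python
import re

# Word-specific conditional markers, e.g. <![if !supportEmptyParas]> and <![endif]>
# (hypothetical pattern, written for this example only)
CONDITIONAL = re.compile(r"<!\[(?:if[^\]]*|endif)\]>", re.IGNORECASE)

# Opening or closing tags whose name carries a namespace prefix, e.g. <o:p>, </o:p>
# (hypothetical pattern, written for this example only)
NAMESPACED_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*:[^>]*>")


def strip_word_markup(html):
    """Remove Word conditional markers and namespace-prefixed tags.

    The text between the removed tags or markers is kept as-is.
    """
    html = CONDITIONAL.sub("", html)
    html = NAMESPACED_TAG.sub("", html)
    return html


if __name__ == "__main__":
    sample = "<p>Hello<o:p></o:p></p><![if !supportEmptyParas]>&nbsp;<![endif]>"
    print(strip_word_markup(sample))
```

Regular expressions are a blunt tool for HTML, so this is best treated as a cleanup pass ahead of a real parser or Tidy rather than a replacement for either.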