Issue 4997: xml.sax.saxutils.XMLGenerator should write to io.RawIOBase.


Created on 2009-01-19 10:35 by kawai, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name         Uploaded                  Description
xmlgen2.patch     craigh, 2009-07-22 22:53
xmlgen-doc.patch  craigh, 2009-07-22 22:54
Messages (8)
msg80155 - Author: HiroakiKawai (kawai) Date: 2009-01-19 10:35
xml.sax.saxutils.XMLGenerator._write checks its argument with
isinstance(text, str), which is problematic in Python 3.0.
XMLGenerator accepts an encoding and the file it produces is encoded with
that encoding, i.e., the XML is a binary sequence. So IMHO, the XMLGenerator
constructor argument should be an instance of a subclass of io.RawIOBase.
msg90823 - Author: Craig Holmquist (craigh) Date: 2009-07-22 21:43
To clarify the specific problem:

- If the file object passed to XMLGenerator is opened in binary mode,
XMLGenerator raises TypeError as soon as it tries to write to it.
- If the passed file object is opened in text mode, XMLGenerator writes
the prescribed encoding to the XML declaration, but it actually uses the
file object's encoding for everything it writes.
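
Both symptoms can be reproduced with the io module alone, independent of which XMLGenerator version is installed. The following is a minimal sketch, not code from the report: the sample element, the utf-8 stream encoding, and all variable names are assumptions chosen for illustration.

```python
import io

# Symptom 1: a binary stream rejects str, so a writer that passes str
# to write() fails on its first call.
binary_out = io.BytesIO()
try:
    binary_out.write("<name/>")
    binary_write_failed = False
except TypeError:
    binary_write_failed = True

# Symptom 2: a text stream encodes with its own encoding, whatever the
# XML declaration claims. Here the stream is utf-8 (an assumption) while
# the declaration says iso-8859-1.
raw = io.BytesIO()
text_out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
text_out.write('<?xml version="1.0" encoding="iso-8859-1"?>\n')
text_out.write("<name>caf\u00e9</name>")
text_out.flush()
data = raw.getvalue()

# A reader that trusts the declaration decodes the bytes incorrectly.
as_declared = data.decode("iso-8859-1")
as_written = data.decode("utf-8")
```

Decoding with the declared encoding turns the two-byte utf-8 sequence for the accented character into two unrelated characters, which is the mismatch described in the second point.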
msg90826 - Author: Craig Holmquist (craigh) Date: 2009-07-22 22:02
Patch attached.  This patch doesn't actually restrict the output object
to RawIOBase (that wouldn't work well, since files opened as binary are
actually derived from BufferedIOBase).  Instead, it just assumes the
output object has a 'write' method that accepts a single bytes argument.
Also, XMLGenerator no longer needs to check whether the input is str or unicode.
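
The approach described here (encode first, then hand bytes to any object with a 'write' method) might look roughly like the sketch below. This is not the attached patch: the class name BytesXMLWriter, its method names, and the use of the "xmlcharrefreplace" error handler are all assumptions made for illustration. Only escape and quoteattr are real helpers from xml.sax.saxutils.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class BytesXMLWriter:
    """Hypothetical stand-in for illustration, not the patched XMLGenerator."""

    def __init__(self, out, encoding="iso-8859-1"):
        # 'out' only needs a write() method that accepts bytes.
        self._out = out
        self._encoding = encoding

    def _write(self, text):
        # Single choke point: every str is encoded before it is written.
        # Characters the encoding cannot represent become character
        # references (an assumed policy, not taken from the patch).
        self._out.write(text.encode(self._encoding, "xmlcharrefreplace"))

    def start_document(self):
        self._write('<?xml version="1.0" encoding="%s"?>\n' % self._encoding)

    def start_element(self, name, attrs):
        self._write("<" + name)
        for key, value in attrs.items():
            self._write(" %s=%s" % (key, quoteattr(value)))
        self._write(">")

    def characters(self, content):
        self._write(escape(content))

    def end_element(self, name):
        self._write("</%s>" % name)


buf = io.BytesIO()
writer = BytesXMLWriter(buf)
writer.start_document()
writer.start_element("name", {"lang": "fr"})
writer.characters("caf\u00e9")
writer.end_element("name")
result = buf.getvalue()
```

Because the declaration and the payload pass through the same encoding step, the declared encoding and the actual bytes cannot drift apart.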
msg90829 - Author: Craig Holmquist (craigh) Date: 2009-07-22 22:35
Actually, that patch may not work so well either... `out` defaults to
sys.stdout, but that can't accept bytes.
msg90831 - Author: Craig Holmquist (craigh) Date: 2009-07-22 22:53
This new patch removes the "default to stdout" behavior.
msg90832 - Author: Craig Holmquist (craigh) Date: 2009-07-22 22:54
Patch for documentation.
msg91310 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-05 10:59
You shouldn't remove the defaulting behaviour for `out`, but use
`sys.stdout.buffer` instead.
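
A minimal sketch of that suggestion, assuming a helper that picks the output stream: the name resolve_output is hypothetical and is not part of xml.sax.saxutils.

```python
import io
import sys


def resolve_output(out=None):
    """Return a bytes-accepting stream, defaulting to stdout's binary layer.

    Hypothetical helper for illustration; not part of xml.sax.saxutils.
    """
    if out is None:
        # sys.stdout is a text stream; its .buffer attribute is the
        # underlying binary stream, which accepts bytes.
        return sys.stdout.buffer
    return out


# Usage with an explicit binary stream.
buf = io.BytesIO()
resolve_output(buf).write("<name/>".encode("iso-8859-1"))
written = buf.getvalue()
```

One caveat: if sys.stdout has been replaced by an object without a `buffer` attribute (for example an io.StringIO), the default branch raises AttributeError, so a real implementation would need to decide how to handle that case.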

Bonus points if you add a test so that this kind of bug doesn't go
unnoticed again.

PS: it's ironic that the default encoding here is iso-8859-1. This piece
of code is really getting old.
msg165510 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-07-15 07:38
This issue will be fixed by the patch for issue1470548.
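
For comparison, here is a sketch of writing to a binary stream with XMLGenerator itself. It assumes a Python 3 release whose XMLGenerator accepts binary streams; the thread does not say which release first did so. The element name and sample text are illustrative.

```python
import io
from xml.sax.saxutils import XMLGenerator

out = io.BytesIO()  # binary stream, as the original report asks for
gen = XMLGenerator(out, encoding="utf-8")
gen.startDocument()
gen.startElement("name", {})
gen.characters("caf\u00e9")
gen.endElement("name")
gen.endDocument()

data = out.getvalue()  # bytes encoded with the declared encoding
```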
History
Date                 User              Action  Args
2022-04-11 14:56:44  admin             set     github: 49247
2012-10-24 09:45:17  serhiy.storchaka  set     status: open -> closed
2012-08-05 11:14:07  serhiy.storchaka  set     superseder: xml.sax.saxutils.XMLGenerator cannot output UTF-16; resolution: duplicate
2012-07-15 07:38:28  serhiy.storchaka  set     nosy: + serhiy.storchaka; messages: + msg165510
2009-08-05 10:59:32  pitrou            set     nosy: + pitrou; messages: + msg91310
2009-07-22 22:56:08  craigh            set     files: - xmlgen.patch
2009-07-22 22:54:23  craigh            set     files: + xmlgen-doc.patch; messages: + msg90832
2009-07-22 22:53:26  craigh            set     files: + xmlgen2.patch; messages: + msg90831
2009-07-22 22:35:26  craigh            set     messages: + msg90829
2009-07-22 22:02:25  craigh            set     files: + xmlgen.patch; keywords: + patch; messages: + msg90826
2009-07-22 21:43:31  craigh            set     nosy: + craigh; messages: + msg90823
2009-01-19 10:35:57  kawai             create