Message 137602 - Python tracker

Message137602

Author	vstinner
Recipients	cdqzzy, ezio.melotti, hyeshik.chang, lemburg, python-dev, terry.reedy, vstinner
Date	2011-06-03.22:20:01
Message-id	<1307139604.35.0.517194006092.issue12016@psf.upfronthosting.co.za>

Content

cjk_decode.patch:
 - patch *all* CJK decoders to replace only the first byte of an invalid byte sequence with U+FFFD. Example from the issue title: b'\xff\n'.decode('gb2312', 'replace') now gives '�\n' instead of just '�'
 - add at least one unit test for *each* path in the decoders (it was sometimes really hard to see how to reach a specific path, especially in the johab decoder!)
 - add test cases for euc_jis_2004 and shift_jis_2004
 - factor the "codec tests" (codectests) common to all Japanese EUC tests into a shared euc_commontests
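
The behaviour described in the first item can be checked with a short snippet. This is only a sketch, assuming an interpreter that already includes the patch. The names `data` and `decoded` are illustrative and are not taken from cjk_decode.patch.

```python
# Invalid lead byte 0xff followed by a valid ASCII newline
# (the example from the issue title).
data = b'\xff\n'

# With the patch applied, only the invalid byte is replaced by U+FFFD
# and the newline that follows it is preserved. Without the patch the
# newline was consumed together with the invalid byte.
decoded = data.decode('gb2312', 'replace')

# ascii() makes the replacement character visible as an escape sequence.
print(ascii(decoded))
```

On an unpatched interpreter the same call would return only the replacement character, which is the bug reported in the issue title.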

Because I consider this issue a bug, I would like to apply this patch to 2.7, 3.2 and 3.3.

History
Date	User	Action	Args
2011-06-03 22:20:04	vstinner	set	recipients: + vstinner, lemburg, terry.reedy, hyeshik.chang, ezio.melotti, python-dev, cdqzzy
2011-06-03 22:20:04	vstinner	set	messageid: <1307139604.35.0.517194006092.issue12016@psf.upfronthosting.co.za>
2011-06-03 22:20:03	vstinner	link	issue12016 messages
2011-06-03 22:20:03	vstinner	create