Thanks for the test case. I reproduced it easily.
There is indeed a real problem in CGI streams.
The first thing to do is to start Python with the -u option (add it to
the end of the first #! line), so that stdin yields bytes instead of
unicode chars, and \r\n is not translated on Windows.
Even then, I noticed that in the multipart/form-data section, text
fields are utf-8 encoded, but the file content is raw binary.
(FWIW, I use Firefox and Apache on Windows.)
No encoding seems to be specified, either in the content or in the
environment (no HTTP_TRANSFER_ENCODING).
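
To illustrate the mixed encoding described above, here is a minimal sketch
using a hand-built multipart/form-data body. The boundary, field names, file
bytes and helper names are all made up for illustration; this is not the
actual request from the test case.

```python
# Hypothetical boundary, chosen only for this sketch.
BOUNDARY = b"----sketch-boundary"


def build_body(text_value, file_bytes):
    """Build a tiny multipart/form-data body: one text field, one file."""
    return b"\r\n".join([
        b"--" + BOUNDARY,
        b'Content-Disposition: form-data; name="title"',
        b"",
        text_value.encode("utf-8"),   # text field: utf-8 encoded
        b"--" + BOUNDARY,
        b'Content-Disposition: form-data; name="upload"; filename="blob.bin"',
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,                   # file content: raw binary, untouched
        b"--" + BOUNDARY + b"--",
        b"",
    ])


def decodes_as_utf8(data):
    """Return True if the bytes are valid utf-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumed sample data: a non-ASCII text value and a few arbitrary bytes.
text_value = "caf\u00e9"
file_bytes = b"\xff\xd8\xff\xe0\x00\x10"
body = build_body(text_value, file_bytes)

print(decodes_as_utf8(text_value.encode("utf-8")))  # the text field alone
print(decodes_as_utf8(body))                        # the whole body
```

The text field decodes on its own, but the body as a whole does not, so
decoding the entire stream with a single codec before parsing cannot work
for this kind of input.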
And of course, the email.parser.FeedParser object used to parse it
accepts only unicode, not bytes.
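
A small sketch of that limitation, assuming only the standard
email.parser.FeedParser class. The helper name and the sample message are
made up for illustration, and the exact exception may differ between Python
versions, so treat this as a check to run locally rather than a statement
about every release.

```python
from email.parser import FeedParser

# Hypothetical sample input: one header, a blank line, a short body.
SAMPLE = "Content-Type: text/plain\r\n\r\nhello\r\n"


def feed_accepts(data):
    """Return True if FeedParser.feed() accepts the data, False on TypeError."""
    parser = FeedParser()
    try:
        parser.feed(data)
    except TypeError:
        return False
    return True


def parse_text(data):
    """Parse a str with FeedParser and return the resulting message."""
    parser = FeedParser()
    parser.feed(data)
    return parser.close()


print(feed_accepts(SAMPLE))                  # str input
print(feed_accepts(SAMPLE.encode("ascii")))  # same content as bytes
print(parse_text(SAMPLE).get_content_type())
```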
Help needed.