[Python-Dev] bytes type discussion
Stephen J. Turnbull
stephen at xemacs.org
Wed Feb 15 11:06:21 CET 2006
>>>>> "Fred" == Fred L Drake, <fdrake at acm.org> writes:

    Fred> On Tuesday 14 February 2006 22:34, Greg Ewing wrote:

    >> Seems to me this is a case where you want to be able to change
    >> encodings in the middle of reading the stream. You start off
    >> reading the data as ascii, and once you've figured out the
    >> encoding, you switch to that and carry on reading.

    Fred> Not quite. The proper response in this case is often to
    Fred> re-start decoding with the correct encoding, since some of
    Fred> the data extracted so far may have been decoded incorrectly.
    Fred> A very carefully constructed application may be able to go
    Fred> back and re-decode any data saved from the stream with the
    Fred> previous encoding, but that seems like it would be pretty
    Fred> fragile in practice.

I believe GNU Emacs is currently doing this. AIUI, they save
annotations where the codec is known to be non-invertible (eg, two
charset-changing escape sequences in a row). I do think this is
fragile, and a robust application really should buffer everything
it's not sure of decoding correctly.

    Fred> There may be cases where switching encoding on the fly makes
    Fred> sense, but I'm not aware of any actual examples of where
    Fred> that approach would be required.

This is exactly what ISO 2022 formalizes: switching encodings on the
fly. mboxes of Japanese mail often contain random and unsignaled
encoding changes. A terminal emulator may need to switch when logging
in to a remote system.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
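
To make the two approaches discussed above concrete, here is a minimal sketch in present-day Python, not anything proposed in the thread itself. The first function buffers the raw bytes, sniffs an encoding declaration, and only then decodes from byte 0, so nothing is ever decoded with the wrong codec. The second shows a stateful ISO 2022 decoder following in-band escape sequences. The names `decode_with_restart` and `decode_stateful`, the `sniff_bytes` parameter, and the "coding:" cookie format are all assumptions made for illustration.

```python
import codecs
import re

# Hypothetical declaration format, loosely modelled on a "coding: name" cookie.
_COOKIE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def decode_with_restart(stream, default="ascii", sniff_bytes=256):
    """Keep the head of the stream as raw bytes, work out the encoding,
    then decode everything from the start with that encoding."""
    head = stream.read(sniff_bytes)      # raw bytes; nothing decoded yet
    match = _COOKIE.search(head)
    encoding = match.group(1).decode("ascii") if match else default
    decoder = codecs.getincrementaldecoder(encoding)()  # LookupError if unknown
    parts = [decoder.decode(head)]       # restart: the buffered head goes first
    for chunk in iter(lambda: stream.read(4096), b""):
        parts.append(decoder.decode(chunk))
    parts.append(decoder.decode(b"", final=True))
    return encoding, "".join(parts)


def decode_stateful(chunks, encoding="iso2022_jp"):
    """Decode a stream whose charset changes in-band via escape sequences.
    The incremental decoder carries the shift state across chunk boundaries."""
    decoder = codecs.getincrementaldecoder(encoding)()
    text = "".join(decoder.decode(chunk) for chunk in chunks)
    return text + decoder.decode(b"", final=True)
```

The first function only helps when the encoding is declared somewhere in the buffered head; it does nothing for the unsignaled changes found in Japanese mboxes, where the application would have to keep the raw bytes around for as long as it is unsure.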