Python 3.0 automatic decoding of UTF16
MRAB
google at mrabarnett.plus.com
Fri Dec 5 20:05:17 EST 2008
More information about the Python-list mailing list
John Machin wrote:
> On Dec 6, 10:35 am, Steven D'Aprano <st... at REMOVE-THIS-cybersource.com.au> wrote:
>> On Fri, 05 Dec 2008 12:00:59 -0700, Joe Strout wrote:
>>>> So UTF-16 has an explicit EOF marker within the text?
>>> No, it does not. I don't know what Terry's thinking of there, but text
>>> files do not have any EOF marker. They start at the beginning
>>> (sometimes including a byte-order mark), and go till the end of the
>>> file, period.
>> Windows text files still interpret ctrl-Z as EOF, or at least Windows XP
>> does. Vista, who knows?
>
> This applies only to files being read in an 8-bit text mode. It is
> inherited from MS-DOS, which followed the CP/M convention, which was
> necessary because CP/M's file system recorded only the physical file
> length in 128-byte sectors, not the logical length. It is likely to
> continue in perpetuity, just as standard railway gauge is (allegedly)
> based on the axle-length of Roman chariots.
>
The chariots in question were drawn by 2 horses, so the gauge is based
on the width of a horse. :-)

> None of this is relevant to the OP's problem; his file appears to have
> been truncated rather than having spurious data appended to it.
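
For anyone unfamiliar with the Ctrl-Z convention quoted above, here is a minimal sketch of how it works. It first pads a short text out to a 128-byte sector with Ctrl-Z (0x1A) and recovers the logical length by stopping at the first one. It then shows why that rule only makes sense for 8-bit text: in UTF-16 the byte 0x1A can be half of an ordinary character, for example U+041A. The helper name read_legacy_text and the sample strings are made up for illustration; this is not a claim about what happened to the OP's file.

```python
CTRL_Z = b"\x1a"   # end-of-text marker inherited from CP/M and MS-DOS
SECTOR = 128       # CP/M recorded file length only in whole 128-byte sectors


def read_legacy_text(raw: bytes) -> bytes:
    """Hypothetical helper: mimic an 8-bit text-mode reader that
    treats the first Ctrl-Z byte as the end of the text."""
    head, _sep, _tail = raw.partition(CTRL_Z)
    return head


# 8-bit case: the writer pads the last sector with Ctrl-Z, so the
# reader can find the logical end inside the physical length.
logical = b"hello\r\n"
padded = logical + CTRL_Z * (-len(logical) % SECTOR)
recovered = read_legacy_text(padded)

# UTF-16 case: U+041A is encoded little-endian as the bytes 1A 04,
# so a byte-oriented Ctrl-Z check cuts the data off mid-text.
utf16 = "ab\u041acd".encode("utf-16-le")
cut = read_legacy_text(utf16)
cut_text = cut.decode("utf-16-le")
```

With these inputs, recovered equals the original 7 bytes, while cut_text holds only the two characters before U+041A. That is truncation, not spurious data appended.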