Newbie question about text encoding
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Thu Feb 26 18:09:41 EST 2015
Chris Angelico wrote:

> Unicode
> isn't about taking everyone's separate character sets and numbering
> them all so we can reference characters from anywhere; if you wanted
> that, you'd be much better off with something that lets you specify a
> code page in 16 bits and a character in 8, which is roughly the same
> size as Unicode anyway.

Well, except for the approximately 25% of people in the world whose native language has more than 256 characters.

It sounds like you are referring to some sort of "shift code" system. Some legacy East Asian encodings use a similar scheme, and depending on how they are implemented they have great disadvantages. For example, Shift-JIS suffers from a number of weaknesses including that a single byte corrupted in transmission can cause large swaths of the following text to be corrupted. With Unicode, a single corrupted byte can only corrupt a single code point.

-- 
Steven
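
The corruption point can be tried out with a rough sketch like the one below. It is only an illustration under several assumptions: "Unicode" is taken to mean UTF-8, Python's built-in "shift_jis" codec stands in for Shift-JIS, the helper name corrupt_first_byte is made up for this example, and the sample string is deliberately contrived (the second byte of each character is also a valid Shift-JIS lead byte, which is the unlucky case). Real-world damage depends on the text and on the decoder.

```python
def corrupt_first_byte(text, encoding):
    """Encode text, overwrite its first byte with ASCII 'A', decode leniently."""
    data = bytearray(text.encode(encoding))
    data[0] = 0x41  # simulate a single byte damaged in transmission
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising
    return data.decode(encoding, errors="replace")


# Hypothetical sample: three katakana characters, two bytes each in Shift-JIS.
sample = "ラララ"

sjis_result = corrupt_first_byte(sample, "shift_jis")
utf8_result = corrupt_first_byte(sample, "utf-8")

print("Shift-JIS:", repr(sjis_result))
print("UTF-8:    ", repr(utf8_result))

# How many of the original characters can still be read after the damage?
print("Shift-JIS characters intact:", sjis_result.count("ラ"))
print("UTF-8 characters intact:    ", utf8_result.count("ラ"))
```

In the Shift-JIS case the decoder pairs every remaining byte with the wrong neighbour, so the damage runs to the end of the string. In UTF-8, lead bytes and continuation bytes come from disjoint ranges, so the decoder can find the start of the next character and only the damaged one is lost.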