[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
Chris Angelico
rosuav at gmail.com
Sun Jan 12 23:28:31 CET 2014
More information about the Python-Dev mailing list
On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop <juraj.sukop at gmail.com> wrote:
> On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano <steve at pearwood.info>
> wrote:
>> First, "utf16_string" confuses me. What is it? If it is a Unicode
>> string, i.e.:
>
> It is a Unicode string which happens to contain code points outside U+00FF
> (as with the TTF example above), so that it triggers the (at least) 2-bytes
> memory representation in CPython 3.3+. I agree, I chose the variable name
> poorly, my bad.

When I'm talking about Unicode strings based on their maximum
codepoint, I usually call them something like "ASCII string", "Latin-1
string", "BMP string", and "SMP string". Still not wholly accurate,
but less confusing than naming an encoding... oh wait, two of those
_are_ encodings :|

But you could use "narrow string" for the first two. Or
"string(0..127)" for ASCII, "string(0..255)" for Latin-1, and then for
consistency "string(0..65535)" and "string(0..1114111)" for the
others, except that I doubt that'd be helpful :)

At any rate, "BMP" as a term for "includes characters outside of
Latin-1 but all on the Basic Multilingual Plane" would probably be
close enough to get away with.

ChrisA
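
For illustration only, the naming scheme above could be sketched roughly as follows. The helper name classify() is made up for this example, and the labels are the informal ones from the message, not official terminology; in particular "SMP string" here means anything above U+FFFF, which is looser than the actual Supplementary Multilingual Plane.

```python
def classify(s: str) -> str:
    """Label a str by its highest code point (hypothetical helper)."""
    # An empty string has no code points; treat it as the narrowest kind.
    top = max(map(ord, s), default=0)
    if top <= 0x7F:       # string(0..127)
        return "ASCII string"
    if top <= 0xFF:       # string(0..255)
        return "Latin-1 string"
    if top <= 0xFFFF:     # string(0..65535)
        return "BMP string"
    return "SMP string"   # string(0..1114111), informal label


for sample in ("abc", "caf\u00e9", "\u0416", "\U0001F600"):
    print(ascii(sample), "->", classify(sample))
```

The first two labels are the ones that could both be called "narrow string", since neither needs code points above U+00FF.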