[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
Daniel Holth
dholth at gmail.com
Wed Jan 8 14:56:37 CET 2014
On Tue, Jan 7, 2014 at 10:36 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Daniel Holth writes:
>
>  > Isn't it true that if you have bytes > 127 or surrogate escapes then
>  > encoding to latin1 is no longer as fast as memcpy?
>
> Be careful.  As phrased, the question makes no sense.  You don't "have
> bytes" when you are encoding, you have characters.
>
> If you mean "what happens when my str contains characters in the range
> 128-255?", the answer is encoding a str in 8-bit representation to
> latin1 is effectively memcpy.  If you read in latin1, it's memcpy all
> the way (unless you combine it with a non-latin1 string, in which case
> you're in the cases below).
>
> If you mean "what happens when my str contains characters in the range
> > 255", you have to truncate 16-bit units to 8 bit units; no memcpy.
>
> Surrogates require >= 16 bits; no memcpy.

That is neat.
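
For anyone who wants to poke at the cases described above, here is a minimal sketch for a Python 3.3+ prompt. It is not from the quoted message. The helper storage_width is a hypothetical name, and it assumes CPython's PEP 393 string layout as reported by sys.getsizeof, so treat the widths as implementation details rather than language guarantees.

```python
import sys


def storage_width(s):
    """Estimate bytes per character in CPython's internal str storage.

    Assumption: CPython 3.3+ (PEP 393), where sys.getsizeof grows by
    len(s) * width when a string is repeated one more time.
    """
    return (sys.getsizeof(s * 3) - sys.getsizeof(s * 2)) // len(s)


ascii_text = "abc"
latin1_text = "caf\xe9"    # U+00E9 falls in the range 128-255
wide_text = "caf\u0151"    # U+0151 is > 255
escaped = b"\xff".decode("utf-8", "surrogateescape")  # lone surrogate U+DCFF

# Characters up to 255 are stored one byte each, so encoding to latin-1
# can copy the buffer straight through.
widths = [storage_width(s) for s in (ascii_text, latin1_text, wide_text, escaped)]
latin1_bytes = latin1_text.encode("latin-1")

# A character > 255 has no latin-1 encoding at all under the default
# "strict" error handler.
try:
    wide_text.encode("latin-1")
    wide_ok = True
except UnicodeEncodeError:
    wide_ok = False

# A surrogate escape is stored in 16-bit units and only turns back into a
# byte when the surrogateescape error handler is requested.
try:
    escaped.encode("latin-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
escaped_bytes = escaped.encode("latin-1", "surrogateescape")

print(widths, latin1_bytes, wide_ok, strict_ok, escaped_bytes)
```

The point of the sketch is only that the fast path depends on how the str is stored, not on which bytes come out the other end.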