[Python-ideas] PEP 540: Add a new UTF-8 mode
INADA Naoki
songofacandy at gmail.com
Wed Jan 11 03:17:46 EST 2017
Here is one example of a locale pitfall.

---
# from http://unix.stackexchange.com/questions/169739/why-is-coreutils-sort-slower-than-python

$ cat letters.py
import string
import random

def main():
    for _ in range(1_000_000):
        c = random.choice(string.ascii_letters)
        print(c)

main()

$ python3 letters.py > letters.txt

$ LC_ALL=C time sort letters.txt > /dev/null
0.35 real 0.32 user 0.02 sys

$ LC_ALL=C.UTF-8 time sort letters.txt > /dev/null
0.36 real 0.33 user 0.02 sys

$ LC_ALL=ja_JP.UTF-8 time sort letters.txt > /dev/null
11.03 real 10.95 user 0.04 sys

$ LC_ALL=en_US.UTF-8 time sort letters.txt > /dev/null
11.05 real 10.97 user 0.04 sys
---

This is why some engineers, including me, use the C locale on Linux,
at least when there is no C.UTF-8 locale.

Of course, we can set LC_CTYPE=en_US.UTF-8 instead of LANG or LC_ALL.
(I wonder if we can use LC_CTYPE=UTF-8...)

But I dislike the current situation, where "people should learn how to
configure the locale properly, and the pitfalls of non-C locales, just
to use UTF-8 in Python".
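
To illustrate why setting only LC_CTYPE sidesteps the slow sort, here is a minimal sketch of the POSIX lookup order for locale categories (LC_ALL first, then the category's own variable, then LANG). The helper name effective_locale is hypothetical, and falling back to "C" when nothing is set is an assumption; POSIX leaves that default implementation-defined.

```python
def effective_locale(env, category):
    """Return the locale name a POSIX program would pick for `category`.

    Hypothetical helper: it only models the variable precedence
    LC_ALL > LC_<category> > LANG and does not call setlocale().
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # unset and empty both count as "not specified"
            return value
    return "C"  # assumed default; implementation-defined in POSIX


# Only LC_CTYPE is set: UTF-8 character handling, collation stays in C.
ctype_only = {"LC_CTYPE": "en_US.UTF-8"}

# LC_ALL overrides every category, including an explicit LC_COLLATE=C.
lc_all = {"LC_ALL": "en_US.UTF-8", "LC_COLLATE": "C"}

for env in (ctype_only, lc_all):
    print(effective_locale(env, "LC_CTYPE"),
          effective_locale(env, "LC_COLLATE"))
```

With LC_CTYPE alone, LC_COLLATE still resolves to C, which is the fast case in the timings above; with LC_ALL, collation follows en_US.UTF-8, which is the slow case.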