Issue 34145: uuid3 and uuid5 hard to use portably between Python 2 and 3
Created on 2018-07-18 08:46 by rubasov, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 10100 | closed | bradengroom, 2018-10-25 18:29 | |
| Messages (2) | |||
|---|---|---|---|
| msg321870 - (view) | Author: Bence Romsics (rubasov) | Date: 2018-07-18 08:46 | |
The issue I'd like to report may not be an outright bug in either CPython 2 or CPython 3. It is more of a wishlist item to help Python programmers write UUID-handling code that is valid in Python 2 and 3 at the same time.
Please consider these one-liners:
$ python2 -c 'import uuid ; uuid.uuid5(namespace=uuid.UUID("850aeee8-e173-4da1-9d6b-dd06e4b06747"), name="foo")'
$ python3 -c 'import uuid ; uuid.uuid5(namespace=uuid.UUID("850aeee8-e173-4da1-9d6b-dd06e4b06747"), name="foo")'
As long as the 'name' input to uuid.uuid3() or uuid.uuid5() is the literal string type of the relevant Python version, there's no problem at all. However, if you'd like to handle both unicode and non-unicode 'name' input in code that is valid on both Python 2 and 3, I find that impossible to express without checking the Python version.
CPython 2's uuid module is incompatible with unicode input:
$ python2 -c 'import uuid ; uuid.uuid5(namespace=uuid.UUID("850aeee8-e173-4da1-9d6b-dd06e4b06747"), name=u"foo")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python2.7/uuid.py", line 589, in uuid5
hash = sha1(namespace.bytes + name).digest()
UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 0: ordinal not in range(128)
CPython 3's uuid module is incompatible with non-unicode input:
$ python3 -c 'import uuid ; uuid.uuid5(namespace=uuid.UUID("850aeee8-e173-4da1-9d6b-dd06e4b06747"), name=b"foo")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.5/uuid.py", line 608, in uuid5
hash = sha1(namespace.bytes + bytes(name, "utf-8")).digest()
TypeError: encoding without a string argument
The reason is obvious from the uuid modules' source code:
CPython 2.7:
https://github.com/python/cpython/blob/ea9a0994cd0f4bd37799b045c34097eb21662b3d/Lib/uuid.py#L603
CPython 3.6:
https://github.com/python/cpython/blob/e9e2fd75ccbc6e9a5221cf3525e39e9d042d843f/Lib/uuid.py#L628
Therefore portable code has to resort to version checking like this:
import six
import uuid

# Python 2's uuid5() concatenates 'name' with raw bytes, so encode unicode first.
if six.PY2:
    name = name.encode('utf-8')
uuid.uuid5(namespace=namespace, name=name)
IMHO this inconvenience could be avoided if CPython 2's uuid.uuid3() and uuid.uuid5() were changed to also accept unicode 'name' arguments and encode() them implicitly.
What do you think?
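For illustration only, the implicit encoding suggested above could be sketched as a small wrapper that needs no version check. The name `portable_uuid5` is hypothetical and not part of the uuid module. The sketch assumes that hashing `namespace.bytes + name` with SHA-1 and building the UUID from the first 16 bytes, as the hashing lines in the tracebacks above suggest, is what uuid5() does on both versions.

```python
import hashlib
import uuid


def portable_uuid5(namespace, name):
    """Hypothetical helper: accept text or bytes for 'name' on Python 2 and 3.

    Not part of the standard library; a sketch of the implicit encoding idea.
    """
    # On Python 2, 'bytes' is an alias of 'str', so unicode objects get encoded.
    # On Python 3, 'str' objects get encoded and bytes pass through unchanged.
    if not isinstance(name, bytes):
        name = name.encode("utf-8")
    digest = hashlib.sha1(namespace.bytes + name).digest()
    # Version 5 UUIDs use the first 16 bytes of the SHA-1 digest.
    return uuid.UUID(bytes=digest[:16], version=5)


namespace = uuid.UUID("850aeee8-e173-4da1-9d6b-dd06e4b06747")
text_result = portable_uuid5(namespace, u"foo")
bytes_result = portable_uuid5(namespace, b"foo")
```

With this approach both calls should yield the same UUID, and for text input on Python 3 it should agree with uuid.uuid5() itself.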
| msg360265 - (view) | Author: Zachary Ware (zach.ware) | Date: 2020-01-19 18:21 | |
Python 2.7 has reached EOL, and so this change can no longer be made. Thanks for the report and idea anyway, Bence!
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:03 | admin | set | github: 78326 |
| 2020-01-19 18:21:29 | zach.ware | set | status: open -> closed; versions: - Python 3.6; messages: + msg360265 |
| 2018-10-25 18:37:06 | xtreak | set | nosy: + xtreak |
| 2018-10-25 18:29:57 | bradengroom | set | keywords: + patch; stage: patch review; pull_requests: + pull_request9433 |
| 2018-07-18 08:46:07 | rubasov | create | |
