Issue33242
Created on 2018-04-08 21:19 by smurfix, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Messages (4) | |||
|---|---|---|---|
| msg315096 - (view) | Author: Matthias Urlichs (smurfix) * | Date: 2018-04-08 21:19 | |
ctypes should support binary symbols.
Rationale: There's no requirement that the symbol name in question is encoded as ASCII or UTF-8.
>>> import ctypes
>>> t = type('iface', (ctypes.Structure,), {'_fields_': [(b'c_string_symbol', ctypes.CFUNCTYPE(ctypes.c_uint32))]})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '_fields_' must be a sequence of (name, C type) pairs
|
|||
| msg315097 - (view) | Author: Eryk Sun (eryksun) * | Date: 2018-04-08 21:51 | |
Field names define CField descriptor attributes on the class. Attribute names should be strings, not bytes. There's no syntactically clean way to use a bytes name. Consider the example of a generic property on a class:
>>> T = type('T', (), {b'p': property(lambda s: 0)})
>>> t = T()
>>> t.p
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'T' object has no attribute 'p'
>>> getattr(t, b'p')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
We'd have to dig into the class dict and manually bind the property:
>>> vars(T)[b'p'].__get__(t)
0
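The same point can be sketched with ctypes itself: a str field name produces a CField descriptor on the class, while a bytes name is rejected. This is a minimal illustration, not part of the original message; the class name `Iface` and the `c_uint32` field type are assumptions chosen to keep the example small.

```python
import ctypes

# Hypothetical structure for illustration; a str field name is accepted.
class Iface(ctypes.Structure):
    _fields_ = [("c_string_symbol", ctypes.c_uint32)]

# Accessing the name on the class returns the field descriptor,
# which exposes the field's offset and size within the structure.
descriptor = Iface.c_string_symbol

obj = Iface()
obj.c_string_symbol = 7  # ordinary attribute syntax works for a str name

# The same definition with a bytes name is rejected with TypeError.
try:
    type("iface", (ctypes.Structure,),
         {"_fields_": [(b"c_string_symbol", ctypes.c_uint32)]})
except TypeError as exc:
    bytes_error = exc
else:
    bytes_error = None
```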
|
|||
| msg315098 - (view) | Author: Matthias Urlichs (smurfix) * | Date: 2018-04-08 22:27 | |
Well, the original problem remains: symbol names aren't constrained to UTF-8 … so if I happen to stumble onto one of those (maybe generated by a code obfuscator), the answer is "don't use Python3 then"? |
|||
| msg315099 - (view) | Author: Eryk Sun (eryksun) * | Date: 2018-04-08 23:15 | |
If you're automatically wrapping a C source file and don't know the source encoding, you could naively decode it as Latin-1. You're still faced with the problem of characters that Python doesn't allow in identifiers. For example, gcc allows "$" in C identifiers (e.g. a field named "egg$"), but Python doesn't allow this character. At least you can use getattr() to access such names. For example:
>>> s = bytes(range(256)).decode('latin-1')
>>> T = type('T', (), {s: 0})
>>> t = T()
>>> getattr(t, s)
0
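Applied to a ctypes structure, the Latin-1 approach might look like the sketch below. The byte string `raw_name`, the class name `Iface`, and the `c_uint32` field type are hypothetical and only serve the illustration. Latin-1 maps every byte to exactly one code point, so the original bytes can be recovered by encoding the name again.

```python
import ctypes

# Hypothetical symbol name containing "$" and a non-ASCII byte.
raw_name = b"egg$\xff"

# Latin-1 decodes any byte sequence without error and round-trips losslessly.
field_name = raw_name.decode("latin-1")

class Iface(ctypes.Structure):
    _fields_ = [(field_name, ctypes.c_uint32)]

obj = Iface()

# The name is not a valid Python identifier, so use setattr()/getattr()
# instead of dotted attribute access.
setattr(obj, field_name, 42)
value = getattr(obj, field_name)

# Recover the original bytes when the symbol is needed again.
recovered = field_name.encode("latin-1")
```

The trade-off is that the resulting attribute name is only reachable through `getattr()` and `setattr()`, as noted above for names such as "egg$".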
|
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:59 | admin | set | github: 77423 |
| 2018-04-08 23:15:38 | eryksun | set | messages: + msg315099 |
| 2018-04-08 22:27:31 | smurfix | set | messages: + msg315098 |
| 2018-04-08 21:54:15 | eryksun | set | resolution: not a bug -> rejected |
| 2018-04-08 21:51:45 | eryksun | set | status: open -> closed; nosy: + eryksun; resolution: not a bug |
| 2018-04-08 21:32:52 | ned.deily | set | nosy: + amaury.forgeotdarc, belopolsky, meador.inge |
| 2018-04-08 21:19:25 | smurfix | create | |
