Issue24009
Created on 2015-04-19 19:14 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| issue24009_textio_decoder_getstate.patch | serhiy.storchaka, 2015-04-23 18:04 | Get rid of "y#" in textio | review | |
| Messages (9) | |||
|---|---|---|---|
| msg241546 | Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-04-19 19:14 | |
There are a lot of format units supported in PyArg_Parse* functions, but some of them are rarely or never used in current CPython code. Some format units are a legacy of Python 2 and are not needed in modern Python 3 code, or can be replaced with a custom converter.

Here are the results of grepping (not including Modules/_testcapimodule.c).

"es", "es#", "et#", "z*", "Z#" are not used.

"y#":
Modules/_io/textio.c:2334: if (!PyArg_ParseTuple(_state, "y#i", &dec_buffer, &dec_buffer_len, &dec_flags)) { \

"z#":
Modules/_ctypes/_ctypes.c:3327: if (!PyArg_ParseTuple(args, "is|Oz#", &index, &name, &paramflags, &iid, &iid_len))

"u#":
Modules/arraymodule.c:248: if (!PyArg_Parse(v, "u#;array item must be unicode character", &p, &len))
PC/winreg.c:1547: if (!PyArg_ParseTuple(args, "OZiu#:SetValue",

"y":
Modules/_io/textio.c:2334: if (!PyArg_ParseTuple(_state, "y#i", &dec_buffer, &dec_buffer_len, &dec_flags)) { \
Modules/_cursesmodule.c:2790: if (!PyArg_ParseTuple(args,"y;str", &str))
Modules/_cursesmodule.c:3026: if (!PyArg_ParseTuple(args, "y|iiiiiiiii:tparm",
Modules/posixmodule.c:3767: if (!PyArg_ParseTuple (args, "y:_getfullpathname",
Modules/posixmodule.c:3872: if (!PyArg_ParseTuple(args, "y:_isdir", &path))
Modules/faulthandler.c:941: if (!PyArg_ParseTuple(args, "y:fatal_error", &message))

"et":
Modules/socketmodule.c:4499: if (!PyArg_ParseTuple(args, "et:gethostbyname", "idna", &name))
Modules/socketmodule.c:4667: if (!PyArg_ParseTuple(args, "et:gethostbyname_ex", "idna", &name))
Modules/socketmodule.c:4744: if (!PyArg_ParseTuple(args, "et:gethostbyaddr", "idna", &ip_num))
Modules/_tkinter.c:2099: if (!PyArg_ParseTuple(args, "et:splitlist", "utf-8", &list))
Modules/_tkinter.c:2162: if (!PyArg_ParseTuple(args, "et:split", "utf-8", &list))
Modules/_ssl.c:3038: if (!PyArg_ParseTupleAndKeywords(args, kwds, "O!iet:_wrap_socket", kwlist,
Modules/_ssl.c:3070: if (!PyArg_Parse(hostname_obj, "et", "idna", &hostname))

"s*":
Modules/_codecsmodule.c:188: if (!PyArg_ParseTuple(args, "s*|z:escape_decode",
Modules/_codecsmodule.c:552: if (!PyArg_ParseTuple(args, "s*|z:unicode_escape_decode",
Modules/_codecsmodule.c:569: if (!PyArg_ParseTuple(args, "s*|z:raw_unicode_escape_decode",
Modules/_codecsmodule.c:696: if (!PyArg_ParseTuple(args, "s*|z:readbuffer_encode",
Modules/_ssl.c:3734: if (!PyArg_ParseTuple(args, "s*d:RAND_add", &view, &entropy))
Modules/fcntlmodule.c:225: if (PyArg_Parse(ob_arg, "s*:ioctl", &pstr)) {
Modules/clinic/arraymodule.c.h:278: if (!PyArg_Parse(arg, "s*:fromstring", &buffer))

"s#":
Modules/_gdbmmodule.c:128: if (!PyArg_Parse(key, "s#", &krec.dptr, &krec.dsize) )
Modules/_gdbmmodule.c:176: if (!PyArg_Parse(v, "s#", &krec.dptr, &krec.dsize) ) {
Modules/_gdbmmodule.c:194: if (!PyArg_Parse(w, "s#", &drec.dptr, &drec.dsize)) {
Modules/fcntlmodule.c:71: if (PyArg_Parse(arg, "s#", &str, &len)) {
Modules/_ctypes/_ctypes.c:2569: if (!PyArg_ParseTuple(args, "Os#", &dict, &data, &len))
Modules/clinic/unicodedata.c.h:361: if (!PyArg_Parse(arg, "s#:lookup", &name, &name_length))
Modules/clinic/_dbmmodule.c.h:62: if (!PyArg_ParseTuple(args, "s#|O:get",
Modules/clinic/_dbmmodule.c.h:95: if (!PyArg_ParseTuple(args, "s#|O:setdefault",
Modules/clinic/_gdbmmodule.c.h:150: if (!PyArg_Parse(arg, "s#:nextkey", &key, &key_length))
Modules/_dbmmodule.c:108: if (!PyArg_Parse(key, "s#", &krec.dptr, &tmp_size) )
Modules/_dbmmodule.c:132: if ( !PyArg_Parse(v, "s#", &krec.dptr, &tmp_size) ) {
Modules/_dbmmodule.c:150: if ( !PyArg_Parse(w, "s#", &drec.dptr, &tmp_size) ) {
Modules/_dbmmodule.c:336: if ( !PyArg_Parse(default_value, "s#", &val.dptr, &tmp_size) ) {

In the future, perhaps we could deprecate some format units and remove them in 4.0.

This issue is a meta issue. Every case should be considered individually. |
|||
| msg241870 | Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-04-23 17:15 | |
In textio.c, the decoder should always return bytes, not an arbitrary read-only buffer (other parts of the code require this). So "y#" can be replaced with "O" plus PyBytes_GET_SIZE(). |
|||
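The replacement proposed in msg241870 can be mirrored at the Python level. The sketch below is illustrative only: `unpack_decoder_state` is a hypothetical helper, not part of CPython, and it merely imitates what taking the first item of the state tuple as a plain object ("O"), checking that it is bytes, and reading its length with PyBytes_GET_SIZE() would amount to in C.

```python
import codecs


def unpack_decoder_state(state):
    """Hypothetical helper imitating C-level handling of decoder.getstate()."""
    # Take the first item as a plain object, as the "O" format unit would.
    pending, flags = state
    # Insist on a real bytes object rather than any read-only buffer.
    if not isinstance(pending, bytes):
        raise TypeError(
            "decoder state must start with bytes, not %s" % type(pending).__name__
        )
    # len() stands in for PyBytes_GET_SIZE() here.
    return pending, len(pending), flags


decoder = codecs.getincrementaldecoder("utf-8")()
# Feed an incomplete UTF-8 sequence so the decoder has to buffer it.
decoder.decode(b"\xe2\x82")
print(unpack_decoder_state(decoder.getstate()))
```

A state whose first item is, say, a bytearray is rejected with TypeError in this sketch, which is the stricter behaviour the message argues for.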
| msg242007 | Author: Tal Einat (taleinat) * |
Date: 2015-04-25 08:22 | |
+1. I was recently trying to use the C API for a 3rd party library, and all of these subtly different string parameter formats made things surprisingly confusing. These are part of the Python C API, so removing them could break 3rd party code. Simply searching through the stdlib is not enough to show that these are not in use. So removal would require a deprecation period. |
|||
| msg242646 | Author: Roundup Robot (python-dev) |
Date: 2015-05-06 06:54 | |
New changeset d65233f630e1 by Serhiy Storchaka in branch 'default': Issue #24009: Got rid of using rare "y#" format unit in TextIOWrapper.tell(). https://hg.python.org/cpython/rev/d65233f630e1 |
|||
| msg242652 | Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-05-06 10:14 | |
Note that these format characters can also be used outside of CPython. |
|||
| msg242951 | Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-05-12 09:47 | |
Yes, of course, I think we shouldn't drop support for these format units. But using them is likely a sign of outdated or transitional code. It should be discouraged in new code, and every case should be analyzed and cleaned up. |
|||
| msg244084 | Author: Martin Panter (martin.panter) * |
Date: 2015-05-26 06:04 | |
“u#” should not be deprecated without first deprecating “u”, which is less useful because it does not return a buffer length. Also, I have always been mystified about how “s#”, “z#”, “y” and “y#” can properly return a pointer into a buffer for arbitrary immutable bytes-like objects, without requiring PyBuffer_Release() to be called. Perhaps this is bad design that should be discouraged. Or maybe it is a documentation oversight somewhere. |
|||
| msg244087 | Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-05-26 07:39 | |
“s#”, “z#”, “y” and “y#” work only with read-only buffers, for which PyBuffer_Release() is a no-op. Initially they were designed to work with the old buffer protocol, which doesn't support releasing a buffer. |
|||
| msg244088 | Author: Martin Panter (martin.panter) * |
Date: 2015-05-26 07:46 | |
Yes, I just figured that out myself. Specifically, PyBufferProcs.bf_releasebuffer has to be NULL, and the buffer stays alive as long as the object stays alive. Also, it looks like I was wrong about “u” being useless. I was tricked by a contradiction in the documentation, but I will try to fix this in a patch to Issue 24278. |
|||
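The distinction drawn in msg244087 and msg244088 can be observed from Python without writing any C. The sketch below is only an illustration under stated assumptions: `can_resize_while_exported` is a hypothetical helper, and `memoryview.release()` is used as a rough Python-level analogue of PyBuffer_Release(). A bytes object exposes a read-only buffer that needs no release step, whereas a bytearray must track outstanding exports and refuses to be resized until they are released.

```python
def can_resize_while_exported(buf: bytearray) -> bool:
    """Hypothetical helper: try to resize buf while a memoryview of it exists."""
    view = memoryview(buf)  # acquires a buffer export
    try:
        buf.extend(b"!")  # resizing would invalidate the exported pointer
        return True
    except BufferError:
        return False
    finally:
        view.release()  # rough analogue of PyBuffer_Release() in C


# bytes is immutable, so its buffer is read-only and stays valid
# for as long as the object itself is alive.
data = b"spam"
print(memoryview(data).readonly)

# bytearray is writable and resizable, so releasing the export matters.
buf = bytearray(b"spam")
print(can_resize_while_exported(buf))
buf.extend(b"!")  # succeeds once the export has been released
print(buf)
```

This is why a format unit that hands back a bare pointer with no release step is only safe for objects of the first kind.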
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:15 | admin | set | github: 68197 |
| 2018-06-14 10:29:46 | taleinat | set | nosy: - taleinat |
| 2015-05-26 07:46:11 | martin.panter | set | messages: + msg244088 |
| 2015-05-26 07:39:46 | serhiy.storchaka | set | messages: + msg244087 |
| 2015-05-26 06:04:29 | martin.panter | set | nosy: + martin.panter; messages: + msg244084 |
| 2015-05-12 09:47:24 | serhiy.storchaka | set | messages: + msg242951 |
| 2015-05-06 10:14:20 | ronaldoussoren | set | nosy: + ronaldoussoren; messages: + msg242652 |
| 2015-05-06 06:54:26 | python-dev | set | nosy: + python-dev; messages: + msg242646 |
| 2015-04-25 08:22:37 | taleinat | set | nosy: + taleinat; messages: + msg242007 |
| 2015-04-23 19:01:19 | serhiy.storchaka | set | dependencies: + Convert os._getfullpathname() and os._isdir() to Argument Clinic |
| 2015-04-23 18:04:57 | serhiy.storchaka | set | files: + issue24009_textio_decoder_getstate.patch |
| 2015-04-23 18:04:18 | serhiy.storchaka | set | files: - issue24009_textio_decoder_getstate.patch |
| 2015-04-23 17:15:06 | serhiy.storchaka | set | files: + issue24009_textio_decoder_getstate.patch; keywords: + patch; messages: + msg241870 |
| 2015-04-19 19:14:44 | serhiy.storchaka | create | |

