Issue 35714: Document that the null character '\0' terminates a struct format spec
Created on 2019-01-10 22:42 by bup, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (12)
msg333424 - (view)
Author: Dan Snider (bup) *
Date: 2019-01-10 22:42
i.e.:
>>> from struct import calcsize
>>> calcsize('\144\u0064\000xf\U00000031000\60d\121\U00000051')
16
I'm sure some people think it's obvious, or even expect the null character to signal EOF, but it probably isn't obvious at all to those without experience in lower-level languages. It actually seems like Python goes out of its way to make sure everything treats the null character as no more special than the letter "H", which is good.
At first glance I'd think something like this was just another trivial quirk of the language and not bring it up, but because the documentation doesn't mention it, I actually got stuck on something related for half an hour when unit testing some dynamically generated format specs.
Without going into unnecessary detail, a typo in another tangentially related part of the test was enabling the generation of a rogue null byte. I'm bad at those "find the face in the crowd" puzzles and this was hardly different: the null was literally camouflaged within a 300-character format spec containing a random mixture of escaped and non-escaped source characters in the forms \Uffffffff, \uffff, \777, \xff, \x00, plus Latin/ASCII.
If I'm not the only one who sees this as a slightly bigger deal than poor documentation, the fix is trivial with an extra call to PyBytes_GET_SIZE when null is found. But just because I can't think of a use case for allowing the null character to precede other characters in the format string doesn't mean there isn't one, which is why only documentation is currently selected.
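For anyone else generating format specs dynamically, a pre-check along these lines might surface a stray null much faster than reading the spec by eye. This is only a sketch: `find_null` is a hypothetical helper and not part of the struct module, and the spec is the one from the example above with its escapes resolved.

```python
import struct

# The format spec from the example above; the escapes resolve to plain characters.
fmt = '\144\u0064\000xf\U00000031000\60d\121\U00000051'
assert fmt == 'dd\x00xf10000dQQ'


def find_null(format_spec):
    """Hypothetical helper: index of the first null character, or -1 if none."""
    return format_spec.find('\0')


null_index = find_null(fmt)

# Everything before the null is what an unpatched struct module actually parses.
visible_prefix = fmt[:null_index]

# Two native doubles; this matches the 16 reported by calcsize above.
prefix_size = struct.calcsize(visible_prefix)
```

Printing `repr(fmt)` rather than `fmt` also helps here, because the null then shows up as `\x00` instead of being invisible.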
msg333430 - (view)
Author: Steven D'Aprano (steven.daprano) *
Date: 2019-01-11 01:03
I'm not sure whether having NULLs terminate a struct format string is a feature or a bug. Given that nearly every other string in Python treats NULLs as ordinary characters, I'm inclined to say this is a bug. Or at least an unnecessary restriction that ought to be lifted.
msg333441 - (view)
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2019-01-11 07:28
I think the null character is an illegal character in the format string, and struct functions should raise a struct.error for it.
msg355407 - (view)
Author: Mark Dickinson (mark.dickinson) *
Date: 2019-10-26 08:39
I agree with Serhiy. Any other unrecognised character would raise an error. The null character should do the same.
msg355410 - (view)
Author: Zackery Spytz (ZackerySpytz) *
Date: 2019-10-26 09:50
I've created a patch to reject null characters in the format string.
msg369859 - (view)
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2020-05-25 07:55
New changeset 3f59b55316f4c6ab451997902579aa69020b537c by Zackery Spytz in branch 'master': bpo-35714: Reject null characters in struct format strings (GH-16928) https://github.com/python/cpython/commit/3f59b55316f4c6ab451997902579aa69020b537c
msg369950 - (view)
Author: miss-islington (miss-islington)
Date: 2020-05-26 07:05
New changeset 5221a10dde4a3853fe7ace316d95767648055109 by Miss Islington (bot) in branch '3.9': bpo-35714: Reject null characters in struct format strings (GH-16928) https://github.com/python/cpython/commit/5221a10dde4a3853fe7ace316d95767648055109
msg369951 - (view)
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2020-05-26 07:10
Zackery, would you mind creating a backport to 3.8?
msg369959 - (view)
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2020-05-26 08:57
New changeset 5ff5edfef63b3dbc1abb004b3fa4b3db87e79ff9 by Zackery Spytz in branch '3.8': [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) (GH-20419) https://github.com/python/cpython/commit/5ff5edfef63b3dbc1abb004b3fa4b3db87e79ff9
msg369961 - (view)
Author: miss-islington (miss-islington)
Date: 2020-05-26 09:16
New changeset 4ea802868460fad54e40cb99eb0ca283b3b293f0 by Miss Islington (bot) in branch '3.7': [3.8] bpo-35714: Reject null characters in struct format strings (GH-16928) (GH-20419) https://github.com/python/cpython/commit/4ea802868460fad54e40cb99eb0ca283b3b293f0
msg396078 - (view)
Author: Irit Katriel (iritkatriel) *
Date: 2021-06-18 19:03
This seems resolved, can it be closed?
msg396121 - (view)
Author: Mark Dickinson (mark.dickinson) *
Date: 2021-06-19 09:17
Yes, this looks closeable. Thank you!
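For readers who want to check which behaviour their interpreter has, a probe like the following should do. It is a rough sketch using only the struct module; `null_handling` is a made-up name. On builds that include the changesets listed in this thread I would expect the 'rejected' result, while older builds should report 'terminated'.

```python
import struct


def null_handling(fmt='dd\0xf'):
    """Hypothetical probe: how does this interpreter treat a null in a format spec?"""
    try:
        size = struct.calcsize(fmt)
    except struct.error:
        # Patched behaviour: the null character is refused outright.
        return 'rejected'
    # Older behaviour: everything after the null is silently ignored,
    # so the size equals that of the part before the null.
    if size == struct.calcsize(fmt.split('\0', 1)[0]):
        return 'terminated'
    return 'other'


result = null_handling()
```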
History
Date | User | Action | Args
2022-04-11 14:59:10 | admin | set | github: 79895
2021-06-19 09:17:15 | mark.dickinson | set | status: open -> closed; resolution: fixed; messages: + msg396121; stage: patch review -> resolved
2021-06-18 19:03:19 | iritkatriel | set | nosy: + iritkatriel; messages: + msg396078
2020-05-26 09:16:42 | miss-islington | set | messages: + msg369961
2020-05-26 08:57:22 | miss-islington | set | pull_requests: + pull_request19679
2020-05-26 08:57:18 | serhiy.storchaka | set | messages: + msg369959
2020-05-26 08:32:45 | ZackerySpytz | set | pull_requests: + pull_request19678
2020-05-26 07:10:11 | serhiy.storchaka | set | messages: + msg369951
2020-05-26 07:05:02 | miss-islington | set | messages: + msg369950
2020-05-25 07:55:52 | miss-islington | set | pull_requests: + pull_request19639
2020-05-25 07:55:43 | miss-islington | set | pull_requests: + pull_request19638
2020-05-25 07:55:35 | miss-islington | set | pull_requests: + pull_request19637
2020-05-25 07:55:25 | miss-islington | set | nosy: + miss-islington; pull_requests: + pull_request19636
2020-05-25 07:55:13 | serhiy.storchaka | set | messages: + msg369859
2019-10-26 09:50:50 | ZackerySpytz | set | nosy: + ZackerySpytz; messages: + msg355410
2019-10-26 08:39:22 | mark.dickinson | set | nosy: + mark.dickinson; messages: + msg355407
2019-10-26 05:48:41 | ZackerySpytz | set | keywords: + patch; stage: patch review; pull_requests: + pull_request16457
2019-01-11 07:28:15 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg333441
2019-01-11 01:03:27 | steven.daprano | set | type: behavior; messages: + msg333430; nosy: + steven.daprano