bpo-30162: Add _PyTuple_Empty and make PyTuple_New(0) never failing. by serhiy-storchaka · Pull Request #1283

bpo-30162: Add _PyTuple_Empty and make PyTuple_New(0) never failing. by serhiy-storchaka · Pull Request #1283 · python/cpython

The empty tuple is now allocated in static memory rather than dynamic memory.


		Py_DECREF(args);
		return retval;
		if (!PyArg_ParseTupleAndKeywords(_PyTuple_Empty, state, "\|$OOOOO", kwlist,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sprinkling the code base with references to _PyTuple_Empty seems like a regression to me. I'd rather you keep the PyTuple_New(0) calls.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main purpose of this patch. If use the PyTuple_New(0) calls you need to save the result of PyTuple_New(0), save the result of the function that uses an empty tuple, decref the saved result of PyTuple_New(0). Two additional variables (or one variable and duplicated Py_DECREF), several additional lines. _PyTuple_Empty is added for convenience, it is convenient as Py_None or Py_True.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort of like the _PyTuple_Empty ... if I'm trying to understand the function, knowing that something is empty saves a whole branch of analysis. If I happen to know that PyTuple_New(0) returns an empty tuple (as opposed to thinking it should, but knowing some assumption is wrong), then I can still get there, but ... not as quickly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good reason not to define _PyTuple_Empty as a public API similar to Py_None and Py_True?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favour of using _PyTuple_Empty instead of PyTupleNew(0) internally.
It makes it explicit that it cannot fail.

No need to add it to the API unless explicitly requested by a significant body of users.

		PyTupleObject obj;
		} _PyTuple_EmptyStruct = {
		.gc_head = { .gc = {
		.gc_refs = (((size_t)_PyGC_REFS_UNTRACKED) << _PyGC_REFS_SHIFT)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly this is ugly :-( Can't you just allocate _PyTuple_Empty in PyTuple_Init?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, this is ugly. But there is no PyTuple_Init, and even if add it how can we be sure that _PyTuple_Empty is not used before PyTuple_Init?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably add a comment above it saying so.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you at least try to factor out the gc_head initialization thing? For example by defining a macro? Leaking private GC management details in tupleobject.c does not look desirable at all.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to do something with this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be much more inclined to figure out where super early on in Py_Initialize we can allocate the empty PyTuple using the old PyTuple_New(0) code path. I don't like being inconsistent about where pointers to PyObjects come from. All tuples should be Py_DECREF'ed to 0 during interpreter shutdown but code attempting to free this one would fail in a bad way as it isn't on the heap.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpshead I don't see why all tuples need be decrefed to zero. None, True and False are statically allocated, why not ()?


		.. versionchanged:: 3.7

		``PyTuple_New(0)`` always succeeds and never returns NULL.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to make this a public API promise? IMHO it looks more like an implementation detail and may possibly change in future?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need, but I consider this as an enhancement. Yes, this promise adds limitations on future changes, but I afraid that if don't document this property explicitly, we can forgot about this and unintentionally change it. However changing this makes using of _PyTuple_Empty unsafe.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comments around PyTuple_New enough if we just don't want to forget about it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be enough. But with this guarantee other users can get a benefit from this. I asked on Python-Dev whether it is good to add such promise.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The promise is essential to its usefulness. Also, it is an easy thing for an implementation to guarantee.

		p2cread, p2cwrite, c2pread, c2pwrite,
		errread, errwrite, errpipe_read, errpipe_write,
		close_fds, restore_signals, call_setsid,
		py_fds_to_keep, preexec_fn, preexec_fn_args_tuple);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of where the change helped... I just went to some pains to verify that preexec_fn_args_tuple is really always empty (or a null pointer) ... if so, why bother giving it such a complicated name. (Particularly since I first misread it as "table" rather than "tuple". But it seems to be true, and specifying _empty makes it easier to understand.
I have not done the work of verifying that it is OK to drop the final parameter to child_exec, etc... but assuming that it is, I appreciate the further simplification.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it that name to be explicit about what it was for. I like the cleanup this enabled. Within subprocess memory allocation such as PyTuple_New(0) could have done (even though it wouldn't have realistically) is forbidden within the fork-exec path of the child process so it "allocated" in the parent and passed in.

		second_argument = _PyTuple_Empty;
		if (!PyArg_ParseTuple(args, "O\|O", &operation, &second_argument)) {
		goto error;
		}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is going on here? If multiple, then there is guaranteed to be something stuck in the pointer at second_argument, but otherwise ... there might be anyway? If that is really the API, I would appreciate some comments to that effect. If it isn't, then I would appreciate some comment explaning why the PyArg_ParseTuple(args, "O|O", &operation, &second_argument) isn't just PyArg_ParseTuple(args, "O", &operation).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_pysqlite_query_execute() is an implementation of two different but similar methods: execute() (multiple=0) and executemany() (multiple=1). The second argument of executemany() is an iterable, some operation is executed for every item from that iterable. The second argument of execute() is optional, some operation is executed for it if it is specified or for empty tuple if it is missed.

I think the code of _pysqlite_query_execute() can be simplified even more, but this is a different issue, not directly related to empty tuples.

		}
		else {
		coeff = PyTuple_New(0);
		if (coeff == NULL) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must say I'm not a fan of this. Can you use a new static inline function in a header file PyTuple_EmptyNew() that cannot fail? And get rid of the assert()?

I don't like checking the return value based on whether it is PyTuple_New(0) or PyTuple_New(1).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think it is a much safer consistent idiom to document PyTuple_New as always needing its return value checked.

Have a separate new API call to get an empty tuple reference that guarantees the return value never needs to be checked. PyTuple_EmptyNew() as suggested perhaps. The difference between using an API call and directly referencing _PyTuple_Empty is the Py_INCREF being done for the caller.

Done this way, the singleton empty tuple can still be created on the heap by PyTuple_New(0) during interpreter initialization rather than being static.

		}
		else {
		coeff = PyTuple_New(0);
		if (coeff == NULL) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think it is a much safer consistent idiom to document PyTuple_New as always needing its return value checked.

Done this way, the singleton empty tuple can still be created on the heap by PyTuple_New(0) during interpreter initialization rather than being static.


		Py_DECREF(args);
		return retval;
		if (!PyArg_ParseTupleAndKeywords(_PyTuple_Empty, state, "\|$OOOOO", kwlist,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good reason not to define _PyTuple_Empty as a public API similar to Py_None and Py_True?

		p2cread, p2cwrite, c2pread, c2pwrite,
		errread, errwrite, errpipe_read, errpipe_write,
		close_fds, restore_signals, call_setsid,
		py_fds_to_keep, preexec_fn, preexec_fn_args_tuple);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

		PyTupleObject obj;
		} _PyTuple_EmptyStruct = {
		.gc_head = { .gc = {
		.gc_refs = (((size_t)_PyGC_REFS_UNTRACKED) << _PyGC_REFS_SHIFT)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

I would like to see this be a public API. PyTuple_Empty() is useful, easy to understand, and reads well in various code contexts.

I recommend this be closed.

Several commenters found things to not like about it.
It doesn't fix any bugs
It makes the C API slightly more complicated to learn and remember (right now, we have one way to do it).