Victor/vstinner: Isn't PR 12032 reintroducing the issue fixed in #29234? _PyStack_AsTuple was explicitly marked as _Py_NO_INLINE because inlining it was causing excessive stack consumption in its callers (which were the bytecode interpreter loop). The new _PyTuple_FromArray isn't marked as _Py_NO_INLINE, so the swap reintroduces the problem.
Seems like either:
1. _PyTuple_FromArray should also be marked _Py_NO_INLINE
or
2. _PyStack_AsTuple should continue to exist as the non-inlined version of _PyTuple_FromArray
It's possible this isn't as much of an issue, because _PyTuple_FromArray is in tupleobject.c (where it's only called in one place), while _PyStack_AsTuple was in call.c and was called from within call.c in four places. But that only holds if link-time optimization isn't involved, and in practice most public distributions of CPython are built with link-time optimization now, correct? If LTO is enabled, the compiler can inline across source files, so the same stack bloat issues are possible.
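
To illustrate option 2, here is a minimal standalone sketch of the pattern, not CPython's actual code. All names (DEMO_NO_INLINE, demo_sum_copy, demo_hot_caller) are hypothetical, and the macro is my assumption of how a no-inline marker is typically spelled for GCC/Clang and MSVC. The callee owns a small scratch buffer, and keeping it out of line means that buffer should not be folded into the frame of a deeply recursive caller:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a no-inline marker such as _Py_NO_INLINE.
   Assumption: these are the usual compiler spellings. */
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_NO_INLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define DEMO_NO_INLINE __declspec(noinline)
#else
#  define DEMO_NO_INLINE
#endif

#define DEMO_SMALL 5

/* Non-inlined helper: the scratch buffer exists only while this
   function's own frame is live. */
static DEMO_NO_INLINE long
demo_sum_copy(const long *src, size_t n)
{
    long scratch[DEMO_SMALL];
    long total = 0;
    size_t i;

    assert(n <= DEMO_SMALL);
    memcpy(scratch, src, n * sizeof(long));
    for (i = 0; i < n; i++) {
        total += scratch[i];
    }
    return total;
}

/* Stand-in for a hot, recursive caller. If demo_sum_copy were inlined
   here, its scratch buffer could become part of every recursion
   level's frame. */
long
demo_hot_caller(const long *args, size_t n, int depth)
{
    if (depth == 0) {
        return demo_sum_copy(args, n);
    }
    return demo_hot_caller(args, n, depth - 1);
}
```

Because the marker is attached to the function definition, it should apply regardless of whether the inlining decision is made per translation unit or at link time, which is the point of options 1 and 2 above.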