bpo-39320: Fix changes in the evaluation logic of unpacking operations. by brandtbucher · Pull Request #18264 · python/cpython

@brandtbucher

Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking operations. Previously, all elements were evaluated prior to being collected in a container. Now, these operations are interleaved.

For example, the code [*foo, print("XXX")] used to print "XXX" before calling foo.__iter__. This is no longer the case.

I've included several similar regression tests in this PR. @markshannon, your intuition about keeping the opargs for the DICT_UPDATE/DICT_MERGE ops was good.

https://bugs.python.org/issue39320

@brandtbucher brandtbucher changed the title bpo-39320: Fix regressions in the evaluation logic of unpacking operations. bpo-39320: Fix changes in the evaluation logic of unpacking operations.

Jan 30, 2020

@codecov

@markshannon

According to https://docs.python.org/3/reference/expressions.html#evaluation-order expressions within a list are evaluated left to right. So, [*None, print("XXX")] should not print "XXX" as *None should be evaluated first and raise a TypeError.
Also compare with [+None, print("XXX")], [None[:], print("XXX")] and [None(), print("XXX")]

In the expression [a, b] a should be fully evaluated before b.

The same rationale applies to ** unpacking.

@brandtbucher

I don't think I agree.

According to https://docs.python.org/3/reference/expressions.html#evaluation-order expressions within a list are evaluated left to right.

In the expression [a, b] a should be fully evaluated before b.

Absolutely, 100%. And in [*a, *b], the same holds (a before b). *a is not an expression, though!

"Unary *" isn't an operator; it doesn't appear on the operator precedence table at the link you provided, and you can't evaluate *x. Like the brackets and the comma, it's part of the syntax of the outer display expression, not the inner one. It specifies how the list should be built, so it should be evaluated last, as part of the list construction (as it always has been).

I think one of the examples in the link you provided is the most clarifying here:

In the following lines, expressions will be evaluated in the arithmetic order of their suffixes:

...
expr1(expr2, expr3, *expr4, **expr5)
...

Note that the stars are not part of expressions 1-5, but are a part of the top-level call expression that operates on them all.

@markshannon

@brandtbucher

Why do you say that *expr is not an expression? The grammar calls it a "star_expr"...

Hm, I've always read that as "starred expression" (an expression that has been "starred"), as in:

>>> *expr
  File "<stdin>", line 1
SyntaxError: can't use starred expression here

Regardless, this is a breaking semantic change, and one that is so subtle it's going to be a huge pain to debug for the few people it hits. Imagine you had the following working code in 3.8:

>>> def rotate(stuff):
...     i = iter(stuff)
...     return [*i, next(i)]
... 
>>> rotate(2*i for i in range(5))
[2, 4, 6, 8, 0]

Sure it's ugly, but it's worked the same since PEP 448 was first landed. Then you upgrade to 3.9, and some time later this happens:

>>> rotate(2*i for i in range(5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in rotate
StopIteration

Good luck figuring that one out.

@brandtbucher

I didn't realize this would be such a controversial fix though. If you still disagree @markshannon, I can open a thread on python-dev explaining my reasoning, and you can reply with what you feel the behavior should be, why, why it's worth changing, etc...?

@markshannon

In your example, I would be very surprised to find that next(i) extracts the first value of i, even though it follows *i.

I think a discussion on Python-dev is the way forward.

@markshannon

Why is "expression that has been starred" not an an expression? -expr is an "expression that has been negated" and is still an expression.

@brandtbucher

Sounds good, I'll open a thread there later today.

@terryjreedy

Brandt's pydev thread, 2020/2/6, is "Clarification of unpacking semantics." I read his post and the discussion above and agree that the semantic change is a mistake. My long response is on pydev.

Why is "expression that has been starred" not an an expression?
Because it is a 'starred_item', which may be either an expression or a starred 'or_expr'.
https://docs.python.org/3/reference/expressions.html#expression-lists
*a and *b in this discussion are the latter. Unlike expressions, they cannot be evaluated to a value (or specifically, a python object).