Issue 9011: ast_for_factor unary minus optimization changes AST
Created on 2010-06-16 15:38 by alexhsamuel, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (24)
msg107928 - (view)
Author: Alex Samuel (alexhsamuel)
Date: 2010-06-16 15:38
Date: 2010-06-17 09:21
Date: 2010-06-17 11:15
Date: 2010-06-17 11:36
Date: 2010-06-17 12:40
Date: 2010-06-17 13:31
Date: 2010-06-20 14:29
Date: 2010-06-30 10:21
Date: 2010-06-30 10:35
Date: 2010-06-30 10:56
Date: 2010-06-30 11:18
Date: 2010-10-18 22:32
Date: 2010-12-17 03:42
Date: 2010-12-17 15:03
Date: 2010-12-17 15:16
Date: 2012-10-18 12:04
Date: 2012-10-18 12:18
Date: 2012-10-18 12:22
Date: 2012-10-18 12:31
Date: 2012-11-25 16:00
Date: 2012-11-25 17:11
The unary negative optimization in ast_for_factor() seems to modify the ST in a way changes its meaning.
In Python 3.1.2, the ST is no longer compilable:
$ cat exprbug.py
import parser
st = parser.expr("-3")
print(st.totuple())
compiled = st.compile()
print(eval(compiled))
print(st.totuple())
print(eval(st.compile()))
$ ~/sw/Python-3.1.2/bin/python3 exprbug.py
(258, (326, (301, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (317, (15, '-'), (317, (318, (319, (2, '3')))))))))))))))))), (4, ''), (0, ''))
-3
(258, (326, (301, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (317, (15, '-'), (317, (318, (319, (2, '-3')))))))))))))))))), (4, ''), (0, ''))
Traceback (most recent call last):
File "exprbug.py", line 8, in <module>
print(eval(st.compile()))
ValueError: could not convert string to float: --3
In earlier versions of Python (I have confirmed 2.5 and 2.6), it is compiled to incorrect code and produces wrong results when evaluated:
$ ~/sw/Python-2.6.2/bin/python exprbug.py
(258, (327, (304, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (15, '-'), (316, (317, (318, (2, '3'))))))))))))))))), (4, ''), (0, ''))
-3
(258, (327, (304, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (15, '-'), (316, (317, (318, (2, '-3'))))))))))))))))), (4, ''), (0, ''))
-1.0
If I remove the big if statement from the front of ast_to_factor(), the code behaves correctly. I think this is because STR(pnum) is changed in place and never restored to its previous value.
msg107998 - (view)
Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-17 09:21
Confirmed in trunk and py3k, both of which raise ValueError on the second compile:
Python 2.7rc1+ (trunk:82042, Jun 17 2010, 09:52:12)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import parser
>>> st = parser.expr("-3")
>>> st.compile()
<code object <module> at 0x450448, file "<syntax-tree>", line 1>
>>> st.compile()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: --3
This looks nasty; unless there's some rule that says that you're not supposed to hang on to AST objects after compiling them. (Is there?)
Benjamin, would it be worth just getting rid of this optimization for 2.7, and trying to reinstate a correct version for 2.7.1?
[Removing 2.5 from the versions list since there's nothing we can do about it there (2.5 is only getting security fixes at this point).]
See also issue 1441486.
msg108001 - (view)
Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-17 11:15
N.B. That if block isn't pure optimization: removing it gives a minor change in behaviour: Currently (Python 2.7, on a 64-bit machine): >>> -9223372036854775808 -9223372036854775808 And with the optimization removed: >>> -9223372036854775808 -9223372036854775808L I actually consider the second behaviour more correct than the first, since it follows clearly from the language rules (numeric literals have no sign, so the above *should* be interpreted as the unary minus operator applied to a literal, and that literal really is a PyLong). But obviously the contributors to issue 1441486 either disagree, or didn't want to introduce a regression from 2.4. I still consider that removing that if block is the right thing to do for 2.7. The change in behaviour really shouldn't affect any reasonable code---anywhere that an int is acceptable, a long should be too. Neil, any comments?msg108002 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-17 11:36
Results from Jython: newton:jython2.5.1 dickinsm$ ./jython Jython 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54) [Java HotSpot(TM) 64-Bit Server VM (Apple Inc.)] on java1.6.0_20 Type "help", "copyright", "credits" or "license" for more information. >>> -2147483648 -2147483648L On the other hand, IronPython appears to detect this special case and produces an int instead of a long.msg108007 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-17 12:40
Fixed in r82043 (py3k) and r82044 (release31-maint), simply by removing the relevant 'if' block. For 2.x, Antoine (on IRC) pointed out that there might well be third party code that depends on -0x80000000 being an int rather than a long (32-bit machine), so changing that behaviour could cause breakage. Can anyone propose a fix that doesn't change behaviour?msg108011 - (view) Author: Alex Samuel (alexhsamuel) Date: 2010-06-17 13:04
How about saving the original value of STR(pnum) and restoring it after calling ast_for_atom()? This is not thread-safe, but I don't understand Python's threading model well enough to know whether the GIL is held in this function. On 6/17/2010 8:40 AM, Mark Dickinson wrote: > > Mark Dickinson<dickinsm@gmail.com> added the comment: > > Fixed in r82043 (py3k) and r82044 (release31-maint), simply by removing the relevant 'if' block. > > For 2.x, Antoine (on IRC) pointed out that there might well be third party code that depends on -0x80000000 being an int rather than a long (32-bit machine), so changing that behaviour could cause breakage. > > Can anyone propose a fix that doesn't change behaviour? > > ---------- > > _______________________________________ > Python tracker<report@bugs.python.org> > <http://bugs.python.org/issue9011> > _______________________________________msg108014 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-17 13:31
That sounds like a reasonable quick fix. Here's a patch.msg108237 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-20 14:29
Summary of brief discussion on #python-dev: - This is a hacky patch; it would be better to rewrite that portion of ast.c in a way that doesn't modify the original CST at all. With 2.7 so close, I daren't try anything more invasive than this patch, though. (I don't know whether there are any other portions of this file that deliberately change the CST.) - Given that no-one noticed this problem in 5 years, the fix can probably wait until 2.7.1, when it can be done properly.msg108969 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-30 10:21
It turns out that removing this code from py3k wasn't a no-op. It changed the semantics for complex literals. In Python 3.1.2: >>> -7j -7j >>> (-7j).real 0.0 But in current release31-maint branch, and in 3.2a0: Python 3.2a0 (py3k:82347M, Jun 28 2010, 22:25:11) [GCC 4.2.1 (Apple Inc. build 5659)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> -7j (-0-7j) >>> (-7j).real -0.0 So IMO r82044 (the release31-maint branch) should be reverted, to avoid nasty surprises when people upgrade from 3.1.2 to 3.1.3. But I'd like to keep this change in py3k: I consider that the 3.1 behaviour is buggy, and that the 3.2 behaviour is more correct (strange though it may seem at first sight), so I'd prefer to leave the current py3k behaviour as is, and add a Misc/NEWS entry describing the change if necessary. Assigning this to me because I don't want it to get forgotten. But if anyone else is interested in producing a patch for the cleanup of the CST->AST code, please do!msg108970 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-30 10:35
r82044 reverted in r82389.msg108971 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-30 10:56
Added tests for the current Python < 3.2 treatment of imaginary literals in r82390 (release31-maint). These tests should be backported to trunk, but I'll wait until after the release of 2.7 for that.msg108973 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-06-30 11:18
Added tests for Python 3.2's corrected treatment of negated imaginary literals in r82391, along with a Misc/NEWS entry indicating the changed behaviour. Restored version numbers for this issue, which I seem to have accidentally deleted earlier. (Including 3.1, since it could also benefit from the same cleanup.)msg119089 - (view) Author: Neil Schemenauer (nascheme) *
Date: 2010-10-18 22:32
I'm late to the party but looks like Mark has a good handle on the issues. I would recommend not changing behavior in a bugfix release. I'm pretty sure code "in the wild" would be broken.msg124180 - (view) Author: R. David Murray (r.david.murray) *
Date: 2010-12-17 03:42
Mark, are you still planning to do something for 3.1/2.7, or should this be closed?msg124206 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2010-12-17 15:03
Yes, sorry; I'm not likely to find time to do anything with this. Unassigning, and downgrading priority. Is it worth leaving this open in case anyone wants to do something about it?msg124209 - (view) Author: R. David Murray (r.david.murray) *
Date: 2010-12-17 15:16
Given the long projected lifetime of 2.7, I suppose it is.msg173258 - (view) Author: Antonio Cuni (antocuni) * Date: 2012-10-18 12:02
there is still an inconsistency in handling negative imaginary literals:
>>> -1j.real
-0.0
>>> complex('-1j').real
0.0
msg173259 - (view)
Author: Mark Dickinson (mark.dickinson) *
Date: 2012-10-18 12:04
No, that's expected behaviour. 1j is complex(0.0, 1.0) -1j is complex(-0.0, -1.0) so -1j.real is -0.0. There's not really any other sensible way to handle this. The complex-from-string constructor, on the other hand, is more careful about interpreting signs.msg173260 - (view) Author: Antonio Cuni (antocuni) * Date: 2012-10-18 12:13
I would say that the complex-from-string constructor should be fixed to handle this special case "correctly". I find very confusing that we get a different result whether we use a string literal or not. For example, in pypy we use the same code for parsing literals and converting strings, so you get -0.0 in both cases.msg173261 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2012-10-18 12:18
With the string, the minus sign applies only to the imaginary part; with the expression '-1j', it applies to the whole complex number (both real and imaginary parts). I don't see any sensible way to 'fix' the string to complex conversion; indeed, I think any change would make it worse than before. It's a known issue with complex arithmetic that x + 1j*y doesn't give you complex(x, y); the conversions from string and the complex(x, y) form are there to make it possible to carefully create a complex number with known real and imaginary parts. > For example, in pypy we use the same code for parsing literals and > converting strings, so you get -0.0 in both cases. But -1j isn't a literal. It's unary minus applied to a the complex number given by the literal '1j'. Python's code *does* give the same results both for converting strings and parsing literals.msg173262 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *
Date: 2012-10-18 12:22
Indeed, -1j is not a literal:
>>> dis.dis(lambda :-1j.real)
1 0 LOAD_CONST 0 (1j)
3 LOAD_ATTR 0 (real)
6 UNARY_NEGATIVE
7 RETURN_VALUE
msg173263 - (view)
Author: Mark Dickinson (mark.dickinson) *
Date: 2012-10-18 12:31
And -1j.real is -(1j.real), of course, not (-1j).real. My bad.msg176368 - (view) Author: Mark Dickinson (mark.dickinson) *
Date: 2012-11-25 16:00
New patch.msg176375 - (view) Author: Roundup Robot (python-dev)
Date: 2012-11-25 17:11
New changeset ea36de7926f8 by Mark Dickinson in branch '2.7': Issue #9011: AST creation no longer modifies CST for negated numeric literals. http://hg.python.org/cpython/rev/ea36de7926f8
History
Date
User
Action
Args
2022-04-11 14:57:02adminsetgithub: 53257
2012-11-25 17:12:08mark.dickinsonsetstatus: open -> closed
resolution: fixed 2012-11-25 17:11:43python-devsetnosy: + python-dev
messages: + msg176375
2012-11-25 16:00:42mark.dickinsonsetfiles: + issue9011_v2.patch
assignee: mark.dickinson
messages: + msg176368
2012-11-25 15:18:18mark.dickinsonsetversions: - Python 2.6, Python 3.1 2012-10-18 12:31:49mark.dickinsonsetmessages: + msg173263 2012-10-18 12:22:20amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg173262
2012-10-18 12:18:24mark.dickinsonsetmessages: + msg173261 2012-10-18 12:13:35antocunisetmessages: + msg173260 2012-10-18 12:04:01mark.dickinsonsetmessages: + msg173259 2012-10-18 12:02:18antocunisetnosy: + antocuni
messages: + msg173258
2010-12-17 15:16:49r.david.murraysetnosy: nascheme, mark.dickinson, benjamin.peterson, alexhsamuel, r.david.murray, meador.inge
messages: + msg124209 2010-12-17 15:03:14mark.dickinsonsetpriority: high -> normal
versions: + Python 2.6, Python 3.1, Python 2.7 2010-06-30 10:56:53mark.dickinsonsetmessages: + msg108971 2010-06-30 10:35:11mark.dickinsonsetmessages: + msg108970 2010-06-30 10:21:42mark.dickinsonsetassignee: mark.dickinson
messages: + msg108969
versions: - Python 2.6, Python 2.7 2010-06-21 12:42:56meador.ingesetnosy: + meador.inge
2010-06-20 14:29:24mark.dickinsonsetmessages: + msg108237 2010-06-17 13:31:42mark.dickinsonsetfiles: + issue9011.patch
keywords: + patch
messages: + msg108014
nosy: + nascheme
messages: + msg108001 2010-06-17 09:29:21mark.dickinsonsetversions: - Python 2.5 2010-06-17 09:21:03mark.dickinsonsetmessages: + msg107998
versions: + Python 2.7, Python 3.2 2010-06-17 08:49:09mark.dickinsonsetnosy: + mark.dickinson
2010-06-17 01:03:41r.david.murraysetnosy: + benjamin.peterson
2010-06-16 15:38:58alexhsamuelcreate
resolution: fixed 2012-11-25 17:11:43python-devsetnosy: + python-dev
messages: + msg176375
2012-11-25 16:00:42mark.dickinsonsetfiles: + issue9011_v2.patch
assignee: mark.dickinson
messages: + msg176368
2012-11-25 15:18:18mark.dickinsonsetversions: - Python 2.6, Python 3.1 2012-10-18 12:31:49mark.dickinsonsetmessages: + msg173263 2012-10-18 12:22:20amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg173262
2012-10-18 12:18:24mark.dickinsonsetmessages: + msg173261 2012-10-18 12:13:35antocunisetmessages: + msg173260 2012-10-18 12:04:01mark.dickinsonsetmessages: + msg173259 2012-10-18 12:02:18antocunisetnosy: + antocuni
messages: + msg173258
2010-12-17 15:16:49r.david.murraysetnosy: nascheme, mark.dickinson, benjamin.peterson, alexhsamuel, r.david.murray, meador.inge
messages: + msg124209 2010-12-17 15:03:14mark.dickinsonsetpriority: high -> normal
messages:
+ msg124206
assignee: mark.dickinson -> (no value)
nosy:
nascheme, mark.dickinson, benjamin.peterson, alexhsamuel, r.david.murray, meador.inge
messages:
+ msg124180
stage: commit review -> needs patch
versions: + Python 2.6, Python 3.1, Python 2.7 2010-06-30 10:56:53mark.dickinsonsetmessages: + msg108971 2010-06-30 10:35:11mark.dickinsonsetmessages: + msg108970 2010-06-30 10:21:42mark.dickinsonsetassignee: mark.dickinson
messages: + msg108969
versions: - Python 2.6, Python 2.7 2010-06-21 12:42:56meador.ingesetnosy: + meador.inge
2010-06-20 14:29:24mark.dickinsonsetmessages: + msg108237 2010-06-17 13:31:42mark.dickinsonsetfiles: + issue9011.patch
keywords: + patch
messages: + msg108014
stage: commit review
2010-06-17 13:04:13alexhsamuelsetmessages: + msg108011 2010-06-17 12:40:27mark.dickinsonsetversions: - Python 3.1, Python 3.2 2010-06-17 12:40:18mark.dickinsonsetmessages: + msg108007 2010-06-17 11:36:09mark.dickinsonsetmessages: + msg108002 2010-06-17 11:15:59mark.dickinsonsetpriority: normal -> highnosy: + nascheme
messages: + msg108001 2010-06-17 09:29:21mark.dickinsonsetversions: - Python 2.5 2010-06-17 09:21:03mark.dickinsonsetmessages: + msg107998
versions: + Python 2.7, Python 3.2 2010-06-17 08:49:09mark.dickinsonsetnosy: + mark.dickinson
2010-06-17 01:03:41r.david.murraysetnosy: + benjamin.peterson
2010-06-16 15:38:58alexhsamuelcreate