Issue1394
Created on 2007-11-05 22:09 by _doublep, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| unreachable-code.diff | _doublep, 2007-11-05 22:09 | |||
| test_peepholer.diff | belopolsky, 2008-02-25 23:37 | |||
| unreachable-code-1.diff | belopolsky, 2008-02-26 01:44 | diff against revision 61073 | ||
| unreachable-code-with-tests.diff | belopolsky, 2008-02-26 13:10 | diff against revision 61073 | ||
| Messages (19) | |||
|---|---|---|---|
| msg57141 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2007-11-05 22:09 | |
This patch improves bytecode output, by removing unreachable code. It doesn't target special cases, as now, but provides a generic implementation. |
|||
| msg62953 - (view) | Author: Alexander Belopolsky (belopolsky) * ![]() |
Date: 2008-02-25 00:04 | |
I am not sure what this patch would accomplish. I tried
$ cat t.py
def f():
return 1
1+2
from dis import dis
print dis(f)
Both with and without patch I get
$ ./python.exe -O t.py
2 0 LOAD_CONST 1 (1)
3 RETURN_VALUE
3 4 LOAD_CONST 1 (1)
7 LOAD_CONST 2 (2)
10 BINARY_ADD
11 POP_TOP
None
I am sure I am missing something, but it is hard to tell what without
any use cases provided.
|
|||
| msg62964 - (view) | Author: Neal Norwitz (nnorwitz) * ![]() |
Date: 2008-02-25 02:30 | |
It would be great to see test cases with this change. That would help answer Alexander's question too. |
|||
| msg62992 - (view) | Author: Alexander Belopolsky (belopolsky) * ![]() |
Date: 2008-02-25 19:09 | |
I figured that out:
>>> def g():
... raise RuntimeError
...
Before the patch:
>>> dis(g)
2 0 LOAD_GLOBAL 0 (RuntimeError)
3 RAISE_VARARGS 1
6 LOAD_CONST 0 (None)
9 RETURN_VALUE
After the patch:
>>> dis(g)
2 0 LOAD_GLOBAL 0 (RuntimeError)
3 RAISE_VARARGS 1
Looks reasonable to me. Paul, do you need help with unit tests?
|
|||
| msg62999 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2008-02-25 20:25 | |
Yes, help with unit tests would be appreciated. Especially since it is
not supposed to fix anything, so I'm not sure what unit tests should be
like...
BTW, trying to understand why the patch didn't remove unreachable code
in your first example, I noticed this (both cases are with patch):
def f1():
return 1
x()
disassembly:
3 0 LOAD_CONST 1 (1)
3 RETURN_VALUE
4 4 LOAD_GLOBAL 0 (x)
7 CALL_FUNCTION 0
10 POP_TOP
def f2():
return 1
x()
return 2
disassembly:
3 0 LOAD_CONST 1 (1)
3 RETURN_VALUE
Looking a bit further, I noticed this:
if (codestr[codelen-1] != RETURN_VALUE)
goto exitUnchanged;
So, the patch really can't help with your first example, because peephol
optimizer just bails out before real action begins.
|
|||
| msg63000 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2008-02-25 20:31 | |
Speaking of which, I propose that codestr[] array is made one byte
longer and RETURN_VALUE opcode wrote in that extra byte. It will be
removed by this patch anyway (if it is accepted), but we'll remove a way
to accidentally disable peephole optimizer.
I mean, it would cost almost nothing, yet will prevent making some
functions non-optimized. E.g. like this:
def foo(x):
if x >= 0:
return x * 2
raise ValueError
|
|||
| msg63004 - (view) | Author: Benjamin Peterson (benjamin.peterson) * ![]() |
Date: 2008-02-25 21:32 | |
>Yes, help with unit tests would be appreciated. Especially since it is >not supposed to fix anything, so I'm not sure what unit tests should be >like... Unit tests are just for bugfixes. They let us make sure Python is doing what we want it to do in a given case. Your unit test will probably have functions where you optimization should take effect and assert that it does. For starters, take a look at Lib/test/test_peephole.py |
|||
| msg63005 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2008-02-25 21:50 | |
On Mon, Feb 25, 2008 at 1:32 PM, Benjamin Peterson <report@bugs.python.org> wrote: > Unit tests are just for bugfixes. I presume you meant "are *not* just for bugfixes". |
|||
| msg63006 - (view) | Author: Benjamin Peterson (benjamin.peterson) * ![]() |
Date: 2008-02-25 22:05 | |
> I presume you meant "are *not* just for bugfixes". (Sigh) Yes, of course. |
|||
| msg63017 - (view) | Author: Alexander Belopolsky (belopolsky) * ![]() |
Date: 2008-02-25 23:37 | |
Attached patch adds test_elim_unreachable() unit test. Last two
assertions should fail with unpatched python.
I am still trying to convince myself that the transformation are
correct.
> I propose that codestr[] array is made one byte
> longer and RETURN_VALUE opcode wrote in that extra byte.
I don't think that will be correct. Did you consider the following
comment?
/* Verify that RETURN_VALUE terminates the codestring. This
allows
the various transformation patterns to look ahead several
instructions without additional checks to make sure they are
not
looking beyond the end of the code string.
*/
if (codestr[codelen-1] != RETURN_VALUE)
goto exitUnchanged;
|
|||
| msg63019 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2008-02-26 00:00 | |
Thanks for writing the test. Yes, I did read the comment. As I understood it, RETURN_VALUE is needed only so that various optimization can assume codestr[] cannot suddenly end without one. E.g. if you match for BINARY_ADD, you can safely check the next command: if BINARY_ADD matched, there is a _guaranteed_ next command, not an out-of-array failure. Such proposed fake RETURN_VALUE _must_ be unreachable, so it must not be problematic at all. If it was reachable, real codestr[] would end in reachable non-return command, which must not happen during compilation. I dunno, maybe interpreter guards against execution point falling of the bytecode string, but in any case this must not happen in non-corrupted files generated by the bytecode compiler. |
|||
| msg63020 - (view) | Author: Alexander Belopolsky (belopolsky) * ![]() |
Date: 2008-02-26 01:44 | |
Paul, You are right. I misunderstood that comment myself. I've tried the attached modification to your patch (unreachable-code-1.diff) and it passes all tests while fixing msg62953 problem: $ cat t.py def f(): return 1 1+2 from dis import dis dis(f) $ ./python.exe t.py 6 0 LOAD_CONST 0 (None) 3 RETURN_VALUE |
|||
| msg63029 - (view) | Author: Neal Norwitz (nnorwitz) * ![]() |
Date: 2008-02-26 05:58 | |
Can you add more tests? It seems that almost all the examples given in
this thread are good examples. Also something like:
while 1:
if cond: return 1
return 2
return 3
There are a bunch more cases that could be added for where the code
should be optimized or might be optimized wrong that.
The #if 0 code should be removed in the patch. Also, since
CONTINUE_LOOP is removed in this patch, it would be good to explicitly
add a test for that.
This patch looks good to me, but I'll let Raymond decide. Did all the
entire test suite run? Note, you must remove all the .pyc files to test
this patch or change the MAGIC number in Python/import.c.
|
|||
| msg63040 - (view) | Author: Alexander Belopolsky (belopolsky) * ![]() |
Date: 2008-02-26 13:10 | |
I have added more tests as Neal suggested. See unreachable-code-with- tests.diff for a combined patch. I've rerun all tests after removing .pyc files and also with -O option (ensuring no existing .pyo files as well). All works. I've removed #if 0 block and changed "Verify" to "Ensure" in the RETURN_VALUE comment. Paul, maybe you and add an explanation why the added RETURN_VALUE will always be eliminated. I am not sure when python produces code with no terminating RETURN_VALUE in the first place. |
|||
| msg63043 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2008-02-26 14:29 | |
Well, I cannot guarantee it will _always_ be eliminated, since I too don't really know when generated bytecode can (currently) end in non-return. However, if before it ended in non-return, that non-return must have been unreachable, obviously (else code flow would fall off of the valid bytecode string). So, there is pretty good chance that this patch will remove both the unreachable code and the fake return byte. In the worst case, we increase bytecode length by 1 byte, but we always enable peephole optimizer, which can be expected to give a much more significant gain. As your unit tests suggest, the patch doesn't always remove unreachable code, because some opcodes after which this can be done are already handled differently. Also, there might in theory be some false block boundaries (e.g. a jump from unreachable code to another piece of such code). BTW, I think that just removing #if 0'd block is not correct. I propose to convert it to a comment then, just to explain that there are more opcodes after which we can expect unreachable code and don't check only because it'd give duplicate case labels. Maybe some goto magic can handle them, don't have sources here. |
|||
| msg63045 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2008-02-26 14:38 | |
Actually, it is even better. Since you don't increment codelen, that extra RETURN_VALUE is "virtual" in that it acts as a sentinel, but will not appear in the resulting bytecode. You can test this by backing out the patch and disassembling something. For this reason, if() before writing the RETURN_VALUE is not needed. |
|||
| msg63046 - (view) | Author: Alexander Belopolsky (belopolsky) * ![]() |
Date: 2008-02-26 15:25 | |
> Since you don't increment codelen .. Good catch! > For this reason, if() before > writing the RETURN_VALUE is not needed. In this case it will be clearer to use STOP_CODE instead of RETURN_VALUE as a sentinel. |
|||
| msg68709 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2008-06-24 23:12 | |
Recommend rejecting. Too much work for zero performance payoff. |
|||
| msg68710 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2008-06-24 23:20 | |
Rejecting. Too much risk as well, based on experience in the past with supposedly harmless optimizations. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:27 | admin | set | github: 45735 |
| 2008-06-24 23:20:57 | gvanrossum | set | status: open -> closed resolution: rejected messages: + msg68710 |
| 2008-06-24 23:12:06 | rhettinger | set | assignee: rhettinger -> messages: + msg68709 |
| 2008-02-26 15:25:19 | belopolsky | set | messages: + msg63046 |
| 2008-02-26 14:38:20 | _doublep | set | messages: + msg63045 |
| 2008-02-26 14:29:52 | _doublep | set | messages: + msg63043 |
| 2008-02-26 13:10:04 | belopolsky | set | files:
+ unreachable-code-with-tests.diff messages: + msg63040 |
| 2008-02-26 05:58:47 | nnorwitz | set | messages: + msg63029 |
| 2008-02-26 01:44:37 | belopolsky | set | files:
+ unreachable-code-1.diff messages: + msg63020 |
| 2008-02-26 00:00:07 | _doublep | set | messages: + msg63019 |
| 2008-02-25 23:37:28 | belopolsky | set | files:
+ test_peepholer.diff messages: + msg63017 |
| 2008-02-25 22:56:42 | rhettinger | set | assignee: rhettinger nosy: + rhettinger |
| 2008-02-25 22:05:48 | benjamin.peterson | set | messages: + msg63006 |
| 2008-02-25 21:50:15 | gvanrossum | set | messages: + msg63005 |
| 2008-02-25 21:32:07 | benjamin.peterson | set | nosy:
+ benjamin.peterson messages: + msg63004 |
| 2008-02-25 20:31:16 | _doublep | set | messages: + msg63000 |
| 2008-02-25 20:25:49 | _doublep | set | messages: + msg62999 |
| 2008-02-25 19:09:33 | belopolsky | set | messages: + msg62992 |
| 2008-02-25 02:30:30 | nnorwitz | set | nosy:
+ nnorwitz messages: + msg62964 |
| 2008-02-25 00:04:25 | belopolsky | set | nosy:
+ belopolsky messages: + msg62953 |
| 2007-11-06 17:40:44 | gvanrossum | set | nosy: + gvanrossum |
| 2007-11-06 04:50:44 | loewis | set | keywords: + patch |
| 2007-11-05 22:09:01 | _doublep | create | |
