gh-93429: Merge `LOAD_METHOD` back into `LOAD_ATTR` by Fidget-Spinner · Pull Request #93430

gh-93429: Merge `LOAD_METHOD` back into `LOAD_ATTR` by Fidget-Spinner · Pull Request #93430 · python/cpython

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

For better specialization of LOAD_METHOD.
Cons:

All LOAD_ATTR slows down slightly due to additional SET_TOP and STACK_GROW.
Base LOAD_ATTR (unspecialized) gets one more branch.
All LOAD_ATTR cache becomes much bigger. We need to mitigate this with faster-cpython/ideas#396
EXTENDED_ARGS for LOAD_ATTR/METHOD in large modules

Pros:

Theoretically, what was previously LOAD_METHOD should now have better specialization because we can just treat it like specializing a normal attribute.
We save three opcodes.
We de-duplicate lots of code shared between LOAD_METHOD and LOAD_ATTR.

Fixes #93429.

~~One problem is that i still need to separately track specialization fails of the old LOAD_METHOD separately. Since LOAD_ATTR specialization is attempted after LOAD_METHOD fails.~~

I tried to maintain the diff as small as possible. Overall this saves us two opcodes and reduces the total lines of code in CPython.

TODO:

Docs needs updating in another PR.
~~I need to run pyperformance.~~

Fidget-Spinner changed the title ~~Merge LOAD_METHOD back into LOAD_ATTR~~ gh-93429: Merge LOAD_METHOD back into LOAD_ATTR

Jun 2, 2022

To be honest I'm shocked that the tests are passing 😆 .

An example of LOAD_ATTR specialization in action for something LOAD_METHOD couldn't specialize for (note the LOAD_ATTR_INSTANCE_VALUE !):

class X: ...

x = X()
x.a = lambda a:a
def z():
 for _ in range(10):
  x.a(1)

z()
dis.dis(z, adaptive=True)
  1           0 RESUME_QUICK             0

  2           2 LOAD_GLOBAL_ADAPTIVE     1 (NULL + range)
             14 LOAD_CONST               1 (10)
             16 CALL_ADAPTIVE            1
             26 GET_ITER
             28 FOR_ITER                26 (to 82)
             30 STORE_FAST               0 (_)

  3          32 LOAD_GLOBAL_MODULE       2 (x)
             44 LOAD_ATTR_INSTANCE_VALUE     5 (NULL|self + a)
             66 LOAD_CONST               2 (1)
             68 CALL_PY_EXACT_ARGS       1
             78 POP_TOP
             80 JUMP_BACKWARD_QUICK     27 (to 28)

  2     >>   82 LOAD_CONST               0 (None)
             84 RETURN_VALUE

I think that LOAD_METHOD should be removed from opcode.py and made a virtual opcode in compiler.c, like JUMP and POP_BLOCK

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

LOAD_ATTR represents about 5% of instructions, statically so adding 6 code units for each LOAD_ATTR will increase the size of the bytecode by about 10%. This seems high and is likely to cause some slowdowns due to cache misses and longer jumps.

Applying the size reductions in faster-cpython/ideas#396 would reduce that 10% increase to 5%.
I would suggest that we implement faster-cpython/ideas#396 first, so that we get a better idea of the performance impact of this change.
With faster-cpython/ideas#396, I would expect this to be a net improvement.

It might be helpful to draw up a table describing the various possible attributes and whether we specialize for them.
Something like this:

Owner	Attribute	LOAD_ATTR	LOAD_METHOD
object	in values	LOAD_ATTR_INSTANCE_VALUE	---
module	normal attribute	LOAD_ATTR_MODULE	LOAD_METHOD_MODULE
object	non-method on class	---	---
object	method on class, object has values	---	LOAD_METHOD_WITH_VALUES
class	attribute on class	---	LOAD_METHOD_CLASS

Feel free to edit this table, or create your own. Whatever suits you.

At the moment if I add your cache shrinking changes, the method and attr caches will become unaligned/out of sync and it's causing a few issues. Once you finish your shrinking I'll add all those changes in one fell swoop to save some trouble.

I have made the requested changes; please review again. To summarise:

The old LoadMethod specializing function now only deals with kind == METHOD.
LOAD_ATTR_CLASS now works for both methods and attributes.
Shrunk the cache using your changes.

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

🤖 New build scheduled with the buildbot fleet by @Fidget-Spinner for commit 28ddf12 🤖

If you want to schedule another build, you need to add the "🔨 test-with-buildbots" label again.

For the record, all the buildbots pass, I'm going to fix some formatting issues now.

One quibble about the code layout. Otherwise, looks good.

I'm still concerned about the memory consumption of _PyLoadMethodCache, but we should fix that in another PR(s).

Just a follow up:

Now that this is merged. The base LOAD_ATTR/LOAD_METHOD is now slightly slower. This means good specialisation is all the more important to avoid that slowdown.