gh-93429: Merge `LOAD_METHOD` back into `LOAD_ATTR` by Fidget-Spinner · Pull Request #93430 · python/cpython

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Fidget-Spinner

For better specialization of LOAD_METHOD.
Cons:

  • All LOAD_ATTR slows down slightly due to additional SET_TOP and STACK_GROW.
  • Base LOAD_ATTR (unspecialized) gets one more branch.
  • All LOAD_ATTR cache becomes much bigger. We need to mitigate this with faster-cpython/ideas#396
  • EXTENDED_ARGS for LOAD_ATTR/METHOD in large modules

Pros:

  • Theoretically, what was previously LOAD_METHOD should now have better specialization because we can just treat it like specializing a normal attribute.
  • We save three opcodes.
  • We de-duplicate lots of code shared between LOAD_METHOD and LOAD_ATTR.

Fixes #93429.

@Fidget-Spinner

One problem is that i still need to separately track specialization fails of the old LOAD_METHOD separately. Since LOAD_ATTR specialization is attempted after LOAD_METHOD fails.

I tried to maintain the diff as small as possible. Overall this saves us two opcodes and reduces the total lines of code in CPython.

TODO:

  • Docs needs updating in another PR.
  • I need to run pyperformance.

@Fidget-Spinner Fidget-Spinner changed the title Merge LOAD_METHOD back into LOAD_ATTR gh-93429: Merge LOAD_METHOD back into LOAD_ATTR

Jun 2, 2022

@Fidget-Spinner

To be honest I'm shocked that the tests are passing 😆 .

@Fidget-Spinner

An example of LOAD_ATTR specialization in action for something LOAD_METHOD couldn't specialize for (note the LOAD_ATTR_INSTANCE_VALUE !):

class X: ...

x = X()
x.a = lambda a:a
def z():
 for _ in range(10):
  x.a(1)

z()
dis.dis(z, adaptive=True)
  1           0 RESUME_QUICK             0

  2           2 LOAD_GLOBAL_ADAPTIVE     1 (NULL + range)
             14 LOAD_CONST               1 (10)
             16 CALL_ADAPTIVE            1
             26 GET_ITER
             28 FOR_ITER                26 (to 82)
             30 STORE_FAST               0 (_)

  3          32 LOAD_GLOBAL_MODULE       2 (x)
             44 LOAD_ATTR_INSTANCE_VALUE     5 (NULL|self + a)
             66 LOAD_CONST               2 (1)
             68 CALL_PY_EXACT_ARGS       1
             78 POP_TOP
             80 JUMP_BACKWARD_QUICK     27 (to 28)

  2     >>   82 LOAD_CONST               0 (None)
             84 RETURN_VALUE

@Fidget-Spinner

markshannon

I think that LOAD_METHOD should be removed from opcode.py and made a virtual opcode in compiler.c, like JUMP and POP_BLOCK

@bedevere-bot

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@markshannon

LOAD_ATTR represents about 5% of instructions, statically so adding 6 code units for each LOAD_ATTR will increase the size of the bytecode by about 10%. This seems high and is likely to cause some slowdowns due to cache misses and longer jumps.

Applying the size reductions in faster-cpython/ideas#396 would reduce that 10% increase to 5%.
I would suggest that we implement faster-cpython/ideas#396 first, so that we get a better idea of the performance impact of this change.
With faster-cpython/ideas#396, I would expect this to be a net improvement.

@markshannon

It might be helpful to draw up a table describing the various possible attributes and whether we specialize for them.
Something like this:

Owner Attribute LOAD_ATTR LOAD_METHOD
object in values LOAD_ATTR_INSTANCE_VALUE ---
module normal attribute LOAD_ATTR_MODULE LOAD_METHOD_MODULE
object non-method on class --- ---
object method on class, object has values --- LOAD_METHOD_WITH_VALUES
class attribute on class --- LOAD_METHOD_CLASS

Feel free to edit this table, or create your own. Whatever suits you.

@Fidget-Spinner

At the moment if I add your cache shrinking changes, the method and attr caches will become unaligned/out of sync and it's causing a few issues. Once you finish your shrinking I'll add all those changes in one fell swoop to save some trouble.

@Fidget-Spinner

I have made the requested changes; please review again. To summarise:

  • The old LoadMethod specializing function now only deals with kind == METHOD.
  • LOAD_ATTR_CLASS now works for both methods and attributes.
  • Shrunk the cache using your changes.

@bedevere-bot

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @Fidget-Spinner for commit 28ddf12 🤖

If you want to schedule another build, you need to add the "🔨 test-with-buildbots" label again.

@Fidget-Spinner

For the record, all the buildbots pass, I'm going to fix some formatting issues now.

markshannon

@markshannon

One quibble about the code layout. Otherwise, looks good.

I'm still concerned about the memory consumption of _PyLoadMethodCache, but we should fix that in another PR(s).

@Fidget-Spinner

Just a follow up:

Now that this is merged. The base LOAD_ATTR/LOAD_METHOD is now slightly slower. This means good specialisation is all the more important to avoid that slowdown.