Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-96421: Don't keep local copies of first_instr, names and consts in the interpreter. #96847

Closed

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Sep 15, 2022

I'm struggling to get good performance numbers on this. An earlier run gave me a 4% speedup, which is clearly wrong.

However, we can look at the stats and deduce that this should speed things up.

first_instr is never hot and can be computed cheaply as frame->f_code + OFFSET. Caching it is pointless.

The reasoning for names and consts is a bit more complex:

  • Approx 7% of instructions change the frame, this PR saves four loads (and if names and consts are spllled, two stores)
  • About 2% of instructions use names, this PR costs two loads, or one load if names is spilled.
  • About 8% of instructions use consts, this PR costs two loads, or one load if consts is spilled.

The saving is small if names and consts are not spilled, but a compiler would be stupid not to spill names and consts on x86-64.
So on RISC would expect a very small speedup, but on x86-64 we would expect a larger one. I would estimate in the 0.3-0.6% range.

Even if this results in a small slowdown, it should speedup #96319 and increase the potential benefit of adding LOAD_CONST_IMMORTAL

Skipping news as this change is undetectable, even for a intrusive C extension.

Chesterton's Fence, a.k.a why were these values stored in local variables before?

These variables used to speed up CPython, even they no longer do:

  • first_instr: Many jumps were absolute, so first_instr was needed to make these jumps.
  • names: LOAD_ATTR and LOAD_GLOBAL need names. Thanks to PEP 659, these instructions (and their adaptive forms) are much less common
  • consts: The existence of POP_JUMP_IF_NONE reduced the number of LOAD_CONSTs a bit, and RETURN_GENERATOR has increased the number of frame entry and exits. This may have tipped the balance, or maybe caching of consts was misguided before.

@markshannon
Copy link
Member Author

What I didn't consider is that the loads on resume are likely to have their latency hidden, but the loads in LOAD_ATTR, etc. will not. So this might result in a slowdown. More accurate benchmarking does show a bit of a slowdown.

@markshannon
Copy link
Member Author

Closing in favour of a more limited version

@markshannon markshannon closed this Oct 4, 2022
@markshannon markshannon deleted the dont-cache-names-and-constants branch September 26, 2023 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants