GH-90997: Improve inline cache performance for MSVC #96781

brandtbucher · 2022-09-12T22:37:23Z

This Compiler Explorer snippet suggests that MSVC doesn't optimize away our cache read/write utilities the way that Clang and GCC do. However, it also shows that replacing our current shifting implementations with memcpy calls does exactly what we want, in a standard-defined way, on all three compilers. I think the memcpy version is a bit nicer, especially since we don't need to maintain two differently-endian versions of the same code.
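
For readers without the Compiler Explorer link handy, here is a minimal sketch of the two approaches being compared. It assumes a 32-bit cache value stored across two 16-bit units; the `sketch_*` names are hypothetical and this is not the code from the patch itself.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names, for illustration only. Assumes a 32-bit
   value is stored across two consecutive 16-bit units. */

/* Shift-based version: builds the value by hand, low half first.
   A second variant would be needed to match a big-endian layout. */
static inline uint32_t
sketch_read_u32_shift(const uint16_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 16);
}

static inline void
sketch_write_u32_shift(uint16_t *p, uint32_t val)
{
    p[0] = (uint16_t)(val & 0xFFFF);
    p[1] = (uint16_t)(val >> 16);
}

/* memcpy-based version: copies the object representation directly,
   so it uses the host byte order and needs no per-endianness variant.
   Copying through memcpy is well defined even if p is not aligned
   for uint32_t. */
static inline uint32_t
sketch_read_u32_memcpy(const uint16_t *p)
{
    uint32_t val;
    memcpy(&val, p, sizeof(val));
    return val;
}

static inline void
sketch_write_u32_memcpy(uint16_t *p, uint32_t val)
{
    memcpy(p, &val, sizeof(val));
}
```

With optimizations enabled, compilers typically lower a fixed-size `memcpy` like this to a plain load or store, which is the behavior the comparison above is checking for.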

I'm curious if this moves the benchmarks at all, but I don't have a good Windows pyperformance setup figured out yet. So maybe somebody else could help me out with some numbers on this? (@gvanrossum, I think I remember that you were able to get kinda-stable MSVC numbers a while back?)

Issue: Inline bytecode caches #90997

gvanrossum · 2022-09-12T22:46:25Z

I'm reluctant to spend time getting stable benchmarks for this on Windows. Last time it took me an afternoon to run the benchmarks several times with and without the patches. Let's wait until we have a Windows benchmarking machine, please?

markshannon

The changes in MSVC output are a clear improvement, and they seem to have a positive effect elsewhere.
Even on RISC-V, which doesn't support unaligned access, the code is a bit better: https://godbolt.org/z/Ydx5roTa7

Include/internal/pycore_code.h

brandtbucher · 2022-09-14T04:35:46Z

Looks like even GCC benefits on 64-bit ARM: https://godbolt.org/z/MTj3anj4c

bedevere-bot · 2022-09-14T05:05:45Z

🤖 New build scheduled with the buildbot fleet by @brandtbucher for commit 5985a4a 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

lpereira · 2022-09-14T16:42:14Z

LGTM

brandtbucher added 2 commits September 12, 2022 15:13

Use memcpy for cache reads/writes

eaa91ed

blurb add

ae8dfea

brandtbucher added the performance and interpreter-core labels Sep 12, 2022

brandtbucher self-assigned this Sep 12, 2022

bedevere-bot added the awaiting core review label Sep 12, 2022

brandtbucher requested a review from markshannon September 12, 2022 22:38

markshannon reviewed Sep 13, 2022

Include/internal/pycore_code.h (outdated, resolved)

Update comment

5985a4a

brandtbucher added the 🔨 test-with-buildbots label Sep 14, 2022

bedevere-bot removed the 🔨 test-with-buildbots label Sep 14, 2022

brandtbucher merged commit a83fdf2 into python:main Sep 15, 2022

bedevere-bot removed the awaiting core review label Sep 15, 2022
