Replace a call to PyTuple_New with _PyTuple_FromArraySteal #96516

Merged (1 commit, Sep 14, 2022)

Conversation

@kmod (Contributor) commented Sep 2, 2022

PyTuple_New will zero out the tuple before returning it to the caller, and a
surprising amount of time can be saved by skipping this zeroing. One option
is to add a non-zeroing version of PyTuple_New, which I did in #96446, but
there was resistance to its unsafety.

Fortunately, it looks like most of the tuple zeroing happens directly in the
BUILD_TUPLE opcode in the interpreter, which already has the arguments in an
appropriate array, so we can simply convert this call to _PyTuple_FromArraySteal.

This seems to result in a ~0.2% speedup on macrobenchmarks.
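For illustration, here is a minimal pure-Python sketch of what BUILD_TUPLE does semantically (the function name and list-based stack are hypothetical; the real implementation is C code in the interpreter loop). The point is that the top `oparg` stack items are moved directly into the new tuple, so there is no need for a zero-initialized tuple that gets filled in afterwards:

```python
# Hypothetical pure-Python model of the BUILD_TUPLE opcode.
# In CPython's C implementation, _PyTuple_FromArraySteal "steals" the
# references already sitting on the value stack instead of writing
# NULLs into a fresh tuple and then assigning each slot.
def build_tuple(stack, oparg):
    # Take the top `oparg` items off the value stack...
    items = stack[-oparg:]
    del stack[-oparg:]
    # ...and pack them directly into a tuple, with no intermediate
    # zero-filled allocation step.
    return tuple(items)

stack = [1, 2, 42]
print(build_tuple(stack, 3))  # (1, 2, 42)
print(stack)                  # []
```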

@mdboom (Contributor) commented Sep 2, 2022

This is a great small change, especially if it ends up providing most of the benefit of #96446.

I'm running this PR on the Faster CPython team's standard benchmarking machine as well to get another data point (though I may not be able to report back until after the long weekend).

@mdboom (Contributor) commented Sep 2, 2022

I measured only a 0.07% speedup on the pyperformance and pyston macrobenchmarks. I'm going to run this (and the baseline) again, because it's surprising that the effect is so small. I don't want to imply that's an argument against this PR (and it's not really up to me anyway). Reproducibility of benchmarks in general, especially when the margins are this small, is an ongoing challenge we're all struggling with.

@kmod (Contributor, Author) commented Sep 3, 2022

Yeah, I think that even with some changes to make benchmarking more stable (BOLT plus a longer BOLT task), the systematic errors are around this level, so it's hard to rely too much on the benchmark numbers.
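To put numbers on that: with systematic error around 0.2%, a measured 0.07% speedup sits well inside the noise band. A small arithmetic sketch, using only the figures quoted in this thread (the nanosecond values are hypothetical placeholders, not new measurements):

```python
# Illustrative arithmetic only: a 0.07% effect vs. ~0.2% systematic error.
baseline_ns = 100.0          # hypothetical baseline timing
measured_ns = 99.93          # 0.07% faster, the effect mdboom measured
systematic_error_pct = 0.2   # the noise level kmod estimates

speedup_pct = (baseline_ns - measured_ns) / baseline_ns * 100
print(round(speedup_pct, 2))               # 0.07
print(speedup_pct < systematic_error_pct)  # True: within the noise band
```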

@kmod (Contributor, Author) commented Sep 13, 2022

Sorry, I'm not super familiar with the CPython workflow; is there anything I can do to help move this through the "awaiting_merge" state?

@methane (Member) commented Sep 14, 2022

This improvement is too small to be noticeable to users, so I don't think we need a NEWS entry for this.

$ ./python-nonzero2 -m pyperf timeit --duplicate 10 --compare-to ./python --python-names master:patched -s 'x=42' -- '(1,2,x)'
master: ..................... 29.1 ns +- 0.1 ns
patched: ..................... 29.3 ns +- 0.3 ns

Mean +- std dev: [master] 29.1 ns +- 0.1 ns -> [patched] 29.3 ns +- 0.3 ns: 1.01x slower

$ ./python-nonzero2 -m pyperf timeit --duplicate 10 --compare-to ./python --python-names master:patched -s 'x=42' -- '(1,2,3,4,5,6,7,8,9,x)'
master: ..................... 56.0 ns +- 0.5 ns
patched: ..................... 55.5 ns +- 1.3 ns

Mean +- std dev: [master] 56.0 ns +- 0.5 ns -> [patched] 55.5 ns +- 1.3 ns: 1.01x faster

@kmod (Contributor, Author) commented Sep 14, 2022

Thanks!
