Exhaustiveness: allocate memory better #118490

Nadrieril · 2023-12-01T00:26:31Z

Exhaustiveness is a recursive algorithm that allocates a bunch of slices at every step. Let's see if I can improve performance by improving allocations.

Already just using Vec::with_capacity is showing impressive improvements on my local measurements.
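To illustrate the kind of change described here, the following is a minimal sketch, not the code from this PR. `specialize_row` and its `u32` elements are hypothetical stand-ins for a recursive step that builds a new row from two slices; the only point is that the final length is known up front, so the buffer can be sized once.

```rust
/// Hypothetical helper (not a compiler function): build a new row out of
/// `fields` followed by `tail`, sizing the buffer once up front.
fn specialize_row(fields: &[u32], tail: &[u32]) -> Vec<u32> {
    // The final length is known, so reserve it in a single allocation
    // instead of letting the vector grow step by step.
    let mut row = Vec::with_capacity(fields.len() + tail.len());
    row.extend_from_slice(fields);
    row.extend_from_slice(tail);
    row
}

fn main() {
    let row = specialize_row(&[1, 2], &[3, 4, 5]);
    println!("len = {}, capacity = {}", row.len(), row.capacity());
}
```

`Vec::with_capacity(n)` guarantees room for at least `n` elements, so neither `extend_from_slice` call above needs to reallocate.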

r? @ghost

Nadrieril · 2023-12-01T00:27:33Z

@bors try @rust-timer queue

bors · 2023-12-01T00:28:43Z

⌛ Trying commit ab2a92a with merge 73621b2...

bors · 2023-12-01T01:55:37Z

☀️ Try build successful - checks-actions
Build commit: 73621b2 (73621b2741228d97b378a2824e61f1798b8634f5)

rust-timer · 2023-12-01T05:19:56Z

Finished benchmarking commit (73621b2): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.4%	[0.3%, 0.4%]	6
Improvements ✅ (primary)	-0.2%	[-0.2%, -0.2%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.2%	[-0.2%, -0.2%]	1

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.0%	[2.0%, 2.0%]	1
Regressions ❌ (secondary)	2.4%	[1.5%, 5.0%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.0%	[2.0%, 2.0%]	1

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.7%	[3.0%, 4.3%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 674.191s -> 671.744s (-0.36%)
Artifact size: 313.41 MiB -> 313.42 MiB (0.00%)

Nadrieril · 2023-12-01T06:29:18Z

I am so disappointed x) I had >5% improvements on cycles for unicode_normalization and html5ever in my local measurements

Nadrieril · 2023-12-01T08:49:01Z

Alright, more ideas

@bors try @rust-timer queue

bors · 2023-12-01T08:50:12Z

⌛ Trying commit eee8baf with merge c0020fd...

bors · 2023-12-01T10:16:34Z

☀️ Try build successful - checks-actions
Build commit: c0020fd (c0020fdde2eef42add3b6df60246f741f394435a)

the8472 · 2023-12-01T11:47:04Z

> I am so disappointed x) I had >5% improvements on cycles for unicode_normalization and html5ever in my local measurements

Make sure to enable jemalloc when testing allocation-related stuff locally.

rust-timer · 2023-12-01T11:55:32Z

Finished benchmarking commit (c0020fd): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.5%	[-3.8%, -3.2%]	6
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.8%	[4.8%, 4.8%]	1
Improvements ✅ (primary)	-2.5%	[-2.5%, -2.5%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.5%	[-2.5%, -2.5%]	1

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	6.7%	[5.5%, 8.4%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 672.931s -> 673.492s (0.08%)
Artifact size: 313.36 MiB -> 313.39 MiB (0.01%)

Nadrieril · 2023-12-01T12:51:51Z

Yep, using jemalloc. That's what I got for this commit 🥲 (on u:cycles)

Is it that CI uses a very different architecture? Or that PGO correctly predicted the allocations in such a way that my changes don't matter? We'll never know...

Nadrieril · 2023-12-01T16:52:35Z

@nnethercote what do you think of this? I seem to be doing something right but CI perf is totally unmoved by it (ignore match-stress, it naturally doesn't like that I over-allocate in pop_head_constructor; I should remove that).
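One plausible reading of the over-allocation remark can be sketched as follows. This is an illustration under assumptions, not the actual `pop_head_constructor` code: `row_with_headroom` and the `headroom` parameter are hypothetical. Reserving extra capacity avoids later reallocations, but every unused slot still occupies memory, which adds up when very many rows are live at once.

```rust
use std::mem::size_of;

/// Bytes reserved by the vector's buffer, whether or not they hold elements.
fn reserved_bytes<T>(v: &Vec<T>) -> usize {
    v.capacity() * size_of::<T>()
}

/// Bytes actually occupied by elements.
fn used_bytes<T>(v: &Vec<T>) -> usize {
    v.len() * size_of::<T>()
}

/// Hypothetical helper: copy `fields` into a buffer with `headroom` spare slots.
fn row_with_headroom(fields: &[u64], headroom: usize) -> Vec<u64> {
    let mut row = Vec::with_capacity(fields.len() + headroom);
    row.extend_from_slice(fields);
    row
}

fn main() {
    let row = row_with_headroom(&[1, 2, 3], 13);
    println!(
        "used {} bytes, reserved {} bytes",
        used_bytes(&row),
        reserved_bytes(&row)
    );
}
```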

nnethercote · 2023-12-01T22:55:48Z

I usually assume differences in local perf vs CI perf are due to PGO and/or BOLT. It can be opaque, certainly.

Are you on Linux? If so, you can measure allocations with DHAT, which can be useful.
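DHAT itself is an external profiling tool. As a rough, self-contained stand-in for the same idea, the sketch below counts allocator calls with a wrapper around the system allocator. It uses only the standard library; `CountingAlloc`, `count_allocs`, `grow_by_push`, and `preallocated` are names invented for this example, and it reports far less than a real heap profiler would.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;

thread_local! {
    // Per-thread counter, so other threads do not disturb a measurement.
    static ALLOCS: Cell<usize> = const { Cell::new(0) };
}

fn bump() {
    // `try_with` avoids a panic if the thread-local is unavailable
    // (for example during thread teardown).
    let _ = ALLOCS.try_with(|c| c.set(c.get() + 1));
}

/// Wrapper around the system allocator that counts `alloc` and `realloc` calls.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        bump();
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        bump();
        unsafe { System.realloc(ptr, layout, new_size) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Run `f` and return its result plus the number of allocator calls it made
/// on the current thread.
fn count_allocs<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOCS.with(|c| c.get());
    // `black_box` keeps the optimizer from removing allocations whose
    // contents are never read.
    let result = std::hint::black_box(f());
    let after = ALLOCS.with(|c| c.get());
    (result, after - before)
}

fn grow_by_push(n: u32) -> Vec<u32> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i); // reallocates whenever the capacity is exhausted
    }
    v
}

fn preallocated(n: u32) -> Vec<u32> {
    let mut v = Vec::with_capacity(n as usize);
    for i in 0..n {
        v.push(i); // never reallocates: capacity was reserved up front
    }
    v
}

fn main() {
    let (_, grown) = count_allocs(|| grow_by_push(1000));
    let (_, pre) = count_allocs(|| preallocated(1000));
    println!("push-only: {grown} allocator calls, preallocated: {pre}");
}
```

A count like this only shows how often the allocator is called; it says nothing about where the calls come from or how long the blocks live, which is the kind of detail a profiler such as DHAT adds.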

Nadrieril · 2023-12-02T02:05:03Z

Hm, I have a pretty clear idea of allocation behavior; what I find difficult is to know what a typical match statement looks like so I can optimize for that. Since CI can't help me, I'll just keep the changes that are unambiguously good.

Nadrieril · 2023-12-04T01:24:26Z

Thank you!

@bors r=@nnethercote

bors · 2023-12-04T01:24:29Z

📌 Commit c1774a1 has been approved by nnethercote

It is now in the queue for this repository.

bors · 2023-12-04T01:58:08Z

⌛ Testing commit c1774a1 with merge 53ded39...

rust-log-analyzer · 2023-12-04T02:27:16Z

The job armhf-gnu failed! Check out the build log: (web) (plain)

bors · 2023-12-04T02:27:44Z

💔 Test failed - checks-actions

Nadrieril · 2023-12-04T02:51:28Z

That looks spurious

@bors retry

bors · 2023-12-04T05:19:42Z

⌛ Testing commit c1774a1 with merge 80a897a...

rust-log-analyzer · 2023-12-04T05:22:36Z

A job failed! Check out the build log: (web) (plain)

bors · 2023-12-04T05:22:42Z

💔 Test failed - checks-actions

Nadrieril · 2023-12-04T06:36:36Z

Spurious again (Too Many Requests from the docker registry)

@bors retry

bors · 2023-12-04T07:06:39Z

⌛ Testing commit c1774a1 with merge cf8d812...

bors · 2023-12-04T09:03:20Z

☀️ Test successful - checks-actions
Approved by: nnethercote
Pushing cf8d812 to master...

rust-timer · 2023-12-04T10:26:03Z

Finished benchmarking commit (cf8d812): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.3%	[-0.4%, -0.2%]	4
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.8%	[3.8%, 3.8%]	1
Regressions ❌ (secondary)	2.8%	[2.2%, 3.3%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.8%	[3.8%, 3.8%]	1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 673.692s -> 673.967s (0.04%)
Artifact size: 314.13 MiB -> 314.12 MiB (-0.00%)

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 1, 2023

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 1, 2023

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 1, 2023

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 1, 2023

rustbot removed S-waiting-on-perf Status: Waiting on a perf run to be completed. perf-regression Performance regression. labels Dec 1, 2023

Nadrieril force-pushed the arena-alloc-matrix branch from eee8baf to 9e395df Compare December 2, 2023 01:44

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 4, 2023

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 4, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 4, 2023

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 4, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 4, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 4, 2023

bors merged commit cf8d812 into rust-lang:master Dec 4, 2023
12 checks passed

rustbot added this to the 1.76.0 milestone Dec 4, 2023

Nadrieril deleted the arena-alloc-matrix branch December 4, 2023 09:07

Nadrieril added the A-exhaustiveness-checking Relating to exhaustiveness / usefulness checking of patterns label Dec 10, 2023
