Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't define HAS_CUSTOM_BLOCKS on mono #106764

Merged
merged 1 commit into from
Aug 22, 2024
Merged

Don't define HAS_CUSTOM_BLOCKS on mono #106764

merged 1 commit into from
Aug 22, 2024

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Aug 21, 2024

@EgorBo
Copy link
Member Author

EgorBo commented Aug 21, 2024

@EgorBot -mono --filter System.Memory.ReadOnlyMemory.ToArray

@EgorBot
Copy link

EgorBot commented Aug 21, 2024

Benchmark results on Intel
BenchmarkDotNet v0.13.13-nightly.20240311.145, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
  Job-HTXVRI : .NET 9.0.0 (42.42.42.42424) using MonoVM, X64 X86Base
  Job-WOJFSK : .NET 9.0.0 (42.42.42.42424) using MonoVM, X64 X86Base
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Toolchain Size Mean Error Ratio Gen0 Allocated Alloc Ratio
ToArray Main 512 948.9 ns 3.19 ns 1.00 0.2504 1.03 KB 1.00
ToArray PR 512 196.8 ns 1.75 ns 0.21 0.2521 1.03 KB 1.00

BDN_Artifacts.zip

@EgorBo
Copy link
Member Author

EgorBo commented Aug 21, 2024

@matouskozak PTAL, it fixes the issue according to the benchmark in #106764 (comment)

@matouskozak
Copy link
Member

/azp run runtime-extra-platforms

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@matouskozak matouskozak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix and help with investigating the issue.

I don't think that MonoJIT (or other Mono codegens) have this guarantee to unroll blocks (<= 64 bytes) so I think this is a good fix. @BrzVlad do you know otherwise?

@BrzVlad
Copy link
Member

BrzVlad commented Aug 22, 2024

We do unroll copies, in mini_emit_memcpy, but we might not be hitting that path for whatever reason, or we might emit some redundant instructions. I think this fix is ok, but it is a sign that our copy function is not optimal.

@matouskozak
Copy link
Member

matouskozak commented Aug 22, 2024

mini_emit_memcpy

We do unroll copies, in mini_emit_memcpy, but we might not be hitting that path for whatever reason, or we might emit some redundant instructions. I think this fix is ok, but it is a sign that our copy function is not optimal.

Looking at the codepaths for [Write/Read]Unaligned we are going thru mini_emit_memory_[store/load] that doesn't pass thru mini_emit_memcpy but goes straight to EMIT_NEW_[STORE/LOAD]_MEMBASE_TYPE. So we are not even trying to take advantage of the mini_emit_memcpy copy unroll.

@BrzVlad
Copy link
Member

BrzVlad commented Aug 22, 2024

That should still emit a OP_STOREV_MEMBASE that would later be decomposed to some optimal copy sequence in mono_decompose_vtype_opts.

@matouskozak
Copy link
Member

We might want to backport this fix to .NET 9 due to the severity of the perf regressions dotnet/perf-autofiling-issues#33182 @jkurdek @vitek-karas.

That should still emit a OP_STOREV_MEMBASE that would later be decomposed to some optimal copy sequence in mono_decompose_vtype_opts.

Thank you, you're right. However, we are taking this path

mono_emit_method_call (cfg, mini_get_memcpy_method (), iargs, NULL);
instead of going thru mini_emit_memcpy because we pass size / align > MAX_INLINE_COPIES (klass min_align is 1). As a future work, we could try tweaking the MAX_INLINE_COPIES heuristics.

@EgorBo EgorBo merged commit 7266021 into dotnet:main Aug 22, 2024
187 of 194 checks passed
@EgorBo
Copy link
Member Author

EgorBo commented Aug 22, 2024

/backport to release/9.0

Copy link
Contributor

Started backporting to release/9.0: https://github.com/dotnet/runtime/actions/runs/10505141188

@xtqqczze
Copy link
Contributor

Looking at the codepaths for [Write/Read]Unaligned we are going thru mini_emit_memory_[store/load] that doesn't pass thru mini_emit_memcpy but goes straight to EMIT_NEW_[STORE/LOAD]_MEMBASE_TYPE. So we are not even trying to take advantage of the mini_emit_memcpy copy unroll.

Is there an issue tracking this missed optimization?

@matouskozak
Copy link
Member

Looking at the codepaths for [Write/Read]Unaligned we are going thru mini_emit_memory_[store/load] that doesn't pass thru mini_emit_memcpy but goes straight to EMIT_NEW_[STORE/LOAD]_MEMBASE_TYPE. So we are not even trying to take advantage of the mini_emit_memcpy copy unroll.

Is there an issue tracking this missed optimization?

I'm not aware of it so I created it #106822.

matouskozak added a commit to matouskozak/runtime that referenced this pull request Sep 9, 2024
github-actions bot pushed a commit that referenced this pull request Sep 11, 2024
carlossanlop pushed a commit that referenced this pull request Sep 12, 2024
This reverts commit 7266021.

Co-authored-by: Matous Kozak <matouskozak@seznam.cz>
jtschuster pushed a commit to jtschuster/runtime that referenced this pull request Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Perf] Linux/x64: 183 Regressions on 4/15/2024 6:25:38 PM
5 participants