Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono] Migrate various GHashTables to dn_simdhash #100497

Closed
wants to merge 7 commits into from
Closed

Conversation

kg
Copy link
Contributor

@kg kg commented Apr 1, 2024

Migrates various GHashTables in the mono runtime over to dn_simdhash, and optimizes a couple specific spots in the process - in particular bundled resources, which had a very expensive compare function and hash function for keys.

Opening now to run CI tests so I can investigate any broken tests locally. I probably did other stuff in this branch that I can't remember, but it'll be obvious once the parent branch is squashed and I review the diff again...

There are more hash tables in mono that might be profitable to migrate based on profiling, but they're more complex scenarios, i.e. internal hash tables or concurrent hash tables, so I opted not to do them in this PR. In one case when I tried migrating one of them, things actually got slower.

@kg kg added NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) NO-REVIEW Experimental/testing PR, do NOT review it labels Apr 1, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Apr 1, 2024
@kg kg force-pushed the simdhash-2 branch 2 times, most recently from b1f04ae to c08f4fd Compare April 10, 2024 23:15
kg added 6 commits April 20, 2024 02:46
Checkpoint

Checkpoint

Checkpoint

Checkpoint

Checkpoint

Checkpoint

Checkpoint

Checkpoint

Checkpoint

Add missing files

Checkpoint

Add missing header

Cleanups and more comments

Checkpoint

Fix insertion

Remove generic add and get, they complicate everything for no reason

Build fix

Build fixes

Cleanup

When cascading, wrap around to the first bucket when we reach the end. This reduces wasted space a lot.

Simplify declaring specializations

Cleanup

Convert spaces to tabs

Rearrange things so that find_first_matching_suffix gets inlined

Basic removal implementation

Expand tests

Build fixes

Cleanup

MSVC doesn't support warning preprocessor command so use pragma message

Fix build

Fix build

Fix linker problem by defining string comparison and hasher inline

Match semantics of ghashtable better

Checkpoint

dn_allocator allocations aren't zeroed :)

Cleanup

Use dn_simdhash_string_ptr for namespace lookup cache

Fix build

Allow hiding the default specializations of a simdhash so you can wrap them
Make simdhash_string_ptr pre-compute and cache the length of the string

Cache the hashcode of keys in string_ptr since it makes natural cache alignment possible and speeds up rehashing

Rearrange things to fix C4505

Prohibit declaring multiple specializations in one .c file since it would cause duplicate includes

Maybe fix duplicate include

Fix use of base foreach for string_ptr hashtable

Annotate fallthrough in murmurhash for clang

Manually unroll the murmurhash3 duff's device to satisfy clang

Oops

Fix leaked dn_simdhash_t instance

Cleanup

Fix truncation warning with 64-bit strlen

Workaround linux x64 linker problem

Fix GCC build parse error

Cleanup & rearrange things; attempt to fix GCC build

Updates so test suite works again; verified on linux x64 gcc

Fix indentation

Cleanup

Fix foreaches

Split architecture-specific stuff into its own header

Specializing foreach means key_is_pointer/value_is_pointer aren't needed

Cleanup

Simplify preprocessor token gluing

Cleanup and improve comments

Align the address of the first bucket

Comment out neon for now

Cleanup

Alignment and arch-specific pointer size fixes

Fix ARM clang/gcc build

Maybe fix CI-specific build issue

MSVC docs were wrong :(

Cleanup

Silence MSVC warnings again

Rework string_ptr to avoid pre-computing string length, and use a streaming reformulation of MurmurHash3-32

Cleanup

Add dn_simdhash_u32_ptr and use it in mono_image_init_name_cache

Add missing file

Fix type mismatch

No need to eagerly load the suffixes, and doing so makes scalar fallback codegen much worse

Optimize scalar fallback

Migrate all simdhash fixes and changes from simdhash-2 into simdhash PR #1

Enable ARM NEON support in simdhash since we have system headers now

Update utils header from successor branch

Dedupe + improve 64-bit pointer hash

Fix windows build

Address PR feedback
Do some calculations using uint64_t and bounds check before converting back to uint32_t
Do some key calculations using size_t
Introduce dn_simdhash_assert and use it instead of assert

Broken attempt at configurable assertions

Add ptr_ptr simdhash variant

Use simdhash for interp patch_sites_table
Migrate method and methodref cache to simdhash
Add missing include, fix typo
Migrate data_hash and patchsite_hash to simdhash
Refactoring, add support for remove callback
Add ghashtable-compatible simdhash variant
Migrate 3 tables to dn_simdhash_ght
Change cascade flag into a cascade counter, and clean it up when removing items
Fix cascade counters potentially getting corrupt when trying to add duplicate keys
Attempt to optimize ght variant a little
Correct semantics in get_data_item_wide_index
Migrate bundled_resources to ght and optimize its hash and key comparer
Pre-reserve capacity for some simdhash instances that grow during startup
Reduce initial size of these simdhashes because we have one instance per assembly
Fix bundled_resources by adding ANOTHER hash table
Migrate MonoJitMemoryManager::seq_points to simdhash
Add equivalent operation for g_hash_table_insert_replace
Rename simdhash replace operation to make it clear that it only replaces the value
Adjust modified code to have the same overwrite behavior as the old g_hash_table
Probably meaningless semantic tweak for replace handler

Migrate jit_code_hash and interp_code_hash to simdhash_ptr_ptr

Add safety check

Wrap dn_simdhash_ght_foreach in a COMPONENT_API wrapper so debugger-engine can access it across a linking boundary

Bring back the InternalHashTables, removing them had performance consequences I can't identify the cause of

Fix test

Optimize pointer hash

(testing) force-enable wasm simd for simdhash

Wasm simd codegen tweak

Optimize simdhash search

Refactorings and possible speedup

Optimize bucket scanning

Bucket scan optimization

Repair merge damage

Repair merge damage
@kg kg added area-VM-meta-mono and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Apr 20, 2024
Copy link
Contributor

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-VM-meta-mono NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) NO-REVIEW Experimental/testing PR, do NOT review it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant