Optimize dropping in bevy_ecs #2897

TheRawMeatball · 2021-10-01T18:15:07Z

Previously, the type-erased drop impl dropped a single element, which meant a large number of fn pointer calls and iterating over every component even for no-op drops. This PR moves the iteration into the generic part, which should let no-op drops be optimized much further.
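
A minimal sketch of the idea, not the PR's actual code: it contrasts a type-erased function that drops one element (called once per element) with one that receives the length and loops inside the generic function. The names `DropOne`, `DropMany`, `Counted`, `clear_per_element`, `clear_batched`, and `demo` are hypothetical and exist only for illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical "before" shape: the erased function drops one element.
type DropOne = unsafe fn(*mut u8);
/// Hypothetical "after" shape: the erased function drops `len` elements.
type DropMany = unsafe fn(*mut u8, usize);

/// SAFETY: `ptr` must point to a valid, initialized `T` that is not used again.
unsafe fn drop_one<T>(ptr: *mut u8) {
    unsafe { ptr.cast::<T>().drop_in_place() }
}

/// SAFETY: `ptr` must point to `len` valid, initialized `T`s that are not used again.
unsafe fn drop_many<T>(ptr: *mut u8, len: usize) {
    let base = ptr.cast::<T>();
    // The loop is compiled separately for each `T`, so the optimizer can see
    // when the body is empty (no drop glue) and remove the loop entirely.
    for i in 0..len {
        unsafe { base.add(i).drop_in_place() }
    }
}

/// Illustration-only type that bumps a shared counter when dropped.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Drops every element through `len` separate indirect calls.
fn clear_per_element(mut items: Vec<Counted>) {
    let drop_fn: DropOne = drop_one::<Counted>;
    let len = items.len();
    // Take element ownership away from the Vec so nothing is dropped twice;
    // the Vec still frees its buffer when it goes out of scope.
    unsafe { items.set_len(0) };
    let base = items.as_mut_ptr();
    for i in 0..len {
        unsafe { drop_fn(base.add(i).cast::<u8>()) };
    }
}

/// Drops every element through a single indirect call.
fn clear_batched(mut items: Vec<Counted>) {
    let drop_fn: DropMany = drop_many::<Counted>;
    let len = items.len();
    unsafe { items.set_len(0) };
    unsafe { drop_fn(items.as_mut_ptr().cast::<u8>(), len) };
}

/// Returns how many values each strategy dropped out of `n`.
fn demo(n: usize) -> (usize, usize) {
    let a = Rc::new(Cell::new(0));
    let b = Rc::new(Cell::new(0));
    clear_per_element((0..n).map(|_| Counted(a.clone())).collect());
    clear_batched((0..n).map(|_| Counted(b.clone())).collect());
    (a.get(), b.get())
}

fn main() {
    println!("{:?}", demo(4));
}
```

Both strategies drop the same values. The difference is how many calls go through a function pointer: one per element in the first case, one per array in the second.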

DJMcNab · 2021-10-01T19:14:32Z

crates/bevy_ecs/src/storage/blob_vec.rs

+    // SAFETY: The pointer points to a valid array of type `[T; len]` and it is safe to drop this value.
+    unsafe fn drop_multiple<T>(x: *mut u8, len: usize) {
+        for i in 0..len as isize {
+            x.cast::<T>().offset(i).drop_in_place()


Why is this not a free function in the blob_vec module?

It's repeated here and as an associated function of ComponentDescriptor.

DJMcNab · 2021-10-01T19:14:52Z

crates/bevy_ecs/src/component.rs

+    // SAFETY: The pointer points to a valid array of type `[T; len]` and it is safe to drop this value.
+    unsafe fn drop_ptr_multiple<T>(x: *mut u8, len: usize) {
+        for i in 0..len as isize {
+            x.cast::<T>().offset(i).drop_in_place()


This should use ptr::add instead, to avoid the isize dance.
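
For illustration, the two variants side by side. This is a sketch under assumptions: `drop_with_offset`, `drop_with_add`, `Counted`, and `count_drops` are hypothetical names, and the harness only checks that both variants drop the same number of elements.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Shape quoted in the review: iterates with `isize` so it can call `offset`.
/// SAFETY: `x` must point to `len` valid, initialized `T`s that are not used again.
unsafe fn drop_with_offset<T>(x: *mut u8, len: usize) {
    for i in 0..len as isize {
        unsafe { x.cast::<T>().offset(i).drop_in_place() }
    }
}

/// Suggested shape: `add` takes a `usize`, so no cast is needed.
/// SAFETY: same contract as `drop_with_offset`.
unsafe fn drop_with_add<T>(x: *mut u8, len: usize) {
    for i in 0..len {
        unsafe { x.cast::<T>().add(i).drop_in_place() }
    }
}

/// Illustration-only type that bumps a shared counter when dropped.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Hypothetical harness: builds `n` values, drops them with one variant,
/// and returns how many drops were observed.
fn count_drops(n: usize, use_add: bool) -> usize {
    let counter = Rc::new(Cell::new(0));
    let mut items: Vec<Counted> = (0..n).map(|_| Counted(counter.clone())).collect();
    // Hand element ownership to the raw drop function; the Vec only frees its buffer.
    unsafe { items.set_len(0) };
    let ptr = items.as_mut_ptr().cast::<u8>();
    unsafe {
        if use_add {
            drop_with_add::<Counted>(ptr, n);
        } else {
            drop_with_offset::<Counted>(ptr, n);
        }
    }
    counter.get()
}

fn main() {
    println!("{} {}", count_drops(3, false), count_drops(3, true));
}
```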

bjorn3 · 2021-10-01T19:27:06Z

crates/bevy_ecs/src/component.rs

-        x.cast::<T>().drop_in_place()
+    // SAFETY: The pointer points to a valid array of type `[T; len]` and it is safe to drop this value.
+    unsafe fn drop_ptr_multiple<T>(x: *mut u8, len: usize) {
+        for i in 0..len as isize {


Maybe write a manual while loop here instead? That produces less LLVM IR, so it is probably a bit faster in debug mode and takes a bit less time to compile (every little bit helps). If you think that is too ugly, it is fine to keep it as is.
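
A sketch of what the manual loop could look like, assuming the same signature as the function in the diff. The claim about LLVM IR size is the reviewer's and is not measured here; `Counted` and `count_drops` are hypothetical helpers used only to exercise the function.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// SAFETY: `x` must point to `len` valid, initialized `T`s that are not used again.
unsafe fn drop_ptr_multiple<T>(x: *mut u8, len: usize) {
    let base = x.cast::<T>();
    // A plain counter instead of a `Range` iterator: no `Iterator::next`
    // calls need to be generated or inlined.
    let mut i = 0;
    while i < len {
        unsafe { base.add(i).drop_in_place() };
        i += 1;
    }
}

/// Illustration-only type that bumps a shared counter when dropped.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Hypothetical harness: returns how many of `n` values were dropped.
fn count_drops(n: usize) -> usize {
    let counter = Rc::new(Cell::new(0));
    let mut items: Vec<Counted> = (0..n).map(|_| Counted(counter.clone())).collect();
    // Hand element ownership to the raw drop function; the Vec only frees its buffer.
    unsafe { items.set_len(0) };
    unsafe { drop_ptr_multiple::<Counted>(items.as_mut_ptr().cast::<u8>(), n) };
    counter.get()
}

fn main() {
    println!("{}", count_drops(5));
}
```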

cart · 2021-10-01T21:04:57Z

Are we certain that this is actually a problem that needs fixing? Has anyone benchmarked this or looked at the generated assembly?

From the description of #4773 (Skip drop when needs_drop is false), which references this PR:

# Objective

- We do a lot of function pointer calls in a hot loop (clearing entities in render). This is slow, since calling function pointers cannot be optimised out. We can avoid that in the cases where the function call is a no-op.
- Alternative to #2897
- On my machine, in `many_cubes`, this reduces dropping time from ~150μs to ~80μs.

## Solution

- Make `drop` in `BlobVec` an `Option`, recording whether the given drop impl is required or not.
- Note that this does add branching in some cases - we could consider splitting this into two fields, i.e. unconditionally call the `drop` fn pointer.
- My intuition of how often types stored in `World` should have non-trivial drops makes me think that would be slower, however.

N.B. Even once this lands, we should still test having a 'drop_multiple' variant - for types with a real `Drop` impl, the current implementation is definitely optimal.
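
A rough sketch of the `Option` approach, not the code from that PR: the erased drop function is stored only when `std::mem::needs_drop::<T>()` reports that dropping `T` matters, so the per-element indirect calls are skipped otherwise. `DropFn`, `drop_fn_for`, and `clear_erased` are hypothetical names.

```rust
use std::mem::needs_drop;

/// Hypothetical alias for a type-erased, single-element drop function.
type DropFn = unsafe fn(*mut u8);

/// SAFETY: `x` must point to a valid, initialized `T` that is not used again.
unsafe fn drop_ptr<T>(x: *mut u8) {
    unsafe { x.cast::<T>().drop_in_place() }
}

/// Returns `None` when `T` has no drop glue, so callers can skip the call.
fn drop_fn_for<T>() -> Option<DropFn> {
    if needs_drop::<T>() {
        Some(drop_ptr::<T> as DropFn)
    } else {
        None
    }
}

/// Drops all elements through the erased function and returns how many
/// indirect calls were made (zero when no drop function is stored).
fn clear_erased<T>(mut items: Vec<T>) -> usize {
    let len = items.len();
    // Take element ownership away from the Vec; it still frees its buffer.
    unsafe { items.set_len(0) };
    let Some(drop_fn) = drop_fn_for::<T>() else {
        return 0;
    };
    let base = items.as_mut_ptr();
    for i in 0..len {
        unsafe { drop_fn(base.add(i).cast::<u8>()) };
    }
    len
}

fn main() {
    println!("u32: {} indirect calls", clear_erased(vec![1u32, 2, 3]));
    println!(
        "String: {} indirect calls",
        clear_erased(vec![String::from("a"), String::from("b")])
    );
}
```

The branch on the `Option` is the extra cost mentioned in the description. It is paid once per clear in this sketch, whereas the skipped calls would have been paid once per element.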

alice-i-cecile · 2022-05-30T18:03:08Z

@TheRawMeatball if you're up for it, benchmarks on this would be great.

If you're busy with other stuff just let me know and we can mark this as abandoned for someone else to experiment with.

SkiFire13 · 2022-05-30T18:56:43Z

crates/bevy_ecs/src/component.rs

+        for i in 0..len as isize {
+            x.cast::<T>().offset(i).drop_in_place()
+        }


Couldn't this just be std::slice::from_raw_parts_mut(x.cast::<T>(), len).drop_in_place()?

bjorn3 · 2022-05-30

It would have to be the std::ptr:: counterpart, but I think that could work.
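
A sketch of that suggestion, assuming the same signature as the function in the diff. `std::ptr::slice_from_raw_parts_mut` builds a `*mut [T]` without creating a reference, and `std::ptr::drop_in_place` on that slice pointer drops every element. `Counted` and `count_drops` are hypothetical helpers used only to exercise the function.

```rust
use std::cell::Cell;
use std::ptr;
use std::rc::Rc;

/// SAFETY: `x` must point to `len` valid, initialized `T`s that are not used again.
unsafe fn drop_ptr_multiple<T>(x: *mut u8, len: usize) {
    // Build a raw slice pointer and drop the whole slice in one call.
    let slice: *mut [T] = ptr::slice_from_raw_parts_mut(x.cast::<T>(), len);
    unsafe { ptr::drop_in_place(slice) }
}

/// Illustration-only type that bumps a shared counter when dropped.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Hypothetical harness: returns how many of `n` values were dropped.
fn count_drops(n: usize) -> usize {
    let counter = Rc::new(Cell::new(0));
    let mut items: Vec<Counted> = (0..n).map(|_| Counted(counter.clone())).collect();
    // Hand element ownership to the raw drop function; the Vec only frees its buffer.
    unsafe { items.set_len(0) };
    unsafe { drop_ptr_multiple::<Counted>(items.as_mut_ptr().cast::<u8>(), n) };
    counter.get()
}

fn main() {
    println!("{}", count_drops(4));
}
```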

alice-i-cecile · 2024-01-02T21:30:35Z

Adopted!

Move iteration to generic part in drop impls

984ea7b

github-actions bot added the S-Needs-Triage (This issue needs to be labelled) label Oct 1, 2021

TheRawMeatball added the A-ECS (Entities, components, systems, and events), C-Enhancement (A new feature), and S-Needs-Review labels and removed the S-Needs-Triage (This issue needs to be labelled) label Oct 1, 2021

TheRawMeatball requested a review from DJMcNab October 1, 2021 18:20

DJMcNab approved these changes Oct 1, 2021

bjorn3 reviewed Oct 1, 2021

alice-i-cecile added the hacktoberfest-accepted label Nov 3, 2021

cart removed the S-Needs-Review label Dec 16, 2021

DJMcNab mentioned this pull request May 16, 2022

[Merged by Bors] - Skip drop when needs_drop is false #4773

Closed

alice-i-cecile added the S-Needs-Benchmarking (This set of changes needs performance benchmarking to double-check that they help) label May 30, 2022

TheRawMeatball added the S-Adopt-Me (The original PR author has no intent to complete this work. Pick me up!) label May 30, 2022

SkiFire13 reviewed May 30, 2022

tygyh mentioned this pull request Jan 2, 2024

Move iteration to generic part in drop impls (Adopted) #11183

Open

alice-i-cecile closed this Jan 2, 2024
