Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Always update clusters and remove per-frame allocations #4169

Closed
wants to merge 7 commits into from

Conversation

cart
Copy link
Member

@cart cart commented Mar 10, 2022

  • Refactor assign_lights_to_clusters to always clear + update clusters, even if the screen size isn't available yet / is zero. This fixes multiple_windows example is broken #4167. We still avoid the "expensive" per-light work when the screen size isn't available yet. I also consolidated some logic to eliminate some redundancies.
  • Removed a ton of (potentially very large) per-frame reallocations
    • Removed Res<VisiblePointLights> (a vec) in favor of Res<GlobalVisiblePointLights> (a hashmap). We were allocating a new hashmap every frame, the collecting it into a vec every frame, then in another system re-generating the hashmap. It is always used like a hashmap, might as well embrace that. We now reuse the same hashmap every frame and dont use any intermediate collections.
    • We were re-allocating Clusters aabb and light vectors every frame by re-constructing Clusters every frame. We now re-use the existing collections.
    • Reuse per-camera VisiblePointLight vecs when possible instead of allocating them every frame. We now only insert VisiblePointLights if the component doesn't exist yet.

@cart cart added C-Bug An unexpected or incorrect behavior A-Rendering Drawing game state to the screen labels Mar 10, 2022
@github-actions github-actions bot added the S-Needs-Triage This issue needs to be labelled label Mar 10, 2022
@cart cart removed the S-Needs-Triage This issue needs to be labelled label Mar 10, 2022
@alice-i-cecile alice-i-cecile added the C-Performance A change motivated by improving speed, memory usage or compile times label Mar 10, 2022
@alice-i-cecile
Copy link
Member

I'm curious about the perf impact; do we have a good benchmark for this?

Copy link
Contributor

@robtfm robtfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the regression. doing serial PRs on this area feels a bit like trying to replace the wheels on a car while driving it... but i hope i'll learn and improve as we go.

is it possible to get examples to run as part of CI?

crates/bevy_pbr/src/light.rs Outdated Show resolved Hide resolved
crates/bevy_pbr/src/light.rs Outdated Show resolved Hide resolved
crates/bevy_pbr/src/light.rs Outdated Show resolved Hide resolved
@bjorn3
Copy link
Contributor

bjorn3 commented Mar 10, 2022

is it possible to get examples to run as part of CI?

Part of them already run using swiftshader.

Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a full review.

crates/bevy_pbr/src/light.rs Show resolved Hide resolved
crates/bevy_pbr/src/light.rs Outdated Show resolved Hide resolved
crates/bevy_pbr/src/light.rs Show resolved Hide resolved
@cart
Copy link
Member Author

cart commented Mar 10, 2022

I'm curious about the perf impact; do we have a good benchmark for this?

I benchmarked lighting.rs prior to submitting this pr and the difference was in the noise. But that makes sense because there aren't that many lights in the scene (and therefore we have a smaller number of allocations + copies). We'd want to test on a scene with a large number of lights.

@cart
Copy link
Member Author

cart commented Mar 10, 2022

I think this is good to go now. Feel free to leave more feedback, but I'll merge this in the next few days if i dont hear back from anyone.

@robtfm
Copy link
Contributor

robtfm commented Mar 10, 2022

looks good. in testing i found an unrelated issue, if you want to include it that would be great, otherwise i'll open another pr.

the obb test is occasionally failing in Single mode due to f32 precision issues. we can fix it by removing the 1e9s and calculating the actual depth we need:

@@ -323,7 +323,7 @@ impl ClusterConfig {
     fn first_slice_depth(&self) -> f32 {
         match self {
             ClusterConfig::None => 0.0,
-            ClusterConfig::Single => 1.0e9, // FIXME note can't use f32::MAX as the aabb explodes
+            ClusterConfig::Single => 0.0,
             ClusterConfig::XYZ { z_config, .. } | ClusterConfig::FixedZ { z_config, .. } => {
                 z_config.first_slice_depth
             }
@@ -333,7 +333,7 @@ impl ClusterConfig {
     fn far_z_mode(&self) -> ClusterFarZMode {
         match self {
             ClusterConfig::None => ClusterFarZMode::Constant(0.0),
-            ClusterConfig::Single => ClusterFarZMode::Constant(1.0e9), // FIXME note can't use f32::MAX as the aabb explodes
+            ClusterConfig::Single => ClusterFarZMode::MaxLightRange,
             ClusterConfig::XYZ { z_config, .. } | ClusterConfig::FixedZ { z_config, .. } => {
                 z_config.far_z_mode
             }

Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It would be good to add in robtfm's noted fix too.

@superdump
Copy link
Contributor

@cart ping? I want to get this in as I have a couple more PRs that will come on top of this. I’ll add some many lights example while I’m at it I guess so we have something to test with. I’ve used Sponza locally but it’s not really necessary. :)

@cart
Copy link
Member Author

cart commented Mar 24, 2022

Ill update this now and get it merged!

@cart
Copy link
Member Author

cart commented Mar 24, 2022

bors r+

bors bot pushed a commit that referenced this pull request Mar 24, 2022
 * Refactor assign_lights_to_clusters to always clear + update clusters, even if the screen size isn't available yet / is zero. This fixes #4167. We still avoid the "expensive" per-light work when the screen size isn't available yet. I also consolidated some logic to eliminate some redundancies.
* Removed _a ton_ of (potentially very large) per-frame reallocations
  * Removed `Res<VisiblePointLights>` (a vec) in favor of  `Res<GlobalVisiblePointLights>` (a hashmap). We were allocating a new hashmap every frame, the collecting it into a vec every frame, then in another system _re-generating the hashmap_. It is always used like a hashmap, might as well embrace that. We now reuse the same hashmap every frame and dont use any intermediate collections.
  * We were re-allocating Clusters aabb and light vectors every frame by re-constructing Clusters every frame. We now re-use the existing collections.
  * Reuse per-camera VisiblePointLight vecs when possible instead of allocating them every frame. We now only insert VisiblePointLights if the component doesn't exist yet.
@bors bors bot changed the title Always update clusters and remove per-frame allocations [Merged by Bors] - Always update clusters and remove per-frame allocations Mar 24, 2022
@bors bors bot closed this Mar 24, 2022
aevyrie pushed a commit to aevyrie/bevy that referenced this pull request Jun 7, 2022
)

 * Refactor assign_lights_to_clusters to always clear + update clusters, even if the screen size isn't available yet / is zero. This fixes bevyengine#4167. We still avoid the "expensive" per-light work when the screen size isn't available yet. I also consolidated some logic to eliminate some redundancies.
* Removed _a ton_ of (potentially very large) per-frame reallocations
  * Removed `Res<VisiblePointLights>` (a vec) in favor of  `Res<GlobalVisiblePointLights>` (a hashmap). We were allocating a new hashmap every frame, the collecting it into a vec every frame, then in another system _re-generating the hashmap_. It is always used like a hashmap, might as well embrace that. We now reuse the same hashmap every frame and dont use any intermediate collections.
  * We were re-allocating Clusters aabb and light vectors every frame by re-constructing Clusters every frame. We now re-use the existing collections.
  * Reuse per-camera VisiblePointLight vecs when possible instead of allocating them every frame. We now only insert VisiblePointLights if the component doesn't exist yet.
ItsDoot pushed a commit to ItsDoot/bevy that referenced this pull request Feb 1, 2023
)

 * Refactor assign_lights_to_clusters to always clear + update clusters, even if the screen size isn't available yet / is zero. This fixes bevyengine#4167. We still avoid the "expensive" per-light work when the screen size isn't available yet. I also consolidated some logic to eliminate some redundancies.
* Removed _a ton_ of (potentially very large) per-frame reallocations
  * Removed `Res<VisiblePointLights>` (a vec) in favor of  `Res<GlobalVisiblePointLights>` (a hashmap). We were allocating a new hashmap every frame, the collecting it into a vec every frame, then in another system _re-generating the hashmap_. It is always used like a hashmap, might as well embrace that. We now reuse the same hashmap every frame and dont use any intermediate collections.
  * We were re-allocating Clusters aabb and light vectors every frame by re-constructing Clusters every frame. We now re-use the existing collections.
  * Reuse per-camera VisiblePointLight vecs when possible instead of allocating them every frame. We now only insert VisiblePointLights if the component doesn't exist yet.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior C-Performance A change motivated by improving speed, memory usage or compile times
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

multiple_windows example is broken
5 participants