Override QueryIter::fold to port Query::for_each perf gains to select Iterator combinators #6773

Merged
merged 30 commits into from
Dec 1, 2023

Conversation

james7132
Member

@james7132 james7132 commented Nov 27, 2022

Objective

After #6547, Query::for_each has been capable of automatic vectorization on certain queries, which yields notable improvements (>50% less CPU time) for iteration. However, Query::for_each isn't idiomatic Rust, and lacks the flexibility of iterator combinators.

Ideally, Query::iter and friends should be able to achieve the same results. However, this does seem to be blocked upstream (rust-lang/rust#104914) by Rust's loop optimizations.

Solution

This is an intermediate solution and refactor. This moves the Query::for_each implementation onto the Iterator::fold implementation for QueryIter instead. This should result in the same automatic vectorization optimization on all Iterator functions that internally use fold, including Iterator::for_each, Iterator::count, etc.

With this, it should close the gap between the two completely. Internally, this PR changes Query::for_each to use query.iter().for_each(..) instead of the duplicated implementation.
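To illustrate why overriding fold helps, here is a minimal sketch using a hypothetical TableIter over plain slices. It is not Bevy's QueryIter; the type, its fields, and sum_all are assumptions made up for this example. The default next has to re-check which table is active on every call, while the fold override runs one tight inner loop per table. Standard library adapters such as Iterator::for_each, Iterator::count, and Iterator::sum are implemented on top of fold by default, so they pick up the override automatically.

```rust
/// Hypothetical stand-in for a query iterator: walks several contiguous
/// "tables" (plain Vecs here) one after another. Not Bevy's actual type.
struct TableIter<'a> {
    tables: std::slice::Iter<'a, Vec<u32>>,
    current: std::slice::Iter<'a, u32>,
}

impl<'a> TableIter<'a> {
    fn new(tables: &'a [Vec<u32>]) -> Self {
        // Start with an empty "current table"; next() or fold() moves on.
        let empty: &'a [u32] = &[];
        Self { tables: tables.iter(), current: empty.iter() }
    }
}

impl<'a> Iterator for TableIter<'a> {
    type Item = &'a u32;

    // External iteration: each call re-checks which table is active.
    fn next(&mut self) -> Option<Self::Item> {
        loop {
            if let Some(item) = self.current.next() {
                return Some(item);
            }
            self.current = self.tables.next()?.iter();
        }
    }

    // Internal iteration: a simple nested loop with no per-item state checks.
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, Self::Item) -> B,
    {
        let mut acc = init;
        // Finish any partially consumed table first.
        for item in self.current {
            acc = f(acc, item);
        }
        // Then walk the remaining tables, one tight loop each.
        for table in self.tables {
            for item in table {
                acc = f(acc, item);
            }
        }
        acc
    }
}

/// Uses Iterator::for_each, which goes through the fold override above.
fn sum_all(tables: &[Vec<u32>]) -> u32 {
    let mut total = 0;
    TableIter::new(tables).for_each(|v| total += *v);
    total
}
```

Calling sum_all on a slice of tables gives the same result as a plain for loop over TableIter::new(tables); only the code path differs. Whether the compiler actually vectorizes the inner loop depends on the element type and the closure body.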

Separately, the duplicate implementations of internal iteration (i.e. Query::par_for_each) now use portions of the current Query::for_each implementation factored out into their own functions.

This also massively cleans up our internal fragmentation of internal iteration options, deduplicating the iteration code used in for_each and par_iter().for_each().


Changelog

Changed: Query::for_each and Query::for_each_mut have been moved to QueryIter's Iterator::for_each implementation, and still retain their performance improvements over normal iteration. These APIs are deprecated in 0.13 and will be removed in 0.14.

@james7132 james7132 added A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times labels Nov 27, 2022
@scottmcm
Contributor

Oh, man, this hadn't happened yet? It came up on Discord back in Feb 2021; I guess I should have opened an issue.

As an FYI, there's a chance that implementing try_fold will happen even before all the traits are stabilized: rust-lang/rust#84277 (comment). If you end up needing that, please post about it to the tracking issue.

@james7132
Member Author

james7132 commented Feb 18, 2023

Finally was able to address the performance issues, so I'm taking this out of draft. Benchmark results:

group                                           main                                    query-iter-fold
-----                                           ----                                    ---------------
add_remove/sparse_set                           1.00   783.2±35.54µs        ? ?/sec     1.03   808.5±73.03µs        ? ?/sec
add_remove/table                                1.00   1141.2±8.12µs        ? ?/sec     1.03  1178.1±35.55µs        ? ?/sec
add_remove_big/sparse_set                       1.00  901.6±213.54µs        ? ?/sec     1.04  936.2±248.48µs        ? ?/sec
add_remove_big/table                            1.00      2.4±0.01ms        ? ?/sec     1.01      2.4±0.06ms        ? ?/sec
added_archetypes/archetype_count/100            1.15   261.0±11.59µs        ? ?/sec     1.00   226.4±10.16µs        ? ?/sec
added_archetypes/archetype_count/1000           1.30   901.8±57.76µs        ? ?/sec     1.00   695.1±40.32µs        ? ?/sec
added_archetypes/archetype_count/10000          1.00      9.5±0.53ms        ? ?/sec     1.02      9.7±0.77ms        ? ?/sec
added_archetypes/archetype_count/200            1.15   330.8±15.14µs        ? ?/sec     1.00   286.8±18.38µs        ? ?/sec
added_archetypes/archetype_count/2000           1.20  1601.8±181.73µs        ? ?/sec    1.00  1332.3±105.68µs        ? ?/sec
added_archetypes/archetype_count/500            1.29   581.2±35.53µs        ? ?/sec     1.00   452.2±33.39µs        ? ?/sec
added_archetypes/archetype_count/5000           1.00      3.5±0.20ms        ? ?/sec     1.00      3.5±0.24ms        ? ?/sec
build_schedule/1000_schedule                    1.00       3.4±0.07s        ? ?/sec     1.01       3.4±0.09s        ? ?/sec
build_schedule/1000_schedule_noconstraints      1.00    154.3±3.29ms        ? ?/sec     1.00    154.6±3.22ms        ? ?/sec
build_schedule/100_schedule                     1.02     19.4±0.08ms        ? ?/sec     1.00     18.9±0.13ms        ? ?/sec
build_schedule/100_schedule_noconstraints       1.19      2.1±0.02ms        ? ?/sec     1.00  1790.3±31.90µs        ? ?/sec
build_schedule/500_schedule                     1.00    622.0±5.47ms        ? ?/sec     1.06   661.9±12.40ms        ? ?/sec
build_schedule/500_schedule_noconstraints       1.03     37.9±0.40ms        ? ?/sec     1.00     36.9±0.73ms        ? ?/sec
busy_systems/01x_entities_03_systems            1.25     40.1±1.40µs        ? ?/sec     1.00     32.0±1.42µs        ? ?/sec
busy_systems/01x_entities_06_systems            1.24     72.5±1.57µs        ? ?/sec     1.00     58.3±2.88µs        ? ?/sec
busy_systems/01x_entities_09_systems            1.16     92.3±1.62µs        ? ?/sec     1.00     79.7±1.78µs        ? ?/sec
busy_systems/01x_entities_12_systems            1.14    114.5±2.03µs        ? ?/sec     1.00    100.3±3.22µs        ? ?/sec
busy_systems/01x_entities_15_systems            1.20    138.7±2.29µs        ? ?/sec     1.00    115.1±3.32µs        ? ?/sec
busy_systems/02x_entities_03_systems            1.18     57.3±1.76µs        ? ?/sec     1.00     48.4±2.55µs        ? ?/sec
busy_systems/02x_entities_06_systems            1.24    106.2±2.51µs        ? ?/sec     1.00     85.5±3.02µs        ? ?/sec
busy_systems/02x_entities_09_systems            1.18    146.6±4.28µs        ? ?/sec     1.00    124.7±4.55µs        ? ?/sec
busy_systems/02x_entities_12_systems            1.17    183.7±4.88µs        ? ?/sec     1.00    156.9±4.99µs        ? ?/sec
busy_systems/02x_entities_15_systems            1.16    222.8±6.38µs        ? ?/sec     1.00    191.8±4.95µs        ? ?/sec
busy_systems/03x_entities_03_systems            1.18     74.6±3.76µs        ? ?/sec     1.00     63.4±2.35µs        ? ?/sec
busy_systems/03x_entities_06_systems            1.40    153.2±4.98µs        ? ?/sec     1.00    109.6±3.39µs        ? ?/sec
busy_systems/03x_entities_09_systems            1.17    196.4±7.89µs        ? ?/sec     1.00    167.9±4.04µs        ? ?/sec
busy_systems/03x_entities_12_systems            1.18    255.1±6.12µs        ? ?/sec     1.00    216.4±6.41µs        ? ?/sec
busy_systems/03x_entities_15_systems            1.16    312.1±9.17µs        ? ?/sec     1.00    269.3±8.96µs        ? ?/sec
busy_systems/04x_entities_03_systems            1.17     95.6±4.21µs        ? ?/sec     1.00     81.7±2.76µs        ? ?/sec
busy_systems/04x_entities_06_systems            1.10    154.3±7.39µs        ? ?/sec     1.00    140.5±8.90µs        ? ?/sec
busy_systems/04x_entities_09_systems            1.20    249.7±7.97µs        ? ?/sec     1.00    207.5±7.37µs        ? ?/sec
busy_systems/04x_entities_12_systems            1.23   339.1±12.19µs        ? ?/sec     1.00    275.6±7.88µs        ? ?/sec
busy_systems/04x_entities_15_systems            1.25   425.0±14.00µs        ? ?/sec     1.00   339.3±13.37µs        ? ?/sec
busy_systems/05x_entities_03_systems            1.22    116.1±5.73µs        ? ?/sec     1.00     95.4±4.06µs        ? ?/sec
busy_systems/05x_entities_06_systems            1.21    206.6±9.36µs        ? ?/sec     1.00    170.9±4.78µs        ? ?/sec
busy_systems/05x_entities_09_systems            1.25   308.7±18.26µs        ? ?/sec     1.00    246.2±5.96µs        ? ?/sec
busy_systems/05x_entities_12_systems            1.32   430.0±19.66µs        ? ?/sec     1.00    324.7±9.28µs        ? ?/sec
busy_systems/05x_entities_15_systems            1.24   501.0±11.96µs        ? ?/sec     1.00    403.3±9.83µs        ? ?/sec
contrived/01x_entities_03_systems               1.27     36.2±1.16µs        ? ?/sec     1.00     28.4±1.42µs        ? ?/sec
contrived/01x_entities_06_systems               1.28     58.0±1.10µs        ? ?/sec     1.00     45.3±1.36µs        ? ?/sec
contrived/01x_entities_09_systems               1.30     77.7±1.47µs        ? ?/sec     1.00     59.6±2.95µs        ? ?/sec
contrived/01x_entities_12_systems               1.30     97.7±1.31µs        ? ?/sec     1.00     75.4±2.91µs        ? ?/sec
contrived/01x_entities_15_systems               1.27    114.3±8.74µs        ? ?/sec     1.00     90.0±4.64µs        ? ?/sec
contrived/02x_entities_03_systems               1.19     45.1±1.10µs        ? ?/sec     1.00     37.8±1.92µs        ? ?/sec
contrived/02x_entities_06_systems               1.21     77.2±1.62µs        ? ?/sec     1.00     63.9±2.74µs        ? ?/sec
contrived/02x_entities_09_systems               1.19    103.4±1.85µs        ? ?/sec     1.00     86.6±3.58µs        ? ?/sec
contrived/02x_entities_12_systems               1.22    134.5±2.26µs        ? ?/sec     1.00    110.0±4.38µs        ? ?/sec
contrived/02x_entities_15_systems               1.24    163.1±2.54µs        ? ?/sec     1.00    131.1±4.89µs        ? ?/sec
contrived/03x_entities_03_systems               1.19     54.5±1.74µs        ? ?/sec     1.00     45.7±1.91µs        ? ?/sec
contrived/03x_entities_06_systems               1.36     96.6±3.39µs        ? ?/sec     1.00     71.2±2.63µs        ? ?/sec
contrived/03x_entities_09_systems               1.31    129.8±4.89µs        ? ?/sec     1.00     99.1±2.95µs        ? ?/sec
contrived/03x_entities_12_systems               1.32    164.7±2.86µs        ? ?/sec     1.00    124.7±4.15µs        ? ?/sec
contrived/03x_entities_15_systems               1.33    200.1±4.71µs        ? ?/sec     1.00    150.7±4.68µs        ? ?/sec
contrived/04x_entities_03_systems               1.20     63.3±1.55µs        ? ?/sec     1.00     52.7±2.79µs        ? ?/sec
contrived/04x_entities_06_systems               1.34    112.3±2.68µs        ? ?/sec     1.00     83.6±2.81µs        ? ?/sec
contrived/04x_entities_09_systems               1.29    151.3±2.71µs        ? ?/sec     1.00    117.4±3.46µs        ? ?/sec
contrived/04x_entities_12_systems               1.32    194.1±4.25µs        ? ?/sec     1.00    147.3±4.79µs        ? ?/sec
contrived/04x_entities_15_systems               1.33    239.3±7.97µs        ? ?/sec     1.00    179.4±4.79µs        ? ?/sec
contrived/05x_entities_03_systems               1.19     69.3±1.63µs        ? ?/sec     1.00     58.2±1.63µs        ? ?/sec
contrived/05x_entities_06_systems               1.27    121.1±2.50µs        ? ?/sec     1.00     95.1±4.32µs        ? ?/sec
contrived/05x_entities_09_systems               1.27    166.2±3.55µs        ? ?/sec     1.00    131.0±4.68µs        ? ?/sec
contrived/05x_entities_12_systems               1.27    218.8±4.35µs        ? ?/sec     1.00    172.7±7.30µs        ? ?/sec
contrived/05x_entities_15_systems               1.27    268.7±8.17µs        ? ?/sec     1.00    211.0±8.86µs        ? ?/sec
empty_commands/0_entities                       1.07      4.4±0.02ns        ? ?/sec     1.00      4.1±0.02ns        ? ?/sec
empty_systems/000_systems                       1.00      6.1±0.15ns        ? ?/sec     1.00      6.1±0.09ns        ? ?/sec
empty_systems/001_systems                       1.85     17.4±1.10µs        ? ?/sec     1.00      9.4±2.41µs        ? ?/sec
empty_systems/002_systems                       1.26     18.7±1.14µs        ? ?/sec     1.00     14.8±1.74µs        ? ?/sec
empty_systems/003_systems                       1.24     19.8±1.07µs        ? ?/sec     1.00     16.0±1.02µs        ? ?/sec
empty_systems/004_systems                       1.49     19.9±1.29µs        ? ?/sec     1.00     13.4±0.77µs        ? ?/sec
empty_systems/005_systems                       1.36     19.6±1.59µs        ? ?/sec     1.00     14.4±0.97µs        ? ?/sec
empty_systems/010_systems                       1.28     27.5±0.75µs        ? ?/sec     1.00     21.5±1.44µs        ? ?/sec
empty_systems/015_systems                       1.44     36.8±0.53µs        ? ?/sec     1.00     25.6±2.57µs        ? ?/sec
empty_systems/020_systems                       1.32     41.1±1.60µs        ? ?/sec     1.00     31.1±3.33µs        ? ?/sec
empty_systems/025_systems                       1.27     47.2±1.01µs        ? ?/sec     1.00     37.3±2.43µs        ? ?/sec
empty_systems/030_systems                       1.35     53.5±1.18µs        ? ?/sec     1.00     39.7±2.71µs        ? ?/sec
empty_systems/035_systems                       1.39     59.7±1.32µs        ? ?/sec     1.00     42.8±2.83µs        ? ?/sec
empty_systems/040_systems                       1.39     68.1±2.71µs        ? ?/sec     1.00     48.9±3.06µs        ? ?/sec
empty_systems/045_systems                       1.31     75.9±1.49µs        ? ?/sec     1.00     57.8±2.94µs        ? ?/sec
empty_systems/050_systems                       1.38     84.6±2.02µs        ? ?/sec     1.00     61.2±4.58µs        ? ?/sec
empty_systems/055_systems                       1.36     93.4±3.12µs        ? ?/sec     1.00     68.5±4.42µs        ? ?/sec
empty_systems/060_systems                       1.38    100.9±2.31µs        ? ?/sec     1.00     73.3±4.36µs        ? ?/sec
empty_systems/065_systems                       1.31    109.3±3.51µs        ? ?/sec     1.00     83.3±3.58µs        ? ?/sec
empty_systems/070_systems                       1.31    116.9±2.85µs        ? ?/sec     1.00     89.4±3.10µs        ? ?/sec
empty_systems/075_systems                       1.32    125.4±2.59µs        ? ?/sec     1.00     94.8±3.58µs        ? ?/sec
empty_systems/080_systems                       1.30    133.2±3.27µs        ? ?/sec     1.00    102.2±2.76µs        ? ?/sec
empty_systems/085_systems                       1.31    142.0±2.28µs        ? ?/sec     1.00    108.4±3.65µs        ? ?/sec
empty_systems/090_systems                       1.26    149.2±2.61µs        ? ?/sec     1.00    118.1±3.51µs        ? ?/sec
empty_systems/095_systems                       1.24    157.1±3.51µs        ? ?/sec     1.00    126.4±3.88µs        ? ?/sec
empty_systems/100_systems                       1.23    163.3±2.72µs        ? ?/sec     1.00    133.3±3.63µs        ? ?/sec
fake_commands/2000_commands                     1.00      6.8±0.02µs        ? ?/sec     1.04      7.1±0.08µs        ? ?/sec
fake_commands/4000_commands                     1.00     13.7±0.05µs        ? ?/sec     1.04     14.3±0.03µs        ? ?/sec
fake_commands/6000_commands                     1.00     20.7±0.10µs        ? ?/sec     1.04     21.4±0.05µs        ? ?/sec
fake_commands/8000_commands                     1.00     27.6±0.10µs        ? ?/sec     1.04     28.6±0.07µs        ? ?/sec
get_or_spawn/batched                            1.02   365.4±11.94µs        ? ?/sec     1.00   359.2±11.34µs        ? ?/sec
get_or_spawn/individual                         1.02   544.0±40.05µs        ? ?/sec     1.00   535.7±41.96µs        ? ?/sec
heavy_compute/base                              1.02    214.4±1.33µs        ? ?/sec     1.00    211.0±1.88µs        ? ?/sec
insert_commands/insert                          1.00   452.4±38.52µs        ? ?/sec     1.00   452.8±38.28µs        ? ?/sec
insert_commands/insert_batch                    1.00   364.1±14.96µs        ? ?/sec     1.00   365.2±13.51µs        ? ?/sec
insert_simple/base                              1.01    431.7±3.70µs        ? ?/sec     1.00    425.7±2.33µs        ? ?/sec
insert_simple/unbatched                         1.00   728.8±12.66µs        ? ?/sec     1.05   762.0±13.74µs        ? ?/sec
iter_fragmented/base                            1.02    340.9±4.03ns        ? ?/sec     1.00    335.6±7.29ns        ? ?/sec
iter_fragmented/foreach                         1.00   166.0±25.83ns        ? ?/sec     1.09   180.3±35.82ns        ? ?/sec
iter_fragmented/foreach_wide                    1.00      3.7±0.05µs        ? ?/sec     1.24      4.6±0.05µs        ? ?/sec
iter_fragmented/wide                            1.00      3.8±0.10µs        ? ?/sec     1.01      3.9±0.12µs        ? ?/sec
iter_fragmented_sparse/base                     1.00      7.6±0.21ns        ? ?/sec     1.46     11.1±0.86ns        ? ?/sec
iter_fragmented_sparse/foreach                  1.02      7.9±0.25ns        ? ?/sec     1.00      7.8±0.12ns        ? ?/sec
iter_fragmented_sparse/foreach_wide             1.00     39.2±0.43ns        ? ?/sec     1.61     63.0±0.41ns        ? ?/sec
iter_fragmented_sparse/wide                     1.00     42.3±2.07ns        ? ?/sec     1.01     42.7±0.81ns        ? ?/sec
iter_simple/base                                1.00      8.3±0.02µs        ? ?/sec     1.01      8.3±0.03µs        ? ?/sec
iter_simple/foreach                             1.01      8.5±0.01µs        ? ?/sec     1.00      8.4±0.02µs        ? ?/sec
iter_simple/foreach_sparse_set                  1.01     25.9±0.14µs        ? ?/sec     1.00     25.6±0.35µs        ? ?/sec
iter_simple/foreach_wide                        1.00     41.8±0.39µs        ? ?/sec     1.11     46.3±0.71µs        ? ?/sec
iter_simple/foreach_wide_sparse_set             1.13   130.0±57.31µs        ? ?/sec     1.00    114.5±0.91µs        ? ?/sec
iter_simple/sparse_set                          1.00     28.8±0.21µs        ? ?/sec     1.02     29.4±0.23µs        ? ?/sec
iter_simple/system                              1.00      8.3±0.02µs        ? ?/sec     1.00      8.3±0.02µs        ? ?/sec
iter_simple/wide                                1.00     39.7±0.83µs        ? ?/sec     1.03     40.8±1.26µs        ? ?/sec
iter_simple/wide_sparse_set                     1.01    128.1±0.81µs        ? ?/sec     1.00    126.4±1.22µs        ? ?/sec
no_archetypes/system_count/0                    1.00      6.1±0.13ns        ? ?/sec     1.00      6.1±0.06ns        ? ?/sec
no_archetypes/system_count/100                  1.28    164.2±2.62µs        ? ?/sec     1.00    128.5±4.39µs        ? ?/sec
no_archetypes/system_count/20                   1.49     39.4±0.67µs        ? ?/sec     1.00     26.5±1.16µs        ? ?/sec
no_archetypes/system_count/40                   1.60     67.0±1.84µs        ? ?/sec     1.00     41.9±2.38µs        ? ?/sec
no_archetypes/system_count/60                   1.55    100.7±1.53µs        ? ?/sec     1.00     65.1±3.83µs        ? ?/sec
no_archetypes/system_count/80                   1.36    132.4±1.73µs        ? ?/sec     1.00     97.5±3.17µs        ? ?/sec
query_get/50000_entities_sparse                 1.00    289.0±3.51µs        ? ?/sec     1.00    288.9±0.83µs        ? ?/sec
query_get/50000_entities_table                  1.00    261.5±1.57µs        ? ?/sec     1.00    262.2±1.24µs        ? ?/sec
query_get_component/50000_entities_sparse       1.00    694.2±4.82µs        ? ?/sec     1.01   697.8±18.39µs        ? ?/sec
query_get_component/50000_entities_table        1.01    610.7±3.81µs        ? ?/sec     1.00    604.3±4.82µs        ? ?/sec
query_get_component_simple/system               1.00    579.7±6.19µs        ? ?/sec     1.03  599.8±107.32µs        ? ?/sec
query_get_component_simple/unchecked            1.00    686.4±4.34µs        ? ?/sec     1.00    686.4±9.30µs        ? ?/sec
query_get_many_10/50000_calls_sparse            1.00      4.4±0.40ms        ? ?/sec     1.05      4.6±0.54ms        ? ?/sec
query_get_many_10/50000_calls_table             1.00      4.0±0.38ms        ? ?/sec     1.04      4.2±0.39ms        ? ?/sec
query_get_many_2/50000_calls_sparse             1.01   639.0±85.54µs        ? ?/sec     1.00   630.4±51.38µs        ? ?/sec
query_get_many_2/50000_calls_table              1.00   657.6±53.77µs        ? ?/sec     1.00   655.5±41.88µs        ? ?/sec
query_get_many_5/50000_calls_sparse             1.00      2.0±0.25ms        ? ?/sec     1.05      2.1±0.35ms        ? ?/sec
query_get_many_5/50000_calls_table              1.00  1790.3±76.26µs        ? ?/sec     1.01  1810.1±123.77µs        ? ?/sec
schedule/base                                   1.12     37.4±1.86µs        ? ?/sec     1.00     33.5±1.86µs        ? ?/sec
sized_commands_0_bytes/2000_commands            1.00      4.4±0.03µs        ? ?/sec     1.01      4.4±0.05µs        ? ?/sec
sized_commands_0_bytes/4000_commands            1.01      8.8±0.04µs        ? ?/sec     1.00      8.8±0.45µs        ? ?/sec
sized_commands_0_bytes/6000_commands            1.01     13.3±0.06µs        ? ?/sec     1.00     13.1±0.03µs        ? ?/sec
sized_commands_0_bytes/8000_commands            1.00     17.8±0.08µs        ? ?/sec     1.00     17.7±0.09µs        ? ?/sec
sized_commands_12_bytes/2000_commands           1.00      4.8±0.02µs        ? ?/sec     1.00      4.8±0.01µs        ? ?/sec
sized_commands_12_bytes/4000_commands           1.00      9.6±0.04µs        ? ?/sec     1.02      9.8±0.02µs        ? ?/sec
sized_commands_12_bytes/6000_commands           1.01     14.5±0.07µs        ? ?/sec     1.00     14.4±0.08µs        ? ?/sec
sized_commands_12_bytes/8000_commands           1.01     19.4±0.12µs        ? ?/sec     1.00     19.2±0.32µs        ? ?/sec
sized_commands_512_bytes/2000_commands          1.00     58.3±1.85µs        ? ?/sec     1.00     58.2±1.94µs        ? ?/sec
sized_commands_512_bytes/4000_commands          1.00    118.6±8.59µs        ? ?/sec     1.00    118.1±8.37µs        ? ?/sec
sized_commands_512_bytes/6000_commands          1.01   182.7±23.39µs        ? ?/sec     1.00   181.5±22.46µs        ? ?/sec
sized_commands_512_bytes/8000_commands          1.01   243.7±34.63µs        ? ?/sec     1.00   242.0±30.64µs        ? ?/sec
spawn_commands/2000_entities                    1.00    169.8±7.39µs        ? ?/sec     1.03    174.4±4.21µs        ? ?/sec
spawn_commands/4000_entities                    1.00   352.3±14.06µs        ? ?/sec     1.02    358.3±9.68µs        ? ?/sec
spawn_commands/6000_entities                    1.00   519.5±19.88µs        ? ?/sec     1.04   539.4±18.56µs        ? ?/sec
spawn_commands/8000_entities                    1.00   696.6±22.98µs        ? ?/sec     1.05   730.7±25.27µs        ? ?/sec
spawn_world/10000_entities                      1.00   825.2±71.83µs        ? ?/sec     1.04   855.0±76.43µs        ? ?/sec
spawn_world/1000_entities                       1.00     82.9±7.85µs        ? ?/sec     1.04     86.1±8.78µs        ? ?/sec
spawn_world/100_entities                        1.00      8.2±0.85µs        ? ?/sec     1.03      8.4±0.85µs        ? ?/sec
spawn_world/10_entities                         1.00   828.7±76.28ns        ? ?/sec     1.04   857.9±81.80ns        ? ?/sec
spawn_world/1_entities                          1.00     83.0±7.56ns        ? ?/sec     1.06     88.1±9.74ns        ? ?/sec
world_entity/50000_entities                     1.00    120.4±0.94µs        ? ?/sec     1.00    119.9±0.38µs        ? ?/sec
world_get/50000_entities_sparse                 1.01    202.8±0.96µs        ? ?/sec     1.00    201.5±0.73µs        ? ?/sec
world_get/50000_entities_table                  1.00    169.3±1.19µs        ? ?/sec     1.00    169.1±6.02µs        ? ?/sec
world_query_for_each/50000_entities_sparse      1.01     53.7±0.25µs        ? ?/sec     1.00     53.3±0.19µs        ? ?/sec
world_query_for_each/50000_entities_table       1.00     27.3±0.10µs        ? ?/sec     1.00     27.1±0.03µs        ? ?/sec
world_query_get/50000_entities_sparse           1.00     96.4±0.90µs        ? ?/sec     1.00     95.9±0.66µs        ? ?/sec
world_query_get/50000_entities_sparse_wide      1.02    195.3±1.77µs        ? ?/sec     1.00    191.8±0.86µs        ? ?/sec
world_query_get/50000_entities_table            1.00    154.5±1.01µs        ? ?/sec     1.00    155.2±1.07µs        ? ?/sec
world_query_get/50000_entities_table_wide       1.00    234.4±0.88µs        ? ?/sec     1.00    233.8±0.62µs        ? ?/sec
world_query_iter/50000_entities_sparse          1.00     54.3±0.28µs        ? ?/sec     1.01     54.9±6.69µs        ? ?/sec
world_query_iter/50000_entities_table           1.00     27.2±0.08µs        ? ?/sec     1.00     27.2±0.11µs        ? ?/sec

@james7132 james7132 marked this pull request as ready for review February 18, 2023 05:12
@james7132 james7132 added the C-Code-Quality A section of code that is hard to understand or change label Feb 18, 2023
Member

@alice-i-cecile alice-i-cecile left a comment

Great safety comments. I think the unsafe is absolutely worth it to close this gap. Agreed on the deprecation.

@alice-i-cecile
Member

@InBetweenNames, I'd appreciate your review on this :)

Member

@JoJoJet JoJoJet left a comment

Very nice work, it's good to see this get improved upon. I especially like how the folding behavior was split into two separate functions for fold_table and fold_archetype -- it feels much cleaner this way.

james7132 and others added 2 commits February 18, 2023 16:19
Co-authored-by: JoJoJet <21144246+JoJoJet@users.noreply.github.com>
Member

@cart cart left a comment

I'm on board for this change and the code looks good to me. Finally deprecating for_each!

@InBetweenNames

Yeah this does make sense to me. The #6161 PR can be adapted to use the new style.

Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Nov 26, 2023
@alice-i-cecile
Member

@james7132, unless I hear otherwise, I'm going to merge this tomorrow :)

james7132 added a commit to james7132/bevy_asm_tests that referenced this pull request Nov 26, 2023
@james7132
Member Author

As a final sanity check, the codegen of this PR seems to show no tangible difference from what is in main right now: james7132/bevy_asm_tests@309947c#diff-4c4b34cf83f523fced3bd396ad7ab8e228b4d35bf65c1f0457f7e4e58b14ccc5.

@alice-i-cecile 👍

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Nov 28, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 28, 2023
@JMS55
Contributor

JMS55 commented Nov 28, 2023

Changelog needs updating btw. Also, this is the first I'm hearing of it, but using for_each() instead of for x in query is faster?

@alice-i-cecile
Member

@james7132 sorry, you lost the coin flip on the merge conflicts :( Merge it yourself when you're done <3

@scottmcm
Contributor

@JMS55 Roughly, for loops can break but Iterator::for_each cannot, so using the method can sometimes save some work by not needing to worry about that. Whether it matters depends greatly on the exact iterator type in question -- it doesn't at all for Range or slice iterators -- and the overhead is usually small, so for a loop body doing a chunky amount of work the overhead of for is usually immeasurably small.

But it's easy to come up with examples where the Query iterator is non-trivial and the work to be done is simple, and thus using for_each instead can be a nice perf gain.
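As a rough sketch of the distinction described above, the following compares a for loop and Iterator::for_each over a standard Chain iterator, which is one example of an iterator whose next has to track internal state. The function names are hypothetical and chosen only for this illustration; no timing claims are made here, since any difference depends on the iterator type and the loop body.

```rust
/// External iteration: the loop calls next() repeatedly and may exit early.
fn sum_with_for(a: &[u32], b: &[u32]) -> u32 {
    let mut total = 0;
    for v in a.iter().chain(b) {
        total += *v;
    }
    total
}

/// Internal iteration: the iterator drives the closure itself and always
/// runs to completion, so it never has to support resuming after a break.
fn sum_with_for_each(a: &[u32], b: &[u32]) -> u32 {
    let mut total = 0;
    a.iter().chain(b).for_each(|v| total += *v);
    total
}

/// A for loop can stop as soon as it finds what it wants; for_each cannot.
fn first_over_with_for(a: &[u32], b: &[u32], limit: u32) -> Option<u32> {
    let mut found = None;
    for v in a.iter().chain(b) {
        if *v > limit {
            found = Some(*v);
            break;
        }
    }
    found
}
```

Both sum functions return the same value for the same inputs. The for_each form is the one that benefits from a specialized fold on the iterator, while the for form remains the natural choice when the loop needs to break.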

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Dec 1, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 1, 2023
@james7132 james7132 added this pull request to the merge queue Dec 1, 2023
Merged via the queue into bevyengine:main with commit 2148518 Dec 1, 2023
22 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Dec 1, 2023
# Objective

Resolves Issue #10772.

## Solution

Added the deprecation warning for QueryState::for_each_unchecked, as
noted in the comments of PR #6773.
Followed the wording in the deprecation messages for `for_each` and
`for_each_mut`.
Labels
A-ECS Entities, components, systems, and events C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
10 participants