Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - extract topsort logic to a new method, one pass to detect cycles and … #7727

Closed
wants to merge 4 commits into from

Conversation

shuoli84
Copy link
Contributor

@shuoli84 shuoli84 commented Feb 17, 2023

…top sort. reduce mem alloc

Objective

  • Reduce alloc count.
  • Improve code quality.

Solution

  • use TarjanScc::run directly, which calls a closure with each scc, in closure, we can detect cycles and flatten nodes

@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times labels Feb 17, 2023
@alice-i-cecile
Copy link
Member

Please remove the Changelog and Migration Guide completely for this PR :) It's less distracting, and avoids any automated tools picking them up.

@alice-i-cecile alice-i-cecile added the C-Code-Quality A section of code that is hard to understand or change label Feb 17, 2023
@maniwani
Copy link
Contributor

maniwani commented Feb 17, 2023

Factoring this logic into a topsort_graph makes the code easier to read, but none of this really reduces memory allocation.

tarjan_scc produces a Vec<Vec<G::NodeId>>. The iterator version, TarjanScc::run, still produces a Vec<G::NodeId> for each strongly-connected component, and then you push them into cycles. So you end up with the same result. The outer vector may end up smaller, or it might not since capacity usually grows exponentially (e.g. capacity doubles each realloc).

Also, I don't understand why you renamed strongly_connected_components into cycles and nodes_with_cycles. That's less clear. SCCs contain cycles. They are not cycles themselves. I would change it back (or sccs_with_cycles).

@james7132
Copy link
Member

tarjan_scc produces a Vec<Vec<G::NodeId>>. The iterator version, TarjanScc::run, still produces a Vec<G::NodeId> for each strongly-connected component, and then you push them into cycles. So you end up with the same result. The outer vector may end up smaller, or it might not since capacity usually grows exponentially (e.g. capacity doubles each realloc).

Is there a way for us to early return (without catching a panic) as we're processing each SCC? It'd reduce the total memory usage to the maximum number of nodes in each SCC.

@maniwani
Copy link
Contributor

maniwani commented Feb 17, 2023

Is there a way for us to early return (without catching a panic) as we're processing each SCC? It'd reduce the total memory usage to the maximum number of nodes in each SCC.

Can you clarify what you're asking? The number of G::NodeId you get from tarjan_scc is always the number of nodes in the graph. They're just grouped into strongly-connected components.

If the question is: "Can we return as soon as we find an SCC with 2+ nodes?" I would ask, which do you prefer? Do you want all cycles reported, as in #7463? Then no. If you (and most users) are OK with maybe having to rebuild over and over to find and fix every cycle, then yes.

It's technically possible to avoid allocating a Vec<G::NodeId> when the SCC is a single node, using an enum:

enum StronglyConnectedComponent {
    Single(NodeId),
    Multiple(Vec<NodeId>),
}

But that would have to wait for bevy_graph.

The underlying algorithm doesn't actually need to allocate these while it's running. Putting the nodes of an SCC together in a Vec is just a "post-processing step". You can allocate each time you find one or wait until they're all found.

@shuoli84
Copy link
Contributor Author

shuoli84 commented Feb 18, 2023

Factoring this logic into a topsort_graph makes the code easier to read, but none of this really reduces memory allocation.

Maybe I missed something, so just made a test suit to verify whether mem alloc reduced. https://github.com/shuoli84/test_petgraph_scc_alloc. From the run, the mem alloc for schedule building reduced by around 30%, with system count 900, main alloc 8395 times, vs the optimized version 5690.

tarjan_scc produces a Vec<Vec<G::NodeId>>. The iterator version, TarjanScc::run, still produces a Vec<G::NodeId> for each strongly-connected component, and then you push them into cycles. So you end up with the same result. The outer vector may end up smaller, or it might not since capacity usually grows exponentially (e.g. capacity doubles each realloc).

Also, I don't understand why you renamed strongly_connected_components into cycles and nodes_with_cycles. That's less clear. SCCs contain cycles. They are not cycles themselves. I would change it back (or sccs_with_cycles).

renamed it to sccs_with_cycles.

EDIT: format

@maniwani
Copy link
Contributor

maniwani commented Feb 19, 2023

shuoli84/test_petgraph_scc_alloc. From the run, the mem alloc for schedule building reduced by around 30%, with system count 900, main alloc 8395 times, vs the optimized version 5690.

Oh, I see now. You prepare the topsorted vector in one allocation. I spoke too soon. Sorry about that! Good catch!

// This is allocated once and never needs to reallocate.
let mut rev_top_sorted_nodes = Vec::<NodeId>::with_capacity(graph.node_count());
/* ... */

tarjan_scc.run(graph, |scc| {
    /* ... */
    // If this graph is acyclic, then we can just reverse this when we're done.
    rev_top_sorted_nodes.extend_from_slice(scc);
});

Yeah, the flattening and collect in sccs.into_iter().flatten().rev().collect::<Vec<_>>() that's in main is definitely way more expensive since we're squashing N vectors into 1.

Will approve once you resolve the CI errors.

@maniwani maniwani added this to the 0.10 milestone Feb 19, 2023
@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Feb 19, 2023
@alice-i-cecile
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Feb 19, 2023
#7727)

…top sort. reduce mem alloc

# Objective

- Reduce alloc count.
- Improve code quality.

## Solution

- use `TarjanScc::run` directly, which calls a closure with each scc, in closure, we can detect cycles and flatten nodes
@bors bors bot changed the title extract topsort logic to a new method, one pass to detect cycles and … [Merged by Bors] - extract topsort logic to a new method, one pass to detect cycles and … Feb 19, 2023
@bors bors bot closed this Feb 19, 2023
myreprise1 pushed a commit to myreprise1/bevy that referenced this pull request Mar 23, 2023
bevyengine#7727)

…top sort. reduce mem alloc

# Objective

- Reduce alloc count.
- Improve code quality.

## Solution

- use `TarjanScc::run` directly, which calls a closure with each scc, in closure, we can detect cycles and flatten nodes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants