Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ConsolidateRewards batch consolidation bugs #3002

Merged
merged 10 commits into from
Aug 27, 2024

Conversation

kuegi
Copy link
Contributor

@kuegi kuegi commented Aug 17, 2024

Summary

Fix for issue #3001 , ConsolidateRewards now uses set of CScript. It doesn't need/use the CAmount anyway.

If node was started with -consolidaterewards before, it needs a reindex or clean snapshot, as this likely lead to wrong balances before.

Implications

  • Storage

    • Database reindex required
    • Database reindex optional
    • Database reindex not required
    • None
  • Consensus

    • Network upgrade required
    • Includes backward compatible changes
    • Includes consensus workarounds
    • Includes consensus refactors
    • None

@prasannavl
Copy link
Member

Neat solution and clean-up. LGTM.

prasannavl
prasannavl previously approved these changes Aug 19, 2024
@prasannavl
Copy link
Member

prasannavl commented Aug 19, 2024

One minor suggestion would be limit the scope of the set to ConsolidateRewards to avoid multiple sources of truth that can become out of sync if someone extends the code at later point etc.

Create a temp set just before ConsolidateRewards and drop it right after (scoped), or just creating a set out of balancesToMigrate inside ConsolidateRewards without changing the signature, might both accomplish it (both places where size hint is available to preallocate as well)

@prasannavl prasannavl changed the title fix concurrency issue of ConsolidateRewards fix: concurrency issue of ConsolidateRewards Aug 19, 2024
@prasannavl prasannavl added p2 Medium priority bug labels Aug 19, 2024
@kuegi
Copy link
Contributor Author

kuegi commented Aug 19, 2024

One minor suggestion would be limit the scope of the set to ConsolidateRewards to avoid multiple sources of truth that can become out of sync if someone extends the code at later point etc.

Create a temp set just before ConsolidateRewards and drop it right after (scoped), or just creating a set out of balancesToMigrate inside ConsolidateRewards without changing the signature, might both accomplish it (both places where size hint is available to preallocate as well)

Can see if a scope is feasible in the few cases where its not single source of truth already.
Change of signature is on purpose. Wrong signatures (unused argument data, mismatching container for purpose of function...) are a bad code smell cause it leads to exactly such errors.

@prasannavl
Copy link
Member

prasannavl commented Aug 26, 2024

Thanks @kuegi.

Change of signature is on purpose. Wrong signatures (unused argument data, mismatching container for purpose of function...) are a bad code smell cause it leads to exactly such errors

The C++ idiomatic way to do this would be not to use the concrete container type in the signature at all and instead pass the iterator types begin, end and a size hint, and only use that on ConsolidateRewards, so vector or set doesn't matter.

At this point here, I think we just want to get this in quickly, so OK with either approach.

Can see if a scope is feasible in the few cases where its not single source of truth already

Thanks, works - this is what I had in mind: https://github.com/DeFiCh/ain/compare/pvl/pr-3002?expand=1#diff-872ba0cbdc244b4c2d8d90afb30f4842e386ff9a22ae22c90875a8a4327ec5e9R1727-R1738

That said: the reason for the hold up of this PR: it could change ordering. Since a set is used (std::set is RB tree, which keeps the sorted form). Need more testing to ensure there aren't edge case where this changes could result in unexpected or incompatible behavior.

A quicker workaround would just be use unsorted set.

@prasannavl
Copy link
Member

prasannavl commented Aug 26, 2024

Here's my branch with it switched to unordered set, that retains ordering the same as balancesToMigrate so that we don't have to worry about this at all: 9bd2d25

In addition, should also fix another bug if it's consolidate rewards is called with both empty args and tokens.

Verifying full sync. Will merge it back into this PR once ready.

@prasannavl prasannavl changed the title fix: concurrency issue of ConsolidateRewards fix: ConsolidateRewards batch consolidation bugs Aug 26, 2024
@kuegi
Copy link
Contributor Author

kuegi commented Aug 26, 2024

The C++ idiomatic way to do this would be not to use the concrete container type in the signature at all and instead pass the iterator types begin, end and a size hint, and only use that on ConsolidateRewards, so vector or set doesn't matter.

But that again would loose the uniqueness. It must not be a vector, but a set. If the arguments allow any container, the method internally must ensure uniqueness to prevent the bug.

@prasannavl
Copy link
Member

But that again would loose the uniqueness. It must not be a vector, but a set. If the arguments allow any container, the method internally must ensure uniqueness to prevent the bug.

Yes, but a container is just a layout. It's not the responsibility of the consolidate rewards method to ensure it's unique in this case. If it is, then we build the unique list inside ConsolidateRewards.

or just creating a set out of balancesToMigrate inside ConsolidateRewards

If it's not, then it's the responsibility of the outer method, in which case, ConsolidateRewards just iterates over (this can be a vector that already has the unique list).

Anyway, unsorted_set should take care of it (which is internally, just a vector with a hash table to ensure O(1) is there check to insert).

@prasannavl
Copy link
Member

@kuegi could you merge my branch: https://github.com/DeFiCh/ain/compare/pvl/pr-3002 into this if you get a chance? I'm off my desk a bit and don't see an active branch here to push into based off the web api.

@kuegi
Copy link
Contributor Author

kuegi commented Aug 26, 2024

Thanks, works - this is what I had in mind: https://github.com/DeFiCh/ain/compare/pvl/pr-3002?expand=1#diff-872ba0cbdc244b4c2d8d90afb30f4842e386ff9a22ae22c90875a8a4327ec5e9R1727-R1738

Why did you change the scoping compared to my version? Your version needs an extra loop and has totalAccounts outside, which is actually not used anywhere else. Any reason for that?

…wards

moved CScriptHasher to script so it can be found if needed somewhere else too.
@prasannavl
Copy link
Member

prasannavl commented Aug 26, 2024

Why did you change the scoping compared to my version? Your version needs an extra loop and has totalAccounts outside, which is actually not used anywhere else. Any reason for that?

Extra loop should have no significance here as it's fully contained in-memory. In fact, I'd expect this to be very slighly micro-optimized to be faster, since it pre-allocates the memory for the container that's needed due to known size.

(It's often a misconception to think that doing things looping over it again is slower than doing it in a single loop - from modern compiler optimization to cache pipelines - these are very hard to predict and often won't be the case, as looping over it going to be very cheap as opposed to the actual operation - so code clarity and code maintenance is what I'd focus on)

@prasannavl
Copy link
Member

prasannavl commented Aug 26, 2024

The reason to do it here, is to scope it as tightly as possible. So, it's a container set that's made for consolidaterewards and discarded right away, so it doesn't accidentally leak anywhere else and the source of truth still just remains as balancesToMigrate.

@prasannavl
Copy link
Member

I might have overlooked the totalAccounts part. Let me look through - might be bug that needs to be fixed

@prasannavl
Copy link
Member

Don't think this changed the behavior of totalAccounts in anyway here. Not sure what you mean on it's impact / significance of it in scope.

That said - I'd probably make a few more changes / cleanup, as this could be changing behavior:

https://github.com/DeFiCh/ain/pull/3002/files#diff-872ba0cbdc244b4c2d8d90afb30f4842e386ff9a22ae22c90875a8a4327ec5e9R1736-R1741

This sorts things and previous it was done pre-sort (and the set ended up doing the sort again). I have to clean this up a bit more.

@prasannavl
Copy link
Member

Just leaving a note for ref until fixed: the current state cannot be merged. As it has completely different sorting behavior. set with RB tree will sort based on CScript while the previous behavior was based on the balance.

Fix:

  • Use the unsorted_set after the sorting step.

@kuegi
Copy link
Contributor Author

kuegi commented Aug 26, 2024

Don't think this changed the behavior of totalAccounts in anyway here. Not sure what you mean on it's impact / significance of it in scope.

I just realized that totalAccounts is only used for ConsolidateRewards which is why I added it to the scope in my scope implementation that I pushed a week ago. You completely ignored my implementation, and I wondered why.

@kuegi
Copy link
Contributor Author

kuegi commented Aug 26, 2024

Just leaving a note for ref until fixed: the current state cannot be merged. As it has completely different sorting behavior. set with RB tree will sort based on CScript while the previous behavior was based on the balance.

Fix:

  • Use the unsorted_set after the sorting step.

You mean in the poolSplit case? IMHO ordering should be completely irrelevant (as its done multithreaded anyway, so ordering is "random" in the execution anyway), but can easily be fixed if important for you.

@prasannavl
Copy link
Member

prasannavl commented Aug 26, 2024

You mean in the poolSplit case? IMHO ordering should be completely irrelevant (as its done multithreaded anyway, so ordering is "random" in the execution anyway), but can easily be fixed if important for you.

Yes, agree, ordering in the "ideal" case should be irrelevant. But trying to stay close to the original implementation here, so that if there were any other bugs that we haven't discovered, it still matches the existing implementation (the order in which calc rewards is run) and doesn't deviate from it.

@prasannavl
Copy link
Member

It's all in random threads as well. So, in theory there should be no chance. Just trying to be extra cautious here and play it safe to reduce the amount of things to be mindful of, since we have a lot of things coming from the current update.

@kuegi
Copy link
Contributor Author

kuegi commented Aug 26, 2024

You mean in the poolSplit case? IMHO ordering should be completely irrelevant (as its done multithreaded anyway, so ordering is "random" in the execution anyway), but can easily be fixed if important for you.

Yes, agree, ordering in the "ideal" case should be irrelevant. But trying to stay close to the original implementation here, so that if there were any other bugs that we haven't discovered, it still matches the existing implementation (the order in which calc rewards is run) and doesn't deviate from it.

ok, will fix

@prasannavl
Copy link
Member

Thanks for this @kuegi. Merging.

@prasannavl prasannavl merged commit e918c19 into DeFiCh:master Aug 27, 2024
14 of 25 checks passed
@kuegi kuegi deleted the fix/consolidateRewards branch August 27, 2024 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug ci/sync p2 Medium priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants