-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tapgarden: improve fault injection in unit tests #1031
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice, thanks for the fix. LGTM, been running it on flake hunter for 20 minutes now without any failure.
I managed to get it to fail, using the flake hunter slightly adjusted to only run the
It's the same test but now it fails with a different error. Not sure what to make of this. Will try to run again, see if I can reproduce the error. |
And this one as well:
|
I hit this once and wasn't sure what to make of it either - bumping the timeout helped but that test case isn't using any of these functions to try and cause faults. |
Nice catch, I've seen this one before but I don't see an open issue for it. |
In this commit, we use atomic variables in the unit test mock interfaces to store state about requested failues. We also link requested confirmation failures to a specific conf request. Only a single failure event can be requested at a time, and the mock interfaces reset after that event. This should reduce state being unintentionally shared between test cases.
d36f4ba
to
f16c98f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those racy errors sure are finicky. I like the smart solution.
I can only reproduce the timeout, but that seems unrelated. Will make an issue for the other error.
Addresses #974 .
The failure observed in CI was an empty confirmation event reached the caretaker in the test
basic_asset_creation
. That test case doesn't try to cause any faults in any mock interface, butfinalize_batch
does request that failure here:taproot-assets/tapgarden/planter_test.go
Line 1484 in 395a180
I concluded that the way we had implemented this fault injection was racy, so here we use atomic
Bool
s for tracking that state, that get unset after use.I cannot reproduce the original failures locally after 100 runs of the
unit-race pkg=tapgarden
target.