Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #3349 - enable -shuffle=on in tests #3350

Closed
wants to merge 5 commits into from

Conversation

rbroggi
Copy link
Contributor

@rbroggi rbroggi commented Oct 28, 2021

Which problem is this PR solving?

Short description of the changes

  • Enabling shuffling in tests

@rbroggi rbroggi requested a review from a team as a code owner October 28, 2021 19:42
@codecov
Copy link

codecov bot commented Oct 28, 2021

Codecov Report

Merging #3350 (14ed70d) into master (514e72b) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3350   +/-   ##
=======================================
  Coverage   96.47%   96.47%           
=======================================
  Files         259      259           
  Lines       15178    15178           
=======================================
  Hits        14643    14643           
- Misses        451      452    +1     
+ Partials       84       83    -1     
Impacted Files Coverage Δ
pkg/config/tlscfg/cert_watcher.go 92.63% <0.00%> (-2.11%) ⬇️
cmd/query/app/static_handler.go 97.00% <0.00%> (-0.60%) ⬇️
cmd/query/app/server.go 95.58% <0.00%> (+1.47%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 514e72b...14ed70d. Read the comment docs.

@rbroggi
Copy link
Contributor Author

rbroggi commented Oct 29, 2021

One of the nice features of shuffle flag is that whenever you run your tests with it on the command-line will print to stdout the seed of the shuffling so that if flakiness is experienced, it's easily reproduceable:

e.g.:

go test -v -race -shuffle=on ./... 
-test.shuffle 1635607575548663755

you can use the test shuffle to reproduce the issue:

go test -v -race -shuffle=1635607575548663755 ./... 

@yurishkuro
Copy link
Member

Generally I am not aware of inter-test dependencies, we always tried to keep tests independent. I see that TestAggregateThroughput is failing, which is a relatively recent test, some side-effect dependency might have sneaked in & we should fix it.

Signed-off-by: rbroggi <ro_broggi@hotmail.com>
…ritten to)

Signed-off-by: rbroggi <ro_broggi@hotmail.com>
p := &Processor{}
aggregatedThroughput := p.aggregateThroughput(testThroughputs)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very likely that this global variable is the source of the flakiness as it's not read-only as it should therefore impacting also this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit concerned by this approach. The input to p.aggregateThroughput is supposed to be treated as read-only by that function. If it's not, that might be the bug / source of the test dependencies, not the fault of the test. The testThroughputs is not a global variable, it's logically a constant. By using a function instead of the same value we simply obscure the bug elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I understand your point:

since method p.aggregateThroughput is supposed to treat its inputs as read-only, you think that my approach could hide a potential bug in the p.aggregateThroughput that could potentially mutate the input?

If this is what you mean I think my change is still valid because it's enforcing independence between tests. And what we could have is a specific test that tests for p.aggregateThroughput immutability. Currently the thing that is actually mutating the content of the global variable is a test on its body (see in master TestRealisticRunCalculationLoop in processor_test:445)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created the following test case and to your point it indeed fail (which means that the method is indeed mutating the arguments):

func TestAggregateThroughputInputsImmutability(t *testing.T) {
	p := &Processor{}
	in := testThroughputs()
	_ = p.aggregateThroughput(in)
	assert.Equal(t, in, testThroughputs())
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if indeed it's supposed not to mutate the arguments the issue is exactly here:

plugin/sampling/strategystore/adaptive/processor.go:313

Pointer semantics 😋

Signed-off-by: rbroggi <ro_broggi@hotmail.com>
@rbroggi
Copy link
Contributor Author

rbroggi commented Oct 31, 2021

Generally I am not aware of inter-test dependencies, we always tried to keep tests independent. I see that TestAggregateThroughput is failing, which is a relatively recent test, some side-effect dependency might have sneaked in & we should fix it.

Found the source of the bug: global variable 😌

My approach in the test was to change all the global variables of the file to methods which return the same value, this way we cannot share state across tests (even in future developments)

@rbroggi
Copy link
Contributor Author

rbroggi commented Oct 31, 2021

Update on this one:

TestHotReloadUIConfigTempFile fails now. I think shuffle is already paying some dividends spoting flakiness.
It will probably take a while for me to troubleshoot those issues as I'm not yet familiar with the code.

Another intriguing thing is that the all-in-one tests fail but only for the debug image, the release one works fine 🤔

Signed-off-by: rbroggi <ro_broggi@hotmail.com>
)
}

func TestAggregateThroughputInputsImmutability(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you run this test method on master branch it will fail

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Given that the CI still fails due to some other issues, do you want to merge this PR without the -shuffle=on in the Makefile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will profitable to keep the interesting commenting history therefore:

#3360

@@ -310,12 +310,26 @@ func (p *Processor) aggregateThroughput(throughputs []*model.Throughput) service
t.Count += throughput.Count
t.Probabilities = merge(t.Probabilities, throughput.Probabilities)
} else {
aggregatedThroughput[service][operation] = throughput
Copy link
Contributor Author

@rbroggi rbroggi Oct 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a bug (unintended pointer copy makes so that the input argument throughputs is changed during this method execution). Could you please @yurishkuro / @albertteoh confirm whether this method is supposed to be immutable on its arguments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug hunted thanks to shuffle 😋

@stale
Copy link

stale bot commented Apr 16, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. You may re-open it if you need more time.

@stale stale bot added the stale The issue/PR has become stale and may be auto-closed label Apr 16, 2022
@stale
Copy link

stale bot commented Apr 30, 2022

This pull request has been automatically closed due to inactivity. You may re-open it if you need more time. We really appreciate your contribution and we are sorry that this has not been completed.

@stale stale bot closed this Apr 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale The issue/PR has become stale and may be auto-closed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable test shuffle for more robust test suite
2 participants