Make query sharding deterministic #707
Query sharding was executing queries concurrently and appending their results in no specific order. Unfortunately, basic mathematical operations on floats are not associative, so the order in which values are combined affects the result. Given float numbers a = 0.03298, b = 0.09894, the sum a+a+b differs from a+b+a. We can't fix floating-point arithmetic, but at least we can make the result deterministic, so weird query results will be easier to debug.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@@ -156,6 +149,14 @@ func (q *shardedQuerier) Close() error {
	return nil
}

func createJobIndexes(l int) []interface{} {
In dskit there's already concurrency.CreateJobsFromStrings, which is almost exactly the same as this, but with strings. Would it maybe make sense to also add concurrency.CreateJobsFromInts there as well? (until we finally get generics :)
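For context, here is a minimal sketch of what an index-based job helper of this shape could look like. Only the signature of createJobIndexes appears in the diff above; the body below is an assumption, and it shows the type assertion that every consumer of []interface{} jobs has to perform.

```go
package main

import "fmt"

// createJobIndexes returns the indexes 0..l-1 boxed as interface{} values,
// the shape a ForEach-style helper taking []interface{} jobs would expect.
// Assumption: only the signature is visible in the diff; this body is a guess.
func createJobIndexes(l int) []interface{} {
	jobs := make([]interface{}, l)
	for i := range jobs {
		jobs[i] = i
	}
	return jobs
}

func main() {
	input := []string{"shard-0", "shard-1", "shard-2"} // hypothetical inputs
	for _, job := range createJobIndexes(len(input)) {
		idx := job.(int) // the consumer must type-assert each job back to int
		fmt.Println(idx, input[idx])
	}
}
```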
I considered that, but since generics are around the corner, I preferred to keep this here until I see at least one more usage for it.
Also, IMO it would be just easier to make concurrency.ForEachJobID like:
// ForEachJobID runs the provided jobFunc for each job ID in `[0, jobs)`.
// The execution breaks on first error encountered.
func ForEachJobID(ctx context.Context, jobs int, concurrency int, jobFunc func(ctx context.Context, job int) error) error
And then just doing input[job] in the function instead of having to type-assert the interface or play with generics:
concurrency.ForEachJobID(ctx, len(input), someConcurrency, func(ctx context.Context, idx int) error {
	return process(input[idx])
})
I just checked and it seems that it would fit 100% of the concurrency.ForEach usages, removing all the type assertions.
If we like that I can open a PR in dskit.
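To make the proposal concrete, here is a rough, self-contained sketch of an index-based helper with the proposed signature. It is not the dskit implementation: forEachJobID and scaleAll are hypothetical names, and the worker-pool details are assumptions. Because each job writes only to results[idx], the output order follows the input order regardless of how the goroutines are scheduled.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// forEachJobID runs jobFunc for every index in [0, jobs) using at most
// `concurrency` goroutines and returns the first error encountered.
// Hypothetical stand-in for the proposed helper, not the dskit code.
func forEachJobID(ctx context.Context, jobs, concurrency int, jobFunc func(ctx context.Context, idx int) error) error {
	if jobs == 0 {
		return nil
	}
	if concurrency > jobs {
		concurrency = jobs
	}
	if concurrency < 1 {
		concurrency = 1
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	indexes := make(chan int)
	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for idx := range indexes {
				if ctx.Err() != nil {
					continue // drain remaining indexes after a failure
				}
				if err := jobFunc(ctx, idx); err != nil {
					once.Do(func() { firstErr = err; cancel() })
				}
			}
		}()
	}
feed:
	for i := 0; i < jobs; i++ {
		select {
		case indexes <- i:
		case <-ctx.Done():
			break feed // stop handing out work once something failed
		}
	}
	close(indexes)
	wg.Wait()
	if firstErr != nil {
		return firstErr
	}
	return ctx.Err()
}

// scaleAll doubles every input concurrently. Each job writes only to its own
// slot, so no type assertions or locks are needed and the order is stable.
func scaleAll(input []float64, concurrency int) ([]float64, error) {
	results := make([]float64, len(input))
	err := forEachJobID(context.Background(), len(input), concurrency, func(_ context.Context, idx int) error {
		results[idx] = 2 * input[idx]
		return nil
	})
	return results, err
}

func main() {
	out, err := scaleAll([]float64{0.03298, 0.03298, 0.09894}, 2)
	fmt.Println(out, err)
}
```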
I just checked and it seems that it would fit 100% of the concurrency.ForEach usages removing all the type assertions.
If this is true, then I think this solution would be clearly better.
Let's see what people think: grafana/dskit#113
That got merged. I'll update mimir once this PR is merged.
Nice work, I only added comments of minor importance
Co-authored-by: Mauro Stettler <mauro.stettler@gmail.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
LGTM
LGTM!
What this PR does:
Query sharding was executing queries concurrently and appending their results in no specific order. Unfortunately, basic mathematical operations on floats are not associative, so the order in which values are combined affects the result.
Given float numbers a = 0.03298, b = 0.09894, the sum a+a+b differs from a+b+a.
We can't fix floating-point arithmetic, but at least we can make the result deterministic, so weird query results will be easier to debug.
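The order sensitivity is easy to reproduce in isolation. The sketch below uses a hypothetical helper, sumInOrder, that adds float64 values left to right. It prints the comparison for the values quoted above without asserting its outcome, and then shows the well-known 0.1, 0.2, 0.3 case, where the two orderings produce different float64 results.

```go
package main

import "fmt"

// sumInOrder adds the values left to right in float64 arithmetic, which is
// what happens when partial results are appended and summed in arrival order.
// Hypothetical helper, used only for this illustration.
func sumInOrder(values ...float64) float64 {
	var total float64
	for _, v := range values {
		total += v
	}
	return total
}

func main() {
	// Values quoted in the PR description.
	a, b := 0.03298, 0.09894
	fmt.Println(sumInOrder(a, a, b) == sumInOrder(a, b, a))

	// Classic case: the same three numbers, summed in two different orders.
	x, y, z := 0.1, 0.2, 0.3
	fmt.Println(sumInOrder(x, y, z), sumInOrder(y, z, x))
}
```

Fixing the order in which shard results are combined does not make the sum more accurate, but it does make repeated runs of the same query return the same bits.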
Which issue(s) this PR fixes:
None
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]