-
Notifications
You must be signed in to change notification settings - Fork 525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase scheduler worker cancellation chan cap #741
Merged
pstibrany
merged 6 commits into
main
from
increase-frontend-scheduler-worker-cancellation-channel-capacity
Jan 14, 2022
Merged
Increase scheduler worker cancellation chan cap #741
pstibrany
merged 6 commits into
main
from
increase-frontend-scheduler-worker-cancellation-channel-capacity
Jan 14, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
With previous implementation, if worker was busy talking to scheduler, we didn't push the cancellation, keeping that query running. When cancelling a query, all its subqueries are cancelled at the same time, so this was most likely happening all the time (first subquery scheduled on this worker was canceled, the rest were not because worker was busy cancelling the first one). Also removed the `<-ctx.Done()` escape point when waiting for the enqueueing ACK and modified the enqueueing method to ensure that it always responds something. Fixes: #740 Inspired by: grafana/loki#5113 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
colega
added a commit
that referenced
this pull request
Jan 13, 2022
When scheduler worker is handling requests, if there's certain pressure on the loop, we can keep enqueueing requests instead of handling the cancellations. However, when there's that pressure on the loop, it's more important than never to properly cancel the requests. This checks the cancellations channel first, and then checks the requests channel too. Related to #741 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
3 tasks
pstibrany
reviewed
Jan 13, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
pstibrany
reviewed
Jan 13, 2022
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
pstibrany
approved these changes
Jan 14, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing my feedback!
Thanks Oleg ! and well done @kavirajk for finding this issue. |
pstibrany
deleted the
increase-frontend-scheduler-worker-cancellation-channel-capacity
branch
January 14, 2022 08:34
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What this PR does:
With previous implementation, if worker was busy talking to scheduler, we didn't push the cancellation, keeping that query running.
When cancelling a query, all its subqueries are cancelled at the same time, so this was most likely happening all the time (first subquery scheduled on this worker was canceled, the rest were not because worker was busy cancelling the first one).
Also removed the
<-ctx.Done()
escape point when waiting for the enqueueing ACK and modified the enqueueing method to ensure that it always responds something.Which issue(s) this PR fixes:
Fixes: #740
Inspired by: grafana/loki#5113
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]