Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve github pull request call retries #1810

Merged
merged 2 commits into from
Oct 4, 2021
Merged

Improve github pull request call retries #1810

merged 2 commits into from
Oct 4, 2021

Conversation

aristocrates
Copy link
Contributor

@aristocrates aristocrates commented Sep 15, 2021

Retry with fixed 1 second backoff up to 3 retries was added by #1131 to address #1019, but the issue continued to show up (#1453).

Increase max attempts to 5 and use exponential backoff for a maximum total retry time of (2^n - n - 1) seconds, which is roughly 30 seconds for current max attempts n = 5.

Also move the sleep to the top of the loop so that we never sleep without sending the request again on the last iteration.

I decided not to change the test case added in #1131 to avoid sleeping the full time in the test suite - I can bump the retries there to 5 to keep it in sync if that would be better.

I can also add jitter so the backoff isn't deterministic / less vulnerable to thundering herds but didn't for now since there's already existing retry logic in this exact place that is deterministic and linear and making it exponential could be done with a small change without pulling in additional dependencies.

Also would it be good to expose configuration options for this and make the default behavior match current with max wait 3 seconds? (like what is proposed in #1715, though the problem here is not waiting long enough to retry rather than throttling) I'm not entirely sure what the implications are for a longer wait here with concurrent PRs that need planning, etc (I think it's fine for my use case, not as confident it's fine for everyone's)

Retry with fixed 1 second backoff up to 3 retries was added by #1131 to
address #1019, but the issue continued to show up (#1453).

Increase max attempts to 5 and use exponential backoff for a maximum
total retry time of (2^n - n - 1) seconds, which is roughly 30 seconds
for current max attempts n = 5.

Also move the sleep to the top of the loop so that we never sleep
without sending the request again on the last iteration.
@aristocrates aristocrates requested a review from a team as a code owner September 15, 2021 18:10
retryDelay := 1 * time.Second
for i := 0; i < numRetries; i++ {
// to attempt up to 5 times with exponential backoff.
maxAttempts := 5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(first attempt technically isn't a retry so I changed the var names to use attempt)

@lnattrass
Copy link

Is there any maintainers around - this looks relatively straightforward?

Copy link
Contributor

@jamengual jamengual left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jamengual jamengual merged commit 875e2ad into runatlantis:master Oct 4, 2021
@aristocrates aristocrates deleted the retry_for_longer_in_github_client branch October 4, 2021 19:25
elementalvoid added a commit to elementalvoid/atlantis that referenced this pull request Jan 21, 2022
This is a follow on to resolve similar issues to runatlantis#1019.

In runatlantis#1131 retries were added to GetPullRequest. And in runatlantis#1810 a backoff
was included.

However, those only resolve one potential request at the very
beginning of a PR creation. The other request that happens early on
during auto-plan is one to ListFiles to detect the modified files. This
too can sometimes result in a 404 due to async updates on the GitHub
side.
krrrr38 pushed a commit to krrrr38/atlantis that referenced this pull request Dec 16, 2022
* Improve github pull request call retries

Retry with fixed 1 second backoff up to 3 retries was added by runatlantis#1131 to
address runatlantis#1019, but the issue continued to show up (runatlantis#1453).

Increase max attempts to 5 and use exponential backoff for a maximum
total retry time of (2^n - n - 1) seconds, which is roughly 30 seconds
for current max attempts n = 5.

Also move the sleep to the top of the loop so that we never sleep
without sending the request again on the last iteration.

* Fix style with gofmt -s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants