Update lease renewer logic #4090

Merged
merged 6 commits into from
Mar 19, 2018

Conversation

jefferai
Member

@jefferai jefferai commented Mar 7, 2018

Members of the Nomad team and I believe this logic is much more robust
against the large numbers of new secret acquisitions that a static grace
period can cause. See the comments in the code for details.

Fixes #3414

@jefferai jefferai added this to the 0.9.6 milestone Mar 7, 2018
@jefferai
Member Author

jefferai commented Mar 7, 2018

cc @preetapan @dadgar

dadgar
dadgar previously approved these changes Mar 15, 2018
api/renewer.go Outdated

// calculateGrace calculates the grace period based on a reasonable set of
// assumptions given the total lease time; it also adds some jitter to not have
// clients be in sync. We calculate this continuously so long as the new lease
Contributor

"We calculate this continuously so long as the new lease duration is greater than the previous; no change means we don't need to recalculate, and if the lease duration keeps decreasing we've hit max and want to be able to rely on this."

This is actually done by the callers, so it shouldn't be documented on this method.

Member Author

Removed that part of the comment.

@@ -184,6 +178,9 @@ func (r *Renewer) renewAuth() error {
		return ErrRenewerNotRenewable
	}

	priorDuration := time.Duration(r.secret.LeaseDuration) * time.Second
Member

Should this be r.secret.Auth.LeaseDuration instead of r.secret.LeaseDuration?

Member Author

Yes! Good catch.

// We keep evaluating a new grace period so long as the lease is
// extending. Once it stops extending, we've hit the max and need to
// rely on the grace duration.
if leaseDuration > priorDuration {
Member

Should we also recalculate the grace period when the lease duration doesn't change, which is likely a common case? In other words, should this be leaseDuration >= priorDuration?

Member Author

If the lease duration doesn't change, the grace period stays within the same bounds. Recalculating it would only shift the random jitter, which, if it is truly random, neither helps nor hurts, so it can be skipped.

Member

Makes sense 👍

api/renewer.go Outdated

// The sleep duration is set to 2/3 of the current lease duration plus
// 1/3 of the current grace period, which adds jitter.
sleepDuration := time.Duration(float64(leaseDuration.Nanoseconds())*2/3 + float64(r.grace.Nanoseconds()*1/3))
Member

Nitpick: the argument to time.Duration has the form a + b, but a is parenthesized differently from b. Can they be the same, just for consistency?

Member Author

Actually this was subtly wrong, good catch.

api/renewer.go Outdated

// For a given lease duration, we want to allow 80-90% of that to elapse,
// so the remaining amount is the grace period
r.grace = time.Duration(leaseNanos*0.1) + time.Duration(uint64(r.random.Int63())%uint64(jitterMax))
Member

Can we use the already-initialized jitterMax variable as the argument to time.Duration?

Member Author

Done.

 }
-case <-time.After(3 * time.Second):
+case <-time.After(5 * time.Second):
Member

Should this be 10, so that we wait for the maximum possible time before erroring out?

Member Author

No. Any given renewal should use the default lease, so we should expect activity within the default lease period, not the max lease period.

case renew := <-v.RenewCh():
t.Logf("renew called, remaining lease duration: %d", renew.Secret.LeaseDuration)
continue outer
case <-time.After(5 * time.Second):
Member

Same here.

Member

@vishalnayak vishalnayak left a comment

Outside of your thoughts on the wait time in the test, this LGTM!

@jefferai
Member Author

@preetapan Merging, but feel free to review after the fact.

@jefferai jefferai merged commit 9ca558c into master Mar 19, 2018
@jefferai jefferai deleted the grace-period-calc branch March 19, 2018 19:48