Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client/v3: refresh the token when ErrUserEmpty is received while retrying #13262

Merged
merged 1 commit into from
Aug 26, 2021

Conversation

r-ashish
Copy link
Contributor

@r-ashish r-ashish commented Jul 31, 2021

A change was made in the PR #12549 to clear the auth token before attempting to get a new one, when ErrInvalidAuthToken is returned by the server.

if callOpts.retryAuth && rpctypes.Error(lastErr) == rpctypes.ErrInvalidAuthToken {
	// clear auth token before refreshing it.
	// call c.Auth.Authenticate with an invalid token will always fail the auth check on the server-side,
	// if the server has not apply the patch of pr #12165 (https://github.com/etcd-io/etcd/pull/12165)
	// and a rpctypes.ErrInvalidAuthToken will recursively call c.getToken until system run out of resource.
	c.authTokenBundle.UpdateAuthToken("")

	gterr := c.getToken(ctx)
...

However, if the getToken call also fails due to some reason (eg. context deadline exceeded) then it will leave the client without a token and subsequent retries will continue to fail with ErrUserEmpty unless the token is somehow refreshed.

This fix will refresh the token when ErrUserEmpty ("etcdserver: user name is empty") is returned by the server and the client has non-empty username, password.

@r-ashish
Copy link
Contributor Author

Also solves the Case 2 mentioned in: #10753, Case 1 was solved by #12549

@r-ashish r-ashish changed the title client/v3: refresh the token when ErrUserEmpty is recieved while retrying client/v3: refresh the token when ErrUserEmpty is received while retrying Jul 31, 2021
@davissp14
Copy link
Contributor

This is a pretty annoying issue and we seem to run into it fairly often. Would love to see this get merged.

@mitake
Copy link
Contributor

mitake commented Aug 19, 2021

Sorry I missed this PR, let me take a look later today.

Copy link
Contributor

@mitake mitake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot @r-ashish !

@davissp14
Copy link
Contributor

davissp14 commented Aug 19, 2021

@mitake This is causing issues for a lot of our users. Can the release be expedited somehow?

@r-ashish
Copy link
Contributor Author

r-ashish commented Aug 20, 2021

@mitake This is causing issues for a lot of our users. Can the release of this somehow?

hey @davissp14, not sure if you've already tried, if not, how feasible it would be for you but you can try a workaround that worked for us until this is released -

Reinitialize the etcd client upon receiving this error which will re authenticate to get a new token. It was easy enough for us to do this as we have a wrapper which handles all the etcd related opts.

@davissp14
Copy link
Contributor

davissp14 commented Aug 20, 2021

@r-ashish We actually attempted to move to JWT tokens, which seemed to workaround this specific issue. However, turns out they are broken in a slightly more annoying way... #13300

We are using Etcd as the backend store for Stolon/PG clusters. That being said, it would be easier for us to just move to Consul than fork Stolon and rewire everything to accommodate this.

@mitake
Copy link
Contributor

mitake commented Aug 20, 2021

@davissp14 sorry for inconvenience, let me check the other issue. BTW CN based auth is simpler and well tested. Is it possible to try that auth method?

@davissp14
Copy link
Contributor

CN based auth isn't an option for us, sadly.

@mitake
Copy link
Contributor

mitake commented Aug 22, 2021

CN based auth isn't an option for us, sadly.

@davissp14 hmm I see, got it.

@spzala could you take a look at this?

Copy link
Member

@spzala spzala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm @mitake
@r-ashish thanks, one quick review comment - can you please make sure if the changes is also needed here https://github.com/etcd-io/etcd/blob/main/client/v3/retry_interceptor.go#L250 (ref

@@ -149,6 +149,17 @@ func (c *Client) streamClientInterceptor(optFuncs ...retryOption) grpc.StreamCli
}
}

// shouldRefreshToken checks whether there's a need to refresh the token based on the error and callOptions,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm @mitake
@r-ashish thanks and one quick review comment - can you please make sure if we should use this new func here as well https://github.com/etcd-io/etcd/blob/main/client/v3/retry_interceptor.go#L250 (ref to same PR that you have mentioned)

Copy link
Contributor Author

@r-ashish r-ashish Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spzala thanks for checking and pointing that out. I may be wrong but I think we already have the logic to get a new token for streams call on reconnect (retry_interceptor.go#L119) but would be good to use this new func at retry_interceptor.go#L250 as well.

Anyway, I'll update it soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @r-ashish @spzala !

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mitake

…ying

To fix a bug in the retry logic caused when the auth token is cleared after receiving `ErrInvalidAuthToken` from the server and the subsequent call to `getToken` also fails due to some reason (eg. context deadline exceeded).
This leaves the client without a token and the retry will continue to fail with `ErrUserEmpty` unless the token is refreshed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants