What operating system (Linux, Windows, …) and version?
Mac, Linux
What did you do?
If possible, provide a recipe for reproducing the error.
What did you expect to see?
For the DNS resolver, the default interval between DNS resolutions is 30 minutes, so DNS resolution should happen every 30 minutes.
What did you see instead?
For the DNS resolver, the minimum interval between re-resolutions is set to 30 seconds by minDNSResRate. When the service config is invalid, DNS resolution happens every 30 seconds. This happens even if the service config feature is not used, which means the TXT lookup returns an error.
This is caused by the polling introduced in v1.25.0 for error handling. NewServiceConfig is called by the resolver when the service config is updated.
In the DNS resolver, the watcher calls NewServiceConfig. The polling calls ResolveNow to trigger an immediate re-resolution, so a DNS resolution request is queued in the DNS resolver. That request in turn calls NewServiceConfig, so it becomes an infinite loop.
If a resolver calls NewServiceConfig as a result of ResolveNow, the resolver can end up in a loop. I'm not sure which part is at fault, but NewServiceConfig should not be called when service config is disabled, and it would be better to have backoff for re-resolution.
kazegusuri changed the title from "Infinite loop of resolver when service config is not supported" to "Re-resolution loop when service config is not supported" on Nov 20, 2019
OK, I see what's happening. When service config lookups are disabled, the resolver is returning the empty string as the service config, which is not valid JSON, so the ClientConn treats that as an error and asks for another update. I will make the DNS resolver stop reporting any service config when it's disabled, or when it receives an error during the TXT lookup. This will go back to the previous behavior. I will also make the ClientConn ignore the service config, regardless of whether there is an error, when service config lookups are disabled.
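The point about the empty string can be checked with the standard library alone. This is only a sketch: looksLikeJSON is a hypothetical helper, and passing json.Valid is a necessary condition for a service config, not a full validation of one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// looksLikeJSON is a hypothetical helper (not part of grpc-go) that reports
// whether a service config string is at least syntactically valid JSON.
func looksLikeJSON(serviceConfig string) bool {
	return json.Valid([]byte(serviceConfig))
}

func main() {
	// The empty string is what the resolver reports when lookups are disabled.
	fmt.Println(looksLikeJSON(""))
	// An empty JSON object is the smallest syntactically valid config.
	fmt.Println(looksLikeJSON("{}"))
}
```

The first call reports false and the second true, which is why an empty string gets treated as an error rather than as "no service config".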
I also have a PR (#3165) to update the DNS resolver to use the V2 API, which implements the newer-style error handling requirements. It is in progress and also blocked by #3186. Once done, an error during the TXT lookup should return an error to the ClientConn, and result in polling until no error is reported. This follows an exponential backoff starting at 1 second and eventually reaching 2 minutes. Because of the additional throttling in the DNS resolver, we will only allow it to refresh at most once every 30 seconds. But it will eventually start polling every 2 minutes after the backoff reaches its maximum.
What version of gRPC are you using?
v1.25.0 or later
What version of Go are you using (go version)?
go version go1.13.4 darwin/amd64