
TCP timeout issue #401

Closed
Y-Tian opened this issue Mar 10, 2020 · 6 comments


Y-Tian commented Mar 10, 2020

[Logs]:

Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.015 [DEBUG] plain handler: new tcp connection from 127.0.0.1:33410
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.017 [DEBUG] plain handler for 127.0.0.1:33410 returned. closing conn
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.376 [DEBUG] GrafanaNet sent metrics in 71.701728ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.386 [DEBUG] GrafanaNet sent metrics in 81.80253ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.388 [DEBUG] GrafanaNet sent metrics in 85.61965ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.394 [DEBUG] GrafanaNet sent metrics in 92.004397ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.396 [DEBUG] GrafanaNet sent metrics in 92.756409ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.397 [DEBUG] GrafanaNet sent metrics in 94.793193ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.404 [DEBUG] GrafanaNet sent metrics in 102.231334ms -msg size 2
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.408 [DEBUG] GrafanaNet sent metrics in 104.683852ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.409 [DEBUG] GrafanaNet sent metrics in 105.30543ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.409 [DEBUG] GrafanaNet sent metrics in 105.159224ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.410 [DEBUG] GrafanaNet sent metrics in 106.679546ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.414 [DEBUG] GrafanaNet sent metrics in 110.317278ms -msg size 1
Mar  9 23:49:55 vagrant carbon-relay-ng[18688]: 2020-03-09 23:49:55.455 [DEBUG] GrafanaNet sent metrics in 152.397932ms -msg size 1
Mar  9 23:50:00 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:00.000 [DEBUG] stats flushing for 2020-03-09 23:50:00 +0000 UTC m=+395.121794354 to graphite
Mar  9 23:50:03 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:03.818 [WARNING] stats dialing 192.0.2.0:2003 failed: dial tcp 192.0.2.0:2003: connect: connection timed out. will retry
Mar  9 23:50:05 vagrant nodejs[18721]: Flushing stats at  Mon Mar 09 2020 23:50:05 GMT+0000 (Coordinated Universal Time)
Mar  9 23:50:05 vagrant nodejs[18721]: { counters:
Mar  9 23:50:05 vagrant nodejs[18721]:    { 'statsd.bad_lines_seen': 0,
Mar  9 23:50:05 vagrant nodejs[18721]:      'statsd.packets_received': 0,
Mar  9 23:50:05 vagrant nodejs[18721]:      'statsd.metrics_received': 0 },
Mar  9 23:50:05 vagrant nodejs[18721]:   timers: {},
Mar  9 23:50:05 vagrant nodejs[18721]:   gauges: { 'statsd.timestamp_lag': 0 },
Mar  9 23:50:05 vagrant nodejs[18721]:   timer_data: {},
Mar  9 23:50:05 vagrant nodejs[18721]:   counter_rates:
Mar  9 23:50:05 vagrant nodejs[18721]:    { 'statsd.bad_lines_seen': 0,
Mar  9 23:50:05 vagrant nodejs[18721]:      'statsd.packets_received': 0,
Mar  9 23:50:05 vagrant nodejs[18721]:      'statsd.metrics_received': 0 },
Mar  9 23:50:05 vagrant nodejs[18721]:   sets: {},
Mar  9 23:50:05 vagrant nodejs[18721]:   pctThreshold: [ '90', '95', '99' ] }
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.008 [DEBUG] plain handler: new tcp connection from 127.0.0.1:33414
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.010 [DEBUG] plain handler for 127.0.0.1:33414 returned. closing conn
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.399 [DEBUG] GrafanaNet sent metrics in 72.260998ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.406 [DEBUG] GrafanaNet sent metrics in 80.667469ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.410 [DEBUG] GrafanaNet sent metrics in 84.23987ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.422 [DEBUG] GrafanaNet sent metrics in 95.515774ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.423 [DEBUG] GrafanaNet sent metrics in 97.165266ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.423 [DEBUG] GrafanaNet sent metrics in 97.654528ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.443 [DEBUG] GrafanaNet sent metrics in 117.550005ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.444 [DEBUG] GrafanaNet sent metrics in 118.079867ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.445 [DEBUG] GrafanaNet sent metrics in 119.07257ms -msg size 2
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.445 [DEBUG] GrafanaNet sent metrics in 118.893423ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.446 [DEBUG] GrafanaNet sent metrics in 119.770423ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.447 [DEBUG] GrafanaNet sent metrics in 119.983126ms -msg size 1
Mar  9 23:50:05 vagrant carbon-relay-ng[18688]: 2020-03-09 23:50:05.448 [DEBUG] GrafanaNet sent metrics in 121.715464ms -msg size 1
Mar  9 23:50:07 vagrant carbon-relay-ng[18688]: 2020/03/09 23:50:07 dial tcp 192.0.2.0:2003: connect: connection timed out

As you can see above, there is a recurring timeout (exactly every 2m).

I thought this might be related to issue #250 and PR #318, so I changed the plain_read_timeout and pickle_read_timeout settings in carbon-relay-ng.ini to "0s".

However, I still get the TCP timeout error.

Is this expected behavior (i.e. every connection creates a new socket, and each socket automatically times out after 2m)? If not, I'm looking for advice on how to fix this.
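For reference, the change amounted to something like the following in carbon-relay-ng.ini (a sketch of just the two lines I touched; the rest of the config is unchanged):

```ini
## Inbound read timeouts on the plain and pickle listeners.
## Set to "0s" per issue #250 / PR #318 to disable the inbound timeout --
## this did not stop the recurring outbound timeout.
plain_read_timeout = "0s"
pickle_read_timeout = "0s"
```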

@Dieterbe
Contributor

Hello Yuda Tian,
The timeouts you see are on an outbound carbon/tcp connection to 192.0.2.0:2003. This would be either a carbon route to that address, or instrumentation.graphite_addr if you have configured it (please confirm which one).

The variables you mention control timeouts on inbound connections, and are thus unrelated.
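As a side note on the 2m cadence (my own reading of the logs, not confirmed from your config): 192.0.2.0 is a TEST-NET address that never answers, so each connect attempt sits in SYN retransmission until the kernel gives up. On Linux, with the default net.ipv4.tcp_syn_retries = 6, that works out to roughly two minutes:

```shell
# SYN retransmit backoff doubles each retry, starting at 1s.
# With 6 retries the kernel abandons connect() after
# 1+2+4+8+16+32+64 = 127 seconds (~2 minutes), which matches
# the "connection timed out" interval in the logs above.
awk 'BEGIN { t = 0; d = 1; for (i = 0; i <= 6; i++) { t += d; d *= 2 }; print t }'
```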


Y-Tian commented Mar 10, 2020

Right, I'm trying to discard the instrumentation metrics (by sending them to a reserved IP). I think it's referenced here:

#50

AFAIK there doesn't seem to be any support for this yet?

@Dieterbe
Contributor

That's correct. The simple solution is to send the stats back into the relay, which is very handy anyway.
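A minimal sketch of what that looks like in carbon-relay-ng.ini (assuming the default listener address; adjust to whatever your listen_addr actually is):

```ini
## The relay's own plain-text listener.
listen_addr = "0.0.0.0:2003"

## Point the internal stats back at the relay's own listener instead of
## an unreachable address; the stats then flow through the normal routes.
[instrumentation]
graphite_addr = "127.0.0.1:2003"
graphite_interval = 10000  # flush interval in ms
```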


Y-Tian commented Mar 10, 2020

So I would set graphite_addr = "0.0.0.0:2003", sending the stats back to the listen_addr?

@Dieterbe
Contributor

Yep.

@Dieterbe
Contributor

As this was a config issue, I'll consider this resolved.
FWIW, the documentation around this was recently improved: #456
