Add new metrics for http client #17138

Open
Pankaj260100 opened this issue Sep 24, 2024 · 0 comments

Description

We can add metrics on the HTTP client side to gain visibility comparable to what we already have for the Jetty HTTP server.

Motivation

Recently, we experienced high latency issues and struggled to pinpoint the exact bottleneck. After a thorough analysis of the query lifecycle within Druid, we identified that one potential contributor to latency is the time taken by the HTTP client at the broker/router. During this process, threads can be blocked while waiting for a connection to become available.

Currently, we lack visibility into how long requests wait for a connection on the client side. Exposing a metric for this wait time would let us determine whether connection acquisition is a significant bottleneck and take targeted action to mitigate it.

Additionally, log lines at the router indicate that it hits the limit of 1024 on the Jetty HTTP client queue. This suggests that the queue size can become a limiting factor, potentially leading to increased latency or dropped requests. We suspect a similar situation could occur at the broker, where the Netty HTTP client might also experience high queue sizes. By adding a metric at the broker for the queue size, or for the time requests spend in the queue, we can monitor this and ensure that it does not become a hidden bottleneck.
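
To make the two proposed measurements concrete, here is a minimal sketch of how the time spent waiting for a connection and the number of waiting requests could be tracked. It is not Druid's, Jetty's, or Netty's actual code: the class `ConnectionWaitMetrics`, its method names, and the use of a `Semaphore` as a stand-in for the connection pool are all assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; class and method names are not part of Druid.
public class ConnectionWaitMetrics {
    // Stand-in for a bounded connection pool (assumption for this sketch).
    private final Semaphore permits;
    // Requests currently inside acquire(), i.e. waiting for a connection.
    private final AtomicInteger queued = new AtomicInteger();
    // Sum of all observed wait times, in nanoseconds.
    private final AtomicLong totalWaitNanos = new AtomicLong();
    // Number of completed acquisitions.
    private final AtomicLong acquisitions = new AtomicLong();

    public ConnectionWaitMetrics(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    // Blocks until a connection slot is free and returns the time waited in nanoseconds.
    public long acquire() {
        queued.incrementAndGet();
        long start = System.nanoTime();
        try {
            permits.acquireUninterruptibly();
        } finally {
            queued.decrementAndGet();
        }
        long waited = System.nanoTime() - start;
        totalWaitNanos.addAndGet(waited);
        acquisitions.incrementAndGet();
        return waited;
    }

    // Returns the connection slot to the pool.
    public void release() {
        permits.release();
    }

    public int queueSize() {
        return queued.get();
    }

    public long acquisitionCount() {
        return acquisitions.get();
    }

    public long totalWaitNanos() {
        return totalWaitNanos.get();
    }
}
```

In a real implementation these values would be emitted periodically through the existing metrics emitter rather than read through getters, and the measurement points would need to sit inside the HTTP client's own connection and queue handling.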

In summary, adding these metrics on the client side will provide us with critical insights into the connection acquisition process and queue sizes, enabling us to more effectively identify and address latency bottlenecks within Druid.
