Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram buckets for connection metrics should be longer than request buckets #4922

Closed
samsp-msft opened this issue Oct 5, 2023 · 7 comments · Fixed by #5008
Closed

Histogram buckets for connection metrics should be longer than request buckets #4922

samsp-msft opened this issue Oct 5, 2023 · 7 comments · Fixed by #5008
Labels
enhancement New feature or request needs-spec-change Issues which require the OpenTelemetry Specification to clarify or define behavior

Comments

@samsp-msft
Copy link

Specific bucket definitions for request and connection duration based metrics

The OTel instrumentation libraries are adding a histogram bucket to the view for metrics. The bucket sizes are good for requests which should be very quick - it maxes out at 10s. With Http/1.1 and Http/2, a well behaved system will typically use a connection for multiple requests - and the longer the connection the better. So the histogram buckets used for requests don't scale up well for connections.

The proposal is to have a different bucket definition for connection metrics that tops out at a higher size, such as
[0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 300] based on a unit of seconds.

Which gives an upper bound of 5 mins. It could potentially be scaled higher, but if the goal is to measure http connection durations, then it should give a much better range. The primary concern for Http connections is to reduce the number of short-lived connections that are reestablished, rather than keeping a single one open and re-using it.

This bucketing should be applied to

@samsp-msft samsp-msft added the enhancement New feature or request label Oct 5, 2023
@utpilla
Copy link
Contributor

utpilla commented Oct 6, 2023

@samsp-msft Could you please also update the description of these metrics on the semantic conventions repo to include the histogram bucket information? You could refer to this as an example.

@samsp-msft
Copy link
Author

See #336

@cijothomas cijothomas added the needs-spec-change Issues which require the OpenTelemetry Specification to clarify or define behavior label Oct 6, 2023
@samsp-msft
Copy link
Author

@samsp-msft Could you please also update the description of these metrics on the semantic conventions repo to include the histogram bucket information? You could refer to this as an example.

I would suggest that rather than having specific bucket definitions for each histogram, that there are standardized bucket definitions that should be used based on the expected behavior of the metric. That should probably include:

  • Single request: used for HTTP requests, DNS lookup, database, RPC where the typical results will be in low ms, but could in extreme cases be seconds.
  • Semi-durable connection: used for Http & signalr connections, which can last from miliseconds through to several minutes or even hours.

@utpilla
Copy link
Contributor

utpilla commented Oct 17, 2023

I would suggest that rather than having specific bucket definitions for each histogram, that there are standardized bucket definitions that should be used based on the expected behavior of the metric.

@samsp-msft If you're proposing that this guidance should come from a more central place in the spec than docs/dotnet as proposed in this PR, then please open up an issue on the spec repo for that.

@utpilla
Copy link
Contributor

utpilla commented Oct 31, 2023

@JamesNK
Copy link
Contributor

JamesNK commented Nov 6, 2023

@utpilla What is the branch for 1.7.0? Is it main, or is there a 1.7.0 branch that the change needs to be ported to?

@utpilla
Copy link
Contributor

utpilla commented Nov 7, 2023

@utpilla What is the branch for 1.7.0? Is it main, or is there a 1.7.0 branch that the change needs to be ported to?
@JamesNK It's the main branch. The tentative date for 1.7.0 stable release is November 30.

IEvangelist added a commit to dotnet/docs that referenced this issue Feb 1, 2024
…ction histograms (#37811)

* Updating to include the metrics bucket information for open-telemetry/opentelemetry-dotnet#4922

* PR feedback

* Removing 0 bucket
Updated to square brackets to be in sync with the OTel docs.

---------

Co-authored-by: Sam Spencer <samsp@microsoft.com>
Co-authored-by: David Pine <david.pine@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request needs-spec-change Issues which require the OpenTelemetry Specification to clarify or define behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants