[receiver/httpcheck] Proposal: Add DNS Duration Timing Metric #34987

Open
StephanSalas opened this issue Sep 3, 2024 · 2 comments
Labels
enhancement New feature or request needs triage New item requiring triage receiver/httpcheck HTTP Check receiver


StephanSalas commented Sep 3, 2024

Component(s)

receiver/httpcheck

Is your feature request related to a problem? Please describe.

I understand that I am able to receive the following metrics from the current httpcheck receiver:

  httpcheck.status:
    description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
    enabled: true
    sum:
      value_type: int
      aggregation_temporality: cumulative
      monotonic: false
    unit: "1"
    attributes: [http.url, http.status_code, http.method, http.status_class]
  httpcheck.duration:
    description: Measures the duration of the HTTP check.
    enabled: true
    gauge:
      value_type: int
    unit: ms
    attributes: [http.url]
  httpcheck.error:
    description: Records errors occurring during HTTP check.
    enabled: true
    sum:
      value_type: int
      aggregation_temporality: cumulative
      monotonic: false
    unit: "{error}"
    attributes: [http.url, error.message]

However, these metrics give little visibility into the individual phases of the HTTP call and their latencies. I would like to scope this feature request to measuring only the DNS portion of the request. I want to know how long DNS host lookups take to respond, which is particularly important for high-performance API systems.

Describe the solution you'd like

Add a DNS lookup duration metric named httpcheck.dnslookup_duration, e.g.:

    httpcheck.dnslookup_duration:
      description: Measures the duration of the DNS Component of the HTTP check.
      enabled: true
      gauge:
        value_type: int
      unit: ms
      attributes: [http.url]

This metric would be emitted only when the httpcheck endpoint is a hostname (FQDN). If the endpoint is an IP address, no DNS lookup is performed, so no value would be reported. The new metric could also be optional, so that the user controls whether it is enabled.
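To illustrate the idea, here is a minimal sketch of how the DNS phase could be timed with the hooks in Go's standard `net/http/httptrace` package. It is not the code from the pull request: the names `dnsTimer`, `needsDNSLookup`, and `checkOnce` are hypothetical, and the demo in `main` uses a local `httptest` server (an IP endpoint) only so that it runs without external network access.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"net/url"
	"sync"
	"time"
)

// dnsTimer (hypothetical helper) records the DNS lookup duration of one request.
// A mutex guards the fields because trace hooks may run on other goroutines.
type dnsTimer struct {
	mu       sync.Mutex
	start    time.Time
	elapsed  time.Duration
	observed bool // stays false when no lookup happened, e.g. for an IP endpoint
}

// trace returns httptrace hooks that time the DNS phase.
func (t *dnsTimer) trace() *httptrace.ClientTrace {
	return &httptrace.ClientTrace{
		DNSStart: func(httptrace.DNSStartInfo) {
			t.mu.Lock()
			defer t.mu.Unlock()
			t.start = time.Now()
		},
		DNSDone: func(info httptrace.DNSDoneInfo) {
			t.mu.Lock()
			defer t.mu.Unlock()
			if info.Err == nil { // assumption: only successful lookups are recorded
				t.elapsed = time.Since(t.start)
				t.observed = true
			}
		},
	}
}

func (t *dnsTimer) result() (time.Duration, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.elapsed, t.observed
}

// needsDNSLookup reports whether the endpoint host is a name rather than an IP literal.
func needsDNSLookup(endpoint string) (bool, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return false, err
	}
	host := u.Hostname()
	if host == "" {
		return false, errors.New("endpoint has no host: " + endpoint)
	}
	return net.ParseIP(host) == nil, nil
}

// checkOnce issues one GET and returns the DNS duration and whether a lookup was observed.
func checkOnce(ctx context.Context, client *http.Client, endpoint string) (time.Duration, bool, error) {
	timer := &dnsTimer{}
	ctx = httptrace.WithClientTrace(ctx, timer.trace())
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return 0, false, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, false, err
	}
	resp.Body.Close()
	elapsed, observed := timer.result()
	return elapsed, observed, nil
}

func main() {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	needs, err := needsDNSLookup(srv.URL)
	if err != nil {
		fmt.Println("bad endpoint:", err)
		return
	}
	elapsed, observed, err := checkOnce(context.Background(), srv.Client(), srv.URL)
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Printf("needs DNS: %t, lookup observed: %t, duration: %d ms\n",
		needs, observed, elapsed.Milliseconds())
}
```

One design point to keep in mind: the DNS hooks fire only when a new connection is dialed, so a check that reuses a keep-alive connection would report no lookup. Timing DNS on every check would likely require disabling keep-alives or using a fresh transport per check.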

Describe alternatives you've considered

Without this metric, there is no way to know which phase of the Layer-7 call flow is causing latency. If DNS accounts for most of the latency, the receiver currently gives no indication of that.

Additional context

I have a pull request ready that can do this. I will reference this issue and go from there.

@StephanSalas StephanSalas added enhancement New feature or request needs triage New item requiring triage labels Sep 3, 2024
@github-actions github-actions bot added the receiver/httpcheck HTTP Check receiver label Sep 3, 2024

github-actions bot commented Sep 3, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@StephanSalas StephanSalas changed the title HttpCheck Receiver - Add DNS Timing Metric [receiver/httpcheck] Proposal: Add DNS Timing Metric Sep 3, 2024
@StephanSalas StephanSalas changed the title [receiver/httpcheck] Proposal: Add DNS Timing Metric [receiver/httpcheck] Proposal: Add DNS Duration Timing Metric Sep 3, 2024

StephanSalas commented Sep 4, 2024

@codeboten, when you get a chance, any thoughts on this idea? I think this would be a minimal change that adds real incremental value to the receiver, since many HTTP calls rely on DNS.

In the future, I would also love to add duration metrics for the TCP/TLS connection dial, but that may be overkill. I will not add that to the scope of this issue, but I am curious about your thoughts on it as well.
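As a rough sketch of what that follow-up could look like, the same `net/http/httptrace` package exposes `ConnectStart`/`ConnectDone` and `TLSHandshakeStart`/`TLSHandshakeDone` hooks. The `phaseTimer` type and the phase names "tcp" and "tls" below are hypothetical, and the demo runs against a local `httptest` TLS server purely for illustration.

```go
package main

import (
	"context"
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"sync"
	"time"
)

// phaseTimer (hypothetical helper) records how long named request phases take.
type phaseTimer struct {
	mu        sync.Mutex
	starts    map[string]time.Time
	durations map[string]time.Duration
}

func newPhaseTimer() *phaseTimer {
	return &phaseTimer{
		starts:    map[string]time.Time{},
		durations: map[string]time.Duration{},
	}
}

// begin stores the first start time for a phase. ConnectStart can fire more
// than once when several addresses are dialed, so later starts are ignored.
func (p *phaseTimer) begin(name string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, seen := p.starts[name]; !seen {
		p.starts[name] = time.Now()
	}
}

// end stores the elapsed time for a phase that was previously begun.
func (p *phaseTimer) end(name string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if start, ok := p.starts[name]; ok {
		p.durations[name] = time.Since(start)
	}
}

// get returns the recorded duration and whether the phase completed.
func (p *phaseTimer) get(name string) (time.Duration, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	d, ok := p.durations[name]
	return d, ok
}

// trace wires the TCP connect and TLS handshake hooks to the timer.
func (p *phaseTimer) trace() *httptrace.ClientTrace {
	return &httptrace.ClientTrace{
		ConnectStart: func(_, _ string) { p.begin("tcp") },
		ConnectDone: func(_, _ string, err error) {
			if err == nil {
				p.end("tcp")
			}
		},
		TLSHandshakeStart: func() { p.begin("tls") },
		TLSHandshakeDone: func(_ tls.ConnectionState, err error) {
			if err == nil {
				p.end("tls")
			}
		},
	}
}

func main() {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	timer := newPhaseTimer()
	ctx := httptrace.WithClientTrace(context.Background(), timer.trace())
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
	if err != nil {
		fmt.Println("bad request:", err)
		return
	}
	resp, err := srv.Client().Do(req) // srv.Client() trusts the test certificate
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	resp.Body.Close()

	for _, phase := range []string{"tcp", "tls"} {
		if d, ok := timer.get(phase); ok {
			fmt.Printf("%s: %s\n", phase, d)
		}
	}
}
```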
