Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profiler: avoid metrics profile log noise when stopping profiling #2865

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nsrip-dd
Copy link
Contributor

@nsrip-dd nsrip-dd commented Sep 13, 2024

What does this PR do?

The metrics profiler insisted on at least one second between collections
for two reasons:

  • To avoid a division by zero because it was doing integer division to
    convert a time.Duration to seconds, which will truncate to 0, as a
    ratio in a subsequent computation
  • In case "a system clock issue causes time to run backwards"

The profiler would report an error if less than one second elapsed
between collections. In practice, this resulted in misleading error logs
because it's entirely likely for profiling to be stopped less than a
second after the last profile collection.

The restriction was not really even needed. For one, we can just do
floating-point division rather than integer division to avoid the
truncation problem.

Also, Go has had monotonic time support by default since 2017, added in
Go 1.9, and time comparison operations including time.Time.Sub, work
with respect to monotonic time. We shouldn't have any issues with
negative periods. We can ensure the period is positive just as a
defensive measure, and fail if it's negative since this may indicate a
bug in the Go runtime if it's violating the monotonicity guarantees.

Motivation

Reduce log noise. This log has been misleading in past escalations, with
users/support assuming the error was relevant to the actual issue being
investigated.

Fixes #2863

@pr-commenter
Copy link

pr-commenter bot commented Sep 13, 2024

Benchmarks

Benchmark execution time: 2024-09-19 18:53:55

Comparing candidate commit f5d8369 in PR branch nick.ripley/fix-metrics-profile-error with baseline commit e081e4a in branch main.

Found 0 performance improvements and 4 performance regressions! Performance is the same for 55 metrics, 0 unstable metrics.

scenario:BenchmarkInjectW3C-24

  • 🟥 execution_time [+205.722ns; +233.878ns] or [+5.091%; +5.787%]

scenario:BenchmarkSetTagMetric-24

  • 🟥 execution_time [+10.124ns; +12.316ns] or [+8.598%; +10.458%]

scenario:BenchmarkSetTagString-24

  • 🟥 execution_time [+6.049ns; +8.771ns] or [+5.359%; +7.771%]

scenario:BenchmarkSetTagStringer-24

  • 🟥 execution_time [+5.436ns; +8.204ns] or [+3.830%; +5.779%]

The metrics profiler insisted on at least one second between collections
for two reasons:

- To avoid a division by zero because it was doing integer division to
  convert a time.Duration to seconds, which will truncate to 0, as a
  ratio in a subsequent computation
- In case "a system clock issue causes time to run backwards"

The profiler would report an error if less than one second elapsed
between collections. In practice, this resulted in misleading error logs
because it's entirely likely for profiling to be stopped less than a
second after the last profile collection.

The restriction was not really even needed. For one, we can just do
floating-point division rather than integer division to avoid the
truncation problem.

Also, Go has had monotonic time support by default since 2017, added in
Go 1.9, and time comparison operations including time.Time.Sub, work
with respect to monotonic time. We shouldn't have any issues with
negative periods. We can ensure the period is positive just as a
defensive measure, and fail if it's negative since this may indicate a
bug in the Go runtime if it's violating the monotonicity guarantees.

Fixes #2863
@nsrip-dd nsrip-dd force-pushed the nick.ripley/fix-metrics-profile-error branch from 6273290 to f5d8369 Compare September 19, 2024 18:28
@nsrip-dd nsrip-dd marked this pull request as ready for review September 19, 2024 18:39
@nsrip-dd nsrip-dd requested a review from a team as a code owner September 19, 2024 18:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Error getting metrics profile: period between metrics collection is too small
1 participant