Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create metrics in a queue which can be processed in advance of the write queue #960

Merged
merged 2 commits into from
Aug 24, 2024

Conversation

bucko909
Copy link
Contributor

@bucko909 bucko909 commented Aug 21, 2024

We've had this on a moderately sized production system for some time now and haven't spotted any problems. I started with work on #888 by @ploxiln and the original work by @piotr1212, but this approach is slightly different, and happens to have a slightly more readable diff. Differences:

  • We don't drop excessive creates, instead we just queue them like with updates. This matters a lot on our setup, where large numbers of metrics often get created in bursts, and we'd rather not lose data from them.
  • It's (more) robust to database files getting deleted under its hood. This will cause a droppedCreate rather than an exception, I think.

There are a few behaviour changes:

  • We'll create new metrics at the maximum rate possible until there are none left to create. The create bucket should likely therefore be tuned not to be able to overwhelm the write bandwidth -- though if this happens, the cache is going to be unable to keep up anyway.
    • If the update bucket is empty, this can delay creates, as the thread can be blocked there for one update even if a create is possible first. This is unlikely to delay creates by very much.
  • Empty Whisper files will now be created, as the create step doesn't write data into the resulting file.
  • Creates are now dropped only when the cache tries to fill the new metric and finds its absent. This should be exceedingly unlikely under most of the write strategies.
  • If a "live" whisper file is deleted, we'll drop all datapoints and log a dropped create when we attempt to write to it next. If there are further datapoints after this, the metric should be recreated. The old behaviour would have created the file on the write attempt.

Deprecates #888

Fixes graphite-project/graphite-web#629

The old behaviour would increase _tokens on every call to drain; now we
will do it only if there are not enough to service the request. As a
side-effect, we now have .peek(), which is cheap, non-blocking and
non-destructive.
Based on a patch at graphite-project#888
by ploxiln; I've fixed the blocking issue and also made it drop metrics
only if it comes to write them after continually failing to have enough
create tokens.

If carbon is configured with a create rate of 0, this deque will grow
indefinitely. In pretty much all other circumstances, behaviour should
be pretty similar.
@deniszh
Copy link
Member

deniszh commented Aug 24, 2024

Looks great, merging this

@deniszh deniszh merged commit ef419d0 into graphite-project:master Aug 24, 2024
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metric queries return null data until whisper file exists
2 participants