Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional removal of metrics from Prometheus PushGateway on shutdown #14935

Conversation

BartMiki
Copy link
Contributor

@BartMiki BartMiki commented Aug 31, 2023

Description

This PR adds an optional removal of Prometheus PushGateway metrics after the Prometheus emitter shutdown. The rationale for this change is that the PushGateway will remember all of the pushed metrics even if the batch job was finished a long time ago. This caused an issue with one of our clients as the PushGateway stored too many stale time series, that were already ingested by Prometheus.

We fix this issue within this repository https://github.com/deep-bi/druid-prometheus-emitter and we want to merge our fix with the Druid source code.

This PR only modifies the code of the Prometheus emitter extension. We extend a configuration with one more property and we add this information to the extension README.

Release note

Druid operators can specify optional prometheus emitter configs to delete metrics from the push gateway on task shutdown when the pushgateway strategy is configured:

  1. druid.emitter.prometheus.deletePushGatewayMetricsOnShutdown: When set to true, the peon tasks deletes metrics from the prometheus push gateway on task shutdown. Default value is false.
  2. druid.emitter.prometheus.waitForShutdownDelay: Time in milliseconds to wait for peon tasks to delete metrics from the Pushgateway on shutdown (e.g. 60_000). Applicable only when druid.emitter.prometheus.deletePushGatewayMetricsOnShutdown is set to true. Default value is none - i.e., there is no delay between peon task shutdown and metrics deletion from the push gateway.

Key changed/added classes in this PR
  • PrometheusEmitter
  • PrometheusEmitterConfig

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, @BartMiki !
I have left some minor comments, otherwise the changes look good to me.

| property | description | required? | default |
|--------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|--------------------------------------|
| `druid.emitter.prometheus.strategy` | The strategy to expose prometheus metrics. <br/>Should be one of `exporter` and `pushgateway`. Default strategy `exporter` would expose metrics for scraping purpose. Peon tasks (short-lived jobs) should use `pushgateway` strategy. | yes | exporter |
| `druid.emitter.prometheus.port` | The port on which to expose the prometheus HTTPServer. Required if using `exporter` strategy. | no | none |
Copy link
Contributor

@abhishekrb19 abhishekrb19 Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please revert this table formatting change? It makes it hard to see what changed in the diff. If you sync master, it has Intellij settings for FORMAT_TABLES that will not format and wrap long lines in the table automatically - https://github.com/apache/druid/blob/master/dev/druid_intellij_formatting.xml#L78

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I add this in the following commit

Copy link
Contributor

@abhishekrb19 abhishekrb19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. I left a few comments and test cases to add. Thanks, @BartMiki!

@BartMiki
Copy link
Contributor Author

BartMiki commented Dec 8, 2023

@abhishekrb19 Hi! I update the PR based on your comments

Copy link
Contributor

@abhishekrb19 abhishekrb19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left some minor comments. Thanks, @BartMiki!

@BartMiki
Copy link
Contributor Author

@abhishekrb19 Thank you for the review! I've addressed your comments. Let me know if there is something else to change.

@abhishekrb19
Copy link
Contributor

@BartMiki, looks good -- thanks for the patch and addressing review feedback!

@abhishekrb19 abhishekrb19 merged commit 4670a76 into apache:master Dec 13, 2023
83 checks passed
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants