Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch-all Regex for JXM -> Prom Exporter #12073

Merged
merged 5 commits into from
Jan 16, 2024

Conversation

suddendust
Copy link
Contributor

Recently, we detected that the catch-all label was incorrect - It failed to capture a metric for which a specific regex had not been added. It also does not capture metrics that contain the kafka topic in the name. This PR fixes these problems. Commented regexes are captured with catch-all, I have kept them as they can come in handy later.

Commented metrics are exported successfully:

Screenshot 2023-11-30 at 12 37 27 PM Screenshot 2023-11-30 at 12 38 46 PM Screenshot 2023-11-30 at 12 39 44 PM Screenshot 2023-11-30 at 12 40 03 PM Screenshot 2023-11-30 at 12 40 19 PM Screenshot 2023-11-30 at 12 40 32 PM Screenshot 2023-11-30 at 12 40 47 PM

# Patterns after this line may be skipped.
- pattern: "\"?org\\.apache\\.pinot\\.common\\.metrics\"?<type=\"?\\w+\"?, name=\"?pinot\\.(\\w+)\\.(\\w+)_(OFFLINE|REALTIME)\\.(\\w+)\"?><>(\\w+)"
name: "pinot_$1_$4_$5"
- pattern: "\"?org\\.apache\\.pinot\\.common\\.metrics\"?<type=\"?\\w+\"?, name=\"?pinot\\.(\\w+)\\.(\\w+)\\.(\\w+)_(OFFLINE|REALTIME)\\.(\\w+)\"?><>(\\w+)"
Copy link
Contributor

@KKcorps KKcorps Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the existing catch all? I feel like the old one might be capturing something which may get lost with this new

It's better imo to include this one as a seperate rule

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed.

@Jackie-Jiang
Copy link
Contributor

Out of the scope of this PR, will these regex increase the memory footprint of the metrics? I've been observing saw-shape memory usage on servers without query load, and suspecting that is caused by the metrics. #8232 listed some previous work to reduce the memory footprint, which might provide some insights.

Copy link
Contributor

@zhtaoxiang zhtaoxiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also update docker/images/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml file

@@ -157,15 +157,37 @@ rules:

## Metrics that fit the catch-all patterns above should not be added to this file.
## In case a metric does not fit the catch-all patterns, add them before this comment

# This is a catch-all pattern for pinot table metrics with offline/realtime suffix.
# This is a catch-all pattern for pinot table metrics with offline/realtime suffix without kafka topic
# Patterns after this line may be skipped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I feel that we should add this comment for each catch-all pattern so that people know the potential risks?

table: "$1"
tableType: "$2"
partition: "$3"
#- pattern: "\"org.apache.pinot.common.metrics\"<type=\"ServerMetrics\", name=\"pinot.server.llcPartitionConsuming.([^\\.]*?)_(OFFLINE|REALTIME)\\-(.+)\\-(\\w+)\"><>(\\w+)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just want to confirm: with those newly-added patterns, those commented out ones will remain the same, right? We don't want to introduce breaking changes 😛

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that has been tested. Have explained in the PR description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented metrics are exported successfully

@suddendust
Copy link
Contributor Author

  • Added these catch-all regexes in pinot.yml.
  • Have tested all commented regexes by running the exporter locally, all such metrics are being exported.

Comment on lines 34 to 41
#- pattern: "\"org.apache.pinot.common.metrics\"<type=\"ServerMetrics\", name=\"pinot.server.highestStreamOffsetConsumed.([^\\.]*?)_(OFFLINE|REALTIME)\\-(.+)\\-(\\w+)\"><>(\\w+)"
# name: "pinot_server_highestStreamOffsetConsumed_$5"
# cache: true
# labels:
# table: "$1"
# tableType: "$2"
# topic: "$3"
# partition: "$4"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need this? can we delete?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and also can we remove other commented code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@gortiz
Copy link
Contributor

gortiz commented Jan 16, 2024

Out of the scope of this PR, will these regex increase the memory footprint of the metrics? I've been observing saw-shape memory usage on servers without query load, and suspecting that is caused by the metrics. #8232 listed some previous work to reduce the memory footprint, which might provide some insights.

It is difficult to analyze why #8232 increased efficiency given it:

  1. Doesn't seem to include the older rules (which seem to be hardcoded before Tune prometheus config #8232.
  2. Adds one rule per metric but also adds cache. In the

I would say it is quite more expensive to process hundreds of very large regexps with the same prefix than a couple of them with correct quantifiers. And it is obviously easier for us to write.

@walterddr walterddr merged commit ae55a7a into apache:master Jan 16, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants