MULTIARCH-4294: Implement prometheus metrics for the pod placement controller #257
@aleskandro: This pull request references MULTIARCH-4294 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
87e8bd2 to 9a6a768
Images provided under gcr.io/kubebuilder/ will be unavailable from March 18, 2025. Projects initialized with Kubebuilder v3.14 or lower use gcr.io/kubebuilder/kube-rbac-proxy to protect the metrics endpoint. Following the work in kubernetes-sigs/kubebuilder#4003, this commit removes the kube-rbac-proxy container and lets the controller's main container expose the metrics over HTTPS, using the WithAuthenticationAndAuthorization filter. It also includes a minor fix in BuildService that was missed while resolving conflicts during a rebase. Related to kubernetes-sigs/kubebuilder#3871
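To illustrate the idea of a filter wrapped around the metrics handler, here is a minimal standard-library sketch. It is not the controller-runtime implementation and not the code in this PR: `withBearerToken`, `metricsHandler`, `probe`, the static token, and the metric name are all hypothetical. As far as I understand, the real filter delegates authentication and authorization to the Kubernetes API server instead of comparing a fixed token.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withBearerToken is a hypothetical stand-in for an authentication filter:
// it wraps next and rejects requests whose Authorization header does not
// carry the expected bearer token.
func withBearerToken(expected string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+expected {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// metricsHandler serves a fixed payload in the Prometheus text format.
// The metric name is illustrative only.
func metricsHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "example_requests_total 1")
	})
}

// probe sends an in-memory GET /metrics request and returns the status code.
func probe(h http.Handler, token string) int {
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := withBearerToken("s3cret", metricsHandler())
	fmt.Println(probe(h, ""))       // request without a token
	fmt.Println(probe(h, "s3cret")) // request with the expected token
}
```

In a real deployment the wrapped handler would be served over TLS (for example with `http.ListenAndServeTLS`), so that the container itself terminates HTTPS and no sidecar proxy is needed.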
f984ac3 to 82c5212
/test e2e-gcp-multi-operator-olm
/unhold
-- Distribution of the time to inspect an image
sum by(le) (rate(mto_ppo_ctrl_time_to_inspect_pod_images_seconds_bucket[5m]))
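For readers unfamiliar with the `_bucket` series queried above: Prometheus histogram buckets are cumulative, so each `le` bucket counts every observation less than or equal to its upper bound, and a final `+Inf` bucket counts all observations. The sketch below reproduces that counting in plain Go. The bucket bounds and observed durations are made up for illustration and are not the ones the operator uses.

```go
package main

import "fmt"

// cumulativeBuckets counts, for each upper bound in bounds (ascending),
// how many observations are less than or equal to it. A final +Inf bucket
// holding every observation is appended, as in Prometheus histograms.
func cumulativeBuckets(bounds, observations []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, o := range observations {
		for i, b := range bounds {
			if o <= b {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // +Inf bucket
	}
	return counts
}

func main() {
	// Hypothetical bucket bounds and inspection times, in seconds.
	bounds := []float64{0.1, 0.3, 1.0}
	observations := []float64{0.05, 0.2, 0.25, 0.8, 2.5}
	fmt.Println(cumulativeBuckets(bounds, observations)) // [1 3 4 5]
}
```

Grouping by `le`, as the query does, keeps one series per bucket bound, which is the shape a heatmap or `histogram_quantile` expects.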
I would also add a sample query that users can build SLO-based alerts on. For example:
sum(rate(mto_ppo_ctrl_time_to_process_gated_pod_seconds_bucket{le="0.3"}[5m])) by (job)
/ sum(rate(mto_ppo_wh_pods_gated_total[5m])) by (job) >= 0.90
This expresses a 90% SLO for a target duration of 300ms.
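The arithmetic behind that expression can be sketched as follows: divide the rate of pods processed within the target duration by the rate of all gated pods, then compare the ratio with the objective. The function name and the sample rates below are hypothetical.

```go
package main

import "fmt"

// meetsSLO returns the fraction of events completed within the target
// duration and whether that fraction reaches the objective. withinTarget
// and total are rates (or counts) measured over the same window.
func meetsSLO(withinTarget, total, objective float64) (float64, bool) {
	if total == 0 {
		// No traffic: the ratio is undefined. It is treated as "not met"
		// here, mirroring PromQL, where a NaN ratio fails the comparison.
		return 0, false
	}
	ratio := withinTarget / total
	return ratio, ratio >= objective
}

func main() {
	// Hypothetical rates: 46 of 50 gated pods per window processed within 0.3s.
	ratio, ok := meetsSLO(46, 50, 0.90)
	fmt.Printf("%.2f %v\n", ratio, ok) // 0.92 true

	// Hypothetical rates: only 40 of 50 processed within 0.3s.
	ratio, ok = meetsSLO(40, 50, 0.90)
	fmt.Printf("%.2f %v\n", ratio, ok) // 0.80 false
}
```

Note that a Prometheus alert fires when its expression returns results, so an alerting rule built on this ratio would typically use the inverted comparison (`< 0.90`) to fire when the objective is missed.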
Hi @Prashanth684, I would leave AlertRules and alert suggestions out of this PR. I started looking at them, but I would rather collect more data and implement the inspection caching first. Otherwise, we risk false-positive alerts that pop up for users too easily.
7c0dfa4 to ce2656f
/test e2e-gcp-multi-operator-olm
…ing operator to get metrics
…mit updates to only when the resourceVersion changes
ce2656f to dfd55aa
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: aleskandro. The full list of commands accepted by this bot can be found here. The pull request process is described here.
/test e2e-gcp-multi-operator-olm
/retest
/override "Red Hat Konflux / multiarch-tuning-operator-enterprise-contract / multiarch-tuning-operator"
@aleskandro: Overrode contexts on behalf of aleskandro: Red Hat Konflux / multiarch-tuning-operator-enterprise-contract / multiarch-tuning-operator In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/test e2e-gcp-multi-operator-olm
@aleskandro: all tests passed! Full PR test history. Your PR dashboard.
Merged commit 2b3f70d into openshift:main
This PR implements the Prometheus metrics for the pod placement controller.
Because kube-rbac-proxy, a key component for serving metrics, is being deprecated, this PR also addresses MULTIARCH-4989.
See kubernetes-sigs/kubebuilder#3871 for further details.
Metrics are documented in the docs/metrics.md file and an example Grafana Dashboard is provided here.
Finally, this PR implements some performance optimizations: