-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update otelcontribcol to v0.106.0-gke.2 #1369
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@config-sync-bot: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/retest |
/retest-required |
The failure is due to the two For the first one we could disable the feature gate as an equivalent of using 0.0.0.0. The second one can be mitigated with a temporary feature gate disable, but still requires change in ConfigMap to match the style when the previous format is deprecated. |
d70cb8d
to
d54d160
Compare
@@ -231,6 +231,8 @@ spec: | |||
- ALL | |||
- args: | |||
- --config=/conf/otel-agent-config.yaml | |||
- --feature-gates=-component.UseLocalHostAsDefaultHost | |||
- --feature-gates=-pkg.translator.prometheus.NormalizeName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's pkg.translator.prometheus.NormalizeName for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for dropping the feature gate AKA disabling it, to revert to a simple fashion of name sanitizing, I realized metrics from reconilers and reconciler-manager are using this but not the resource-group-controller so adding it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this specific change in a different PR? Seems like a bug fix. Need to explain what it actually changes and what was broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just removed the flag. Took a closer look - resource group controller only has gauge and distribution metrics, and the names are only composed with letters and underscores, so even with the normalizer on, no conversion is happening. Plus the test cases are not checking the r-g-c metrics, so no observations if there's a mismatch in name somehow.
We could add the flag just to be safe and align with other deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to add it in a new PR just for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohhhh, I see, change was made a while back to actually enable this feature, but adding a -
disabled it instead. f3e2a5e
To re-enable it we could just remove the feature-gate flag as the feature is default on. But the trim & append will start to work and e2e test would fail (just ran a quick test).
From my reading of the doc it'd be better to have the normalizer running to convert the names for the favor of the Prometheus exporter/importer, which would break our e2e by a bit as it explicitly looks for the metric names defined and registered upon creation.
I lean keeping it as is and ignoring the naming, since any user with Prometheus operator would probably already hooked with the current naming.
Or we have a thorough & proper enablement of the feature.
WDYT @karlkfi
To upgrade to otelcontribcol 0.106.0 we need to mitigate two breaking changes in https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.104.0. The opencensus receiver depends on 0.0.0.0 so disabling feature gate component.UseLocalHostAsDefaultHost. With style `$FOO` deprecation, all env vars are changed to match the style. This brings a problem in the otel-agent container start when reconciler-manager and resource-group-controller do not initialize the following env vars: - `configsync_sync_kind` - `configsync_sync_name` - `configsync_sync_namespace` So splitting the otel-agent config between the reconciler and controllers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: config-sync-bot, karlkfi, nan-yu, sdowell The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
91e1b14
into
GoogleContainerTools:main
…rTools#1369)" This reverts commit 91e1b14. Reason for revert: This change breaks TestOtelCollectorDeployment
…erTools#1369)" (GoogleContainerTools#1376) Due to a refactor in the upstream, the metric transform processor no longer takes `label_set:[]` as a hint to remove all labels, instead all labels fall through and get exported. See [issue](open-telemetry/opentelemetry-collector-contrib#34430) for details. This change is to reapply the upgrade to 0.106.0 to catch up with latest otelcollector but adding the following workaround: A label `no_op_label` is given to the `label_set` of the metrics that need all labels removed. In this case the [re-slice function in the core lib](https://github.com/open-telemetry/opentelemetry-collector/blob/main/pdata/pcommon/slice.go#L125) will filter out all other valid labels of the metric. As for the no_op_label nothing will be performed as it's not part of the original attributes. This reverts commit 1d33f46.
Upgrading to latest for vulnerability fix.
To upgrade to otelcontribcol 0.106.0 we need to mitigate two breaking changes in https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.104.0.
The opencensus receiver depends on 0.0.0.0 so disabling feature gate component.UseLocalHostAsDefaultHost.
With style
$FOO
deprecation, all env vars are changed to match the style. This brings a problem in the otel-agent container start when reconciler-manager and resource-group-controller do not initialize the following env vars:configsync_sync_kind
configsync_sync_name
configsync_sync_namespace
So splitting the otel-agent config between the reconciler and controllers.