Define semantic conventions and instrumentation stability #2180

tigrannajaryan · 2021-11-30T23:52:38Z

This PR defines the rules for SDKs in all languages with regards to marking
instrumentations as "Stable" and "Unstable" and how such instrumentations
are allowed to evolve over time.

Summary of changes:

Explain that schemas define what changes are allowed.
Clarify that additive changes are an exception and are always allowed.
Prohibit instrumentation changes until April 1, 2023 to allow recipients
(backends/vendors) to start understanding Schema URLs/Schema Files.
This gives one year advance notice, which I think should be enough for
all parties to prepare.

specification/versioning-and-stability.md

jsuereth

ADDITIVE changes to metrics can break user alerts by creating new timeseries and adjusting thresholds. I think we need to be a lot more careful around changes, particularly around metrics here.

tigrannajaryan · 2021-12-01T17:02:47Z

ADDITIVE changes to metrics can break user alerts by creating new timeseries and adjusting thresholds. I think we need to be a lot more careful around changes, particularly around metrics here.

@jsuereth I can change the PR to explicitly prohibit this but I think what you suggest contradicts with an existing understanding that attributes are optional, see this discussion #2077

I don't know which one is correct, I'll let metric expert to decide and then I am happy to adjust this PR according to that.

tigrannajaryan · 2021-12-01T19:12:21Z

ADDITIVE changes to metrics can break user alerts by creating new timeseries and adjusting thresholds. I think we need to be a lot more careful around changes, particularly around metrics here.

@jsuereth I modified the PR to disallow addition of new attributes to existing metrics. PTAL.

Changed the PR to avoid the blocking issue.

jsuereth

thanks for resolving my blocking comment! I still have some concerns, but not as strongly blocking :)

For clarity - I agree with the sentiment and goal of this PR and approve, just nits on details/scope of what's stated here, including implications on instrumentation development.

specification/versioning-and-stability.md

trask · 2021-12-10T03:45:54Z

Can you clarify how this applies to auto-instrumentation distributions?

Should we hide all telemetry producing auto-instrumentation in a distribution behind an experimental flag, and only enable them by default once their respective semantic conventions have reached stability and we can commit to telemetry stability?

tigrannajaryan · 2021-12-10T13:59:13Z

Can you clarify how this applies to auto-instrumentation distributions?

Should we hide all telemetry producing auto-instrumentation in a distribution behind an experimental flag, and only enable them by default once their respective semantic conventions have reached stability and we can commit to telemetry stability?

No, I believe it is fine to produce telemetry from auto-instrumentation. This PR only says that after you declare your auto-instrumentation "Stable" you cannot change the produced telemetry until certain conditions are met (SchemaURL in the output and wait until certain date). If the text in the PR is unclear I will fix it. Let me know which part you find unclear.

specification/versioning-and-stability.md

trask · 2021-12-10T19:59:40Z

Can you clarify how this applies to auto-instrumentation distributions?
Should we hide all telemetry producing auto-instrumentation in a distribution behind an experimental flag, and only enable them by default once their respective semantic conventions have reached stability and we can commit to telemetry stability?

No, I believe it is fine to produce telemetry from auto-instrumentation. This PR only says that after you declare your auto-instrumentation "Stable" you cannot change the produced telemetry until certain conditions are met (SchemaURL in the output and wait until certain date). If the text in the PR is unclear I will fix it. Let me know which part you find unclear.

For individual instrumentations, we use the version number to "clearly mark" them as unstable, e.g. 0.x or 1.x-alpha.

For distributions (e.g. the opentelemetry javaagent), should we be marking them as "unstable" (e.g. 0.x or 1.x-alpha) if they emit any unstable instrumentation?

tigrannajaryan · 2021-12-10T20:23:18Z

For individual instrumentations, we use the version number to "clearly mark" them as unstable, e.g. 0.x or 1.x-alpha.

For distributions (e.g. the opentelemetry javaagent), should we be marking them as "unstable" (e.g. 0.x or 1.x-alpha) if they emit any unstable instrumentation?

The "unstable" vs "stable" labeling of a library or of a package (or individual instrumentations if that's the granularity of labeling you choose to maintain) is done based on the fact whether you are able to take the responsibility to fulfill the requirements of the "stable" labeling. What "stable" label means is described in this document.

Particularly with regards to telemetry emitted by the instrumentation this document (after accepting this PR) says that if you label the instrumentation "stable" then you must fulfil the following obligation:

If you are not including Schema URL in the emitted telemetry then you are not allowed to modify the telemetry (with certain exceptions carved out in the doc).
If you are including Schema URL then you are allowed to modify the telemetry after certain date, provided that they are described by a published Schema File (with the same exceptions as above applicable).

If you are not labeling the instrumentation as "stable" then you free to do whatever you want. Any changes are fine.

trask · 2021-12-10T20:37:55Z

The "unstable" vs "stable" labeling of a library or of a package (or individual instrumentations if that's the granularity of labeling you choose to maintain) is done based on the fact whether you are able to take the responsibility to fulfill the requirements of the "stable" labeling. What "stable" label means is described in this document.

Thanks @tigrannajaryan. Given that we don't want to mark the entire javaagent distribution as "unstable", it sounds like we will need to disable all "unstable" auto-instrumentation in the distribution by default, and only enable them (by default) once their respective semantic conventions have reached stability and we can commit to the telemetry stability described in this document.

jmacd

There are a number of questions about metrics here. The Views support in each metrics SDK should show it's possible to perform attribute removal safely (correctly, and relatively easily compared with doing so after-the-fact). This topic came up in #2208. I expect we will eventually be able to support metrics schema translation via Views and relax these restrictions, but it's good to be conservative for now.

specification/telemetry-stability.md

tigrannajaryan · 2022-04-04T14:26:07Z

@Oberon00 please take another look.

tigrannajaryan · 2022-04-04T14:30:12Z

@open-telemetry/python-maintainers can you please take a look? This affects instrumentations in all languages. It would be good to have your approval if you agree with this approach.

tigrannajaryan · 2022-04-04T19:43:30Z

@dyladan if the PR looks good to you please approve on behalf of JS SIG. I want to have an approval from each (or at least most) language SIG.

dyladan

We discussed this at the JS meeting last wednesday and the other @open-telemetry/javascript-maintainers brought up the same points as have been brought up by myself, @alanwest, and others around instrumentation which is not in the semantic conventions. There will always be some instrumentations which for whatever reason are not covered by the semantic conventions whether they have too small of a userbase, are too specific, or provide some custom functionality. This PR seems to prevent those instrumentations from ever being marked as "stable" which will limit their adoption.

From a maintenance standpoint it is not too big of a burden for maintainers to gate stable on the guarantees outlined here, but I believe the restrictions will prevent people from upstreaming contributions to the contrib repo unless we have some way to provide custom functionality that isn't covered by the semantic conventions.

As mentioned earlier, this is still marked experimental so isn't blocking, but I want to make sure the point isn't forgotten before this is finalized and made stable.

tigrannajaryan · 2022-04-04T20:26:19Z

From a maintenance standpoint it is not too big of a burden for maintainers to gate stable on the guarantees outlined here, but I believe the restrictions will prevent people from upstreaming contributions to the contrib repo unless we have some way to provide custom functionality that isn't covered by the semantic conventions.

As mentioned earlier, this is still marked experimental so isn't blocking, but I want to make sure the point isn't forgotten before this is finalized and made stable.

Make sense. I will think about we can make this explicitly allowed. To be clear, even in this current form we do not completely prevent Stable implementations which break the rules, since I used the "SHOULD" clause, so there is always a possibility to ignore it if there is a good reason. Anyway, I will file a github issue to take care of this, if this PR gets merged (I am not yet sure, so I don't want to open an issue about a document that doesn't exist in the spec yet).

tigrannajaryan · 2022-04-04T20:27:07Z

I believe so far we have approvals from JS, Go, Java, .Net maintainers.

trask · 2022-04-04T21:08:28Z

There will always be some instrumentations which for whatever reason are not covered by the semantic conventions whether they have too small of a userbase, are too specific, or provide some custom functionality. This PR seems to prevent those instrumentations from ever being marked as "stable" which will limit their adoption.

@tigrannajaryan I thought there were two options for instrumentation that are not covered by semantic conventions to go stable:

Fixed Schema Telemetry Producers - decide not to evolve the schema via the standard schema mappings
Schema-File Driven Telemetry Producers - publish your own schema file (we have discussed doing this in Java for these one-off instrumentations that do not have a future in the specification repo)

tigrannajaryan · 2022-04-04T21:51:06Z

There will always be some instrumentations which for whatever reason are not covered by the semantic conventions whether they have too small of a userbase, are too specific, or provide some custom functionality. This PR seems to prevent those instrumentations from ever being marked as "stable" which will limit their adoption.

@tigrannajaryan I thought there were two options for instrumentation that are not covered by semantic conventions to go stable:

Fixed Schema Telemetry Producers - decide not to evolve the schema via the standard schema mappings

Schema-File Driven Telemetry Producers - publish your own schema file (we have discussed doing this in Java for these one-off instrumentations that do not have a future in the specification repo)

@trask that is correct. And for both options they should also follow this rule:

Stable instrumentations authored by OpenTelemetry SHOULD NOT produce telemetry that is not described by OpenTelemetry semantic conventions.

I think @dyladan's concern is that this is too much limitation and the rules are too rigid for instrumentations contributed by non-core maintainers.

trask · 2022-04-04T21:58:55Z

And for both options they should also follow this rule:

Stable instrumentations authored by OpenTelemetry SHOULD NOT produce telemetry that is not described by OpenTelemetry semantic conventions.

oh, I missed this, thx! I agree this is too limiting for the reason @dyladan mentions:

There will always be some instrumentations which for whatever reason are not covered by the semantic conventions whether they have too small of a userbase, are too specific, or provide some custom functionality

specification/telemetry-stability.md

Summary of changes: - Explain that schemas define what changes are allowed. - Clarify additive changes are an exception and are always allowed. - Prohibit instrumentation changes until July 1, 2022 to allow recipients (backends/vendors) to start understanding Schema URLs/Schema Files.

tigrannajaryan · 2022-04-08T17:51:26Z

I believe so far we have approvals from JS, Go, Java, .Net maintainers.

In addition we now have approvals from Ruby and Python maintainers.

I believe this is good enough and we can move forward with this approach. Merging the PR.

tigrannajaryan requested review from a team November 30, 2021 23:52

github-actions bot assigned yurishkuro Nov 30, 2021

tigrannajaryan force-pushed the semantic-convention-stability branch 3 times, most recently from db79cb2 to 9fb4cf3 Compare December 1, 2021 00:05

tigrannajaryan requested a review from tedsuo December 1, 2021 00:11

pyohannes reviewed Dec 1, 2021

View reviewed changes

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

specification/versioning-and-stability.md Show resolved Hide resolved

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

kenfinnigan reviewed Dec 1, 2021

View reviewed changes

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

tigrannajaryan force-pushed the semantic-convention-stability branch from 16d1b3b to e69ac8d Compare December 1, 2021 15:28

jsuereth previously requested changes Dec 1, 2021

View reviewed changes

tigrannajaryan force-pushed the semantic-convention-stability branch from a1545b5 to ba788d6 Compare December 1, 2021 19:12

tigrannajaryan requested a review from jsuereth December 1, 2021 19:13

jsuereth reviewed Dec 2, 2021

View reviewed changes

specification/versioning-and-stability.md Show resolved Hide resolved

specification/versioning-and-stability.md Show resolved Hide resolved

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

pyohannes reviewed Dec 2, 2021

View reviewed changes

specification/versioning-and-stability.md Show resolved Hide resolved

cijothomas mentioned this pull request Dec 2, 2021

Add schema URL support open-telemetry/opentelemetry-dotnet#2417

Open

tigrannajaryan commented Dec 6, 2021

View reviewed changes

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

jack-berg reviewed Dec 10, 2021

View reviewed changes

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

jack-berg reviewed Dec 10, 2021

View reviewed changes

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

jack-berg reviewed Dec 10, 2021

View reviewed changes

specification/versioning-and-stability.md Outdated Show resolved Hide resolved

tigrannajaryan force-pushed the semantic-convention-stability branch from 923dfd9 to 7bf6381 Compare December 10, 2021 20:03

jmacd approved these changes Dec 10, 2021

View reviewed changes

alanwest reviewed Mar 28, 2022

View reviewed changes

specification/telemetry-stability.md Show resolved Hide resolved

reyang approved these changes Apr 4, 2022

View reviewed changes

alanwest approved these changes Apr 4, 2022

View reviewed changes

dyladan approved these changes Apr 4, 2022

View reviewed changes

carlosalberto approved these changes Apr 5, 2022

View reviewed changes

cijothomas reviewed Apr 5, 2022

View reviewed changes

specification/telemetry-stability.md Show resolved Hide resolved

tigrannajaryan mentioned this pull request Apr 8, 2022

Remove alternative attribute sets from HTTP semantic conventions #2469

Merged

robertlaurin approved these changes Apr 8, 2022

View reviewed changes

fbogsany approved these changes Apr 8, 2022

View reviewed changes

srikanthccv approved these changes Apr 8, 2022

View reviewed changes

lzchen approved these changes Apr 8, 2022

View reviewed changes

tigrannajaryan added 4 commits April 8, 2022 13:48

Fixes based on PR comments

acea642

Change absolute links to relative

8622ec2

Change absolute links to relative

89ead2d

tigrannajaryan force-pushed the semantic-convention-stability branch from 7a7a37b to 89ead2d Compare April 8, 2022 17:48

Fix changelog date

8f664e7

tigrannajaryan merged commit 0ddfacd into open-telemetry:main Apr 8, 2022

tigrannajaryan deleted the semantic-convention-stability branch April 8, 2022 17:52

rauno56 mentioned this pull request May 4, 2022

Prepare instrumentation libraries for stability open-telemetry/opentelemetry-js-contrib#1001

Open

alanwest mentioned this pull request Feb 8, 2024

Why is adding new attributes to a metric a breaking change? open-telemetry/semantic-conventions#722

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Define semantic conventions and instrumentation stability #2180

Define semantic conventions and instrumentation stability #2180

tigrannajaryan commented Nov 30, 2021 •

edited

Loading

jsuereth left a comment

tigrannajaryan commented Dec 1, 2021 •

edited

Loading

tigrannajaryan commented Dec 1, 2021

jsuereth left a comment

trask commented Dec 10, 2021

tigrannajaryan commented Dec 10, 2021

trask commented Dec 10, 2021

tigrannajaryan commented Dec 10, 2021

trask commented Dec 10, 2021

jmacd left a comment

tigrannajaryan commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

dyladan left a comment

tigrannajaryan commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

trask commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

trask commented Apr 4, 2022

tigrannajaryan commented Apr 8, 2022

Define semantic conventions and instrumentation stability #2180

Define semantic conventions and instrumentation stability #2180

Conversation

tigrannajaryan commented Nov 30, 2021 • edited Loading

jsuereth left a comment

Choose a reason for hiding this comment

tigrannajaryan commented Dec 1, 2021 • edited Loading

tigrannajaryan commented Dec 1, 2021

jsuereth left a comment

Choose a reason for hiding this comment

trask commented Dec 10, 2021

tigrannajaryan commented Dec 10, 2021

trask commented Dec 10, 2021

tigrannajaryan commented Dec 10, 2021

trask commented Dec 10, 2021

jmacd left a comment

Choose a reason for hiding this comment

tigrannajaryan commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

dyladan left a comment

Choose a reason for hiding this comment

tigrannajaryan commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

trask commented Apr 4, 2022

tigrannajaryan commented Apr 4, 2022

trask commented Apr 4, 2022

tigrannajaryan commented Apr 8, 2022

tigrannajaryan commented Nov 30, 2021 •

edited

Loading

tigrannajaryan commented Dec 1, 2021 •

edited

Loading