Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[enh] Make max_accumulations configurable and add remove method for meters. #2747

Closed
wants to merge 2 commits into from

Conversation

tjiuming
Copy link

Make max_accumulations configurable and add remove method for meters.

Related PR: open-telemetry/opentelemetry-java#4647

@@ -300,6 +301,13 @@ instrument. It MUST be treated as an opaque string from the API and SDK.
* It MUST support at least 1023 characters. [OpenTelemetry
API](../overview.md#api) authors MAY decide if they want to support more.

#### Instrument max_accumulations

The `max_accumulations` is an optional free-form integer provided by the author of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "accumulation"? This seems like an implementation detail from Java?

Copy link
Author

@tjiuming tjiuming Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like Counter.Child in Prometheus. Currently, OTEL SDK limits the maximum number of Child to 2000, it should be configurable or Unlimited

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont see anything like "Counter.Child" in OTel specs. Most likely this is something very java specific.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, in Otel specs, the number of child in unlimited? accumulation's limits is just a java spec?

how about remove method? should it be added to Otel spec?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cijothomas sorry, I've read the open-telemetry/opentelemetry-java#4647 's comments carefully, max_accumulations is truly a java spec, please ignore about it.

besides, I wonder that if remove method should be added to Otel spec?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accumulation is a term internal to java's implementation. The max accumulations refers to the number of distinct attribute sets / timeseries that are allowed in memory for a particular instrument at any time. If exceeded, we log a warning and drop the measurement.

The topic of metrics SDK memory limits was punted on for the initial stable release of the metrics spec, and in java (and I believe dotnet) we opted for defensive approach with a fixed limit until the spec clears things up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! Yes .NET also uses its own mechanism, called SetMaxMetricPointsPerMetricStream.
https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/metrics/customizing-the-sdk#changing-maximum-metricpoints-per-metricstream

Would be good to have it in spec. (allowing sufficient flexibility so that we wont break existing API in .NET)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jack-berg @cijothomas If I understand you correctly, it's good to add max_accumulations and remove to spec right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know if there is a need to have remove.
I definitely like spec to have the equivalent of max_accumulations. (need a diff. name, as accumulation is java specific.) .NET named it SetMaxMetricPointsperMetricStream

@jmacd
Copy link
Contributor

jmacd commented Aug 24, 2022

I believe this is the same as #1891

@jmacd
Copy link
Contributor

jmacd commented Aug 24, 2022

OpenTelemetry has resisted adding a remove() or delete() method; I personally do not feel it is necessary. We are able to reset start timestamps and eject streams from memory, we just need to clearly write the specification.

@tjiuming
Copy link
Author

OpenTelemetry has resisted adding a remove() or delete() method; I personally do not feel it is necessary. We are able to reset start timestamps and eject streams from memory, we just need to clearly write the specification.

if there are a meter named broker_published_message_count to record each topic's incoming messages, we often code like this:

Counter counter = ...
counter.add({namespace = 'xxx', topic = 'xxx'}, 1)

after the topic deleted, I believe we need to discard the accumulation directly, not through other indirect ways:

counter.remove({namespace = 'xxx', topic = 'xxx'})

@tjiuming tjiuming requested review from cijothomas and jack-berg and removed request for cijothomas August 25, 2022 05:26
@jmacd
Copy link
Contributor

jmacd commented Aug 25, 2022

@tjiuming thanks for explaining. We should discuss these two problems separately. For the limit question, please use #1891.

Although I understand the precedent set in Prometheus and implied meaning in OpenMetrics when data is removed from a metrics pipeline, I would like OpenTelemetry to ensure correctness. In the example you gave, it is difficult to know when it is safe to remove the metric; we might for example specify that data must be reported before it can be removed, which leads to fairly difficult questions. After you remove the data, how should the data be safely removed from memory? For this question, please see #2711.

Lastly, I wonder if you've considered using delta temporality to work around this problem? The intention behind delta temporality is that any synchronous-instrument-timeseries that you stop using can be erased from memory after it is exported. If you configured your SDK to export to an OTel collector using delta temporality, the collector could be configured to address this problem. The problem still has to be solved (i.e., there's a limit on memory) but at least it's outside your SDK.

@github-actions
Copy link

github-actions bot commented Sep 2, 2022

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 2, 2022
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants