Define guidelines for messaging about excessive amounts of recorded data #1163

Open
jtmalinowski opened this issue Jul 2, 2021 · 5 comments
@jtmalinowski
Contributor

What are you trying to achieve?

Specify general guidelines for the behaviour of SDK libraries when an excessive or abnormal amount of telemetry data is being produced (e.g. a faulty loop produces too many or too long attributes).

2 decisions to be made:

  • how to inform users that data truncation was performed (or whether we should do that at all)
  • where in the specification we should put these guidelines
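
To make the first question concrete, here is a minimal sketch of the kind of truncation under discussion. The function name `apply_limits` and its parameters are hypothetical and do not mirror any SDK's API. The counts it returns are the raw material a message to the user could be built from, if we decide to send one.

```python
from typing import Dict, Tuple


def apply_limits(
    attributes: Dict[str, str],
    max_count: int,
    max_value_length: int,
) -> Tuple[Dict[str, str], int, int]:
    """Return (kept attributes, number dropped, number of values truncated)."""
    kept: Dict[str, str] = {}
    dropped = 0
    truncated = 0
    for key, value in attributes.items():
        # Once the count limit is reached, every further attribute is dropped.
        if len(kept) >= max_count:
            dropped += 1
            continue
        # Values longer than the limit are cut down to the limit.
        if len(value) > max_value_length:
            value = value[:max_value_length]
            truncated += 1
        kept[key] = value
    return kept, dropped, truncated
```

Whether `dropped` and `truncated` end up in a log line, a metric, or nowhere at all is exactly the decision this issue asks for.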

Additional context.

There seem to be a couple of conflicting opinions about how to inform users about data truncation, especially in comments on open-telemetry/opentelemetry-specification#1130.

@tigrannajaryan
Member

@jtmalinowski since you have a related PR open do you mind if I assign this issue to you?

@jtmalinowski
Contributor Author

@tigrannajaryan please do

@jtmalinowski changed the title from "Define guidelines for handling excessive amounts of recorded data" to "Define guidelines for messaging about excessive amounts of recorded data" on Jul 6, 2021
@jtmalinowski
Contributor Author

I'm sure @jmacd and @jkwatson had comments on this, and, though I may be misremembering, I think @MrAlias said something too.

Comments from RUM (browser, android, ios) perspective would certainly be very useful too: @obecny (unless you have no opinion or would like to delegate this to someone else), @ivomagi, @johnbley, @alolita.

If you have no opinion on this subject, please thumbs down this comment, so I know you saw it, thanks!

@obecny
Member

obecny commented Jul 8, 2021

Ideally we could inform the user as follows:

  1. First error occurrence: inform immediately.
  2. Second and later occurrences: count the errors and inform the user after an interval (20s, for example), showing the error and the number of times it was caught.
    For the above we could use our own counter metric, unless there is an easier way of doing that.
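
A rough sketch of the two steps above, to make the idea easier to discuss. The class name `ThrottledReporter`, the `emit` callback and the message format are all hypothetical. It uses a plain in-memory counter rather than a counter metric, and it only flushes when `report` or `flush_if_due` is called, so a real implementation would probably want a timer as well.

```python
import time
from collections import Counter
from typing import Callable, Set


class ThrottledReporter:
    """Report the first occurrence of an error at once, then batch the repeats."""

    def __init__(
        self,
        emit: Callable[[str], None],
        interval_s: float = 20.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._emit = emit              # where messages go, e.g. a logger method
        self._interval_s = interval_s  # 20s is only the example value from above
        self._clock = clock            # injectable so the behaviour can be tested
        self._seen: Set[str] = set()   # errors that were already reported once
        self._pending: Counter = Counter()  # repeats not yet reported
        self._last_flush = clock()

    def report(self, error: str) -> None:
        if error not in self._seen:
            # Step 1: the first occurrence is reported immediately.
            self._seen.add(error)
            self._emit(error)
        else:
            # Step 2: later occurrences are only counted.
            self._pending[error] += 1
        self.flush_if_due()

    def flush_if_due(self) -> None:
        now = self._clock()
        if now - self._last_flush < self._interval_s:
            return
        # Emit one summary line per error, with its repeat count.
        for error, count in self._pending.items():
            self._emit(f"{error} (repeated {count} times)")
        self._pending.clear()
        self._last_flush = now
```

For example, `ThrottledReporter(print).report("attribute value truncated")` prints the message once; further identical reports stay silent until the interval has elapsed.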

@alevenberg
Contributor

I went to the messaging WG today to discuss what happens when we reach link limits.

There is a default link limit for spans (128), but the GCP Pub/Sub library can publish batches of up to 1000 messages. To work around this, we're using the environment variable to set the link limit. If the user has it set to the default or a lower limit, we capture all of the links by creating "publish #" spans, which don't represent new work and exist only to hold the links. See the example screenshot below.

[Image: Screenshot 2023-11-16 at 11 58 57 AM]
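
The "publish #" workaround amounts to splitting a batch's links into groups no larger than the limit, one group per holder span. The sketch below shows only that arithmetic; `split_links` is a hypothetical helper, not part of the Pub/Sub library or the OpenTelemetry API, and the links are stand-in strings rather than real span links.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_links(links: Sequence[T], link_limit: int = 128) -> List[List[T]]:
    """Split links into groups of at most link_limit, one group per holder span."""
    if link_limit <= 0:
        raise ValueError("link_limit must be positive")
    return [list(links[i:i + link_limit]) for i in range(0, len(links), link_limit)]


# A batch of 1000 messages with the default limit of 128 links per span.
groups = split_links([f"message-{n}" for n in range(1000)])
print(len(groups))      # holder spans needed: 8
print(len(groups[-1]))  # links on the last holder span: 104
```

With the default limit, a full batch therefore needs eight extra spans that carry no work of their own.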

When we discussed this today, @lmolkova raised two options for solving this problem:

  1. Users can increase the number of links by specifying a larger limit.
  2. Sample the links and should preferably stay

Everyone seems to prefer option 1.

@austinlparker transferred this issue from open-telemetry/opentelemetry-specification on Jun 18, 2024
Projects
Status: Post-stability

5 participants