
Add jaeger option to split oversized batches #1475

Status: Closed · 1 commit

Conversation

@janLo (Author) commented Feb 26, 2021:

This adds OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES to
the Jaeger exporter, as proposed in
open-telemetry/opentelemetry-python#1500


@@ -98,6 +98,7 @@ See [OpenTelemetry Protocol Exporter Configuration Options](./protocol/exporter.
| OTEL_EXPORTER_JAEGER_ENDPOINT | HTTP endpoint for Jaeger traces | <!-- markdown-link-check-disable --> "http://localhost:14250"<!-- markdown-link-check-enable --> |
| OTEL_EXPORTER_JAEGER_USER | Username to be used for HTTP basic authentication | - |
| OTEL_EXPORTER_JAEGER_PASSWORD | Password to be used for HTTP basic authentication | - |
| OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES | Split Jaeger exporter UDP batches if they exceed the UDP packet size limit | false |
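For illustration, here is a minimal sketch of how an SDK could parse such a boolean environment variable. The variable name comes from the table above; the helper function, its name, and the "false" default are assumptions for this sketch, not the actual opentelemetry-python code:

```python
import os

# Hypothetical helper (not the real opentelemetry-python API):
# reads the proposed flag as a boolean, treating an unset or
# non-"true" value as false, matching the default in the table above.
def split_oversized_batches_enabled() -> bool:
    value = os.environ.get(
        "OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES", "false"
    )
    return value.strip().lower() == "true"
```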
Member commented:

Why should this be an environment variable? Can you first define this in the Jaeger exporter document? https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk_exporters/jaeger.md

You should first specify this configuration option there; then we can add an environment variable for it.

Member commented:

> Split jaeger exporter UDP batches if they exceed the UDP packet limit

This doesn't seem like it should be configurable. If it's set to false, what's the solution?

In Jaeger's own SDKs there was no batch size parameter (except in Python, for different technical reasons); there is only a packet size, and the exporter always flushes appropriately sized packets (since exceeding the size means the agent will reject them).

Author commented:

I left it configurable because further fragmenting the batches might have a performance impact. I can live without it being configurable, but this should be decided by the maintainers of the Python SDK.

Member commented:

I suppose this is a conflict between batching done above the exporter and batching done by the exporter itself. Shared/reusable batching (by number of spans) makes sense only for transports that do not impose strict message size limits. In the case of a UDP transport with a fixed max message size, the batching should really be in the transport itself, not above it with different size semantics.
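The size-based batching described here can be sketched as a greedy packer. This is purely illustrative, not the actual Jaeger or opentelemetry-python implementation; `serialized_size` stands in for whatever encoding-size function the transport uses, and the 65000-byte limit is an assumed typical agent UDP limit:

```python
UDP_PACKET_MAX_LENGTH = 65000  # assumed typical Jaeger agent UDP limit, in bytes

def split_oversized_batch(spans, serialized_size, max_length=UDP_PACKET_MAX_LENGTH):
    """Greedily pack spans into batches whose total serialized size
    fits within one UDP packet, splitting oversized batches.

    Note: a single span larger than max_length still ends up alone in
    an oversized batch; a real transport must reject or drop it.
    """
    batch, batch_size = [], 0
    for span in spans:
        size = serialized_size(span)
        # Flush the current batch before this span would overflow it.
        if batch and batch_size + size > max_length:
            yield batch
            batch, batch_size = [], 0
        batch.append(span)
        batch_size += size
    if batch:
        yield batch
```

With a 700-byte limit, for example, three spans of 400, 400, and 300 bytes would go out as two packets (400, then 400 + 300) instead of one oversized 1100-byte packet.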

Author commented:

The change is merged in the python implementation now.

Member commented:

@janLo Could you please sum up the reasons brought up by the Python maintainers for making it configurable? This could help in convincing @yurishkuro. I glanced over the issue you linked above (open-telemetry/opentelemetry-python#1500) but couldn't find an explanation.

If we can't find an agreement to put this in the spec, it would still be fine to keep this as a Python-specific peculiarity if Python maintainers want to keep it this way nevertheless. In this case, it should be prefixed with PY(THON) as specified here, however: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/sdk-environment-variables.md#language-specific-environment-variables

Member commented:

The thing that I need to be convinced of is why there is ever a need to NOT split the batches, since this will simply result in too large UDP packets that will be discarded by the server.

Author commented:

Splitting the packets has performance implications. One can argue that ending up with too-large batches is a configuration error in the first place, and we do not want to spend time sending two packets when we originally planned to send one.

In the end I'm really indifferent to the option being configurable. If you think splitting should be the default, I can provide another PR as soon as I find the time. I ran into this because I use OpenTelemetry for an e2e test system where I have quite large traces with a lot of metadata. I don't know whether other use cases trigger the issue solved by this option at all, and as said, it does add complexity and probably has performance implications along the way.

Member commented:

So performance implications of splitting batches vs. performance implications of doing all the work of collecting them and then dropping them on the floor because the agent can't process them. Seems like an easy choice :-)

@github-actions bot commented Mar 6, 2021:

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Mar 6, 2021
@janLo (Author) commented Mar 6, 2021:

What can I do to unmark it as stale?

@yurishkuro yurishkuro removed the Stale label Mar 6, 2021

@github-actions github-actions bot added the Stale label Mar 14, 2021
@arminru (Member) commented Mar 15, 2021:

> What can I do to unmark it as stale?

@janLo Please take a look at the unresolved conversation above and find an agreement there. Thanks!


@github-actions github-actions bot added the Stale label Mar 23, 2021
@arminru (Member) commented Mar 23, 2021:

Ping @janLo.
Please address the concerns raised above, i.e., reconsider whether this should be configurable at all and, if so, introduce the option generally in the Jaeger exporter spec before adding an env var for it. Thanks!

@janLo (Author) commented Mar 23, 2021:

Hello @arminru, I think the behavior should be configurable, as it has a potential performance impact. Can you point me to the spec you mean? I thought this was the exporter spec.
I'm a bit lost on how to proceed. I needed the functionality, implemented it, was asked to add documentation here, and now what? The change is already merged.

@arminru (Member) commented Mar 23, 2021:

@janLo I'm referring to this comment in particular: #1475 (comment)
In the jaeger.md file you would introduce the config option and define its behavior, and then in the file sdk-environment-variables.md you would just add the environment variable for setting it.
Also, please add entries in CHANGELOG.md and spec-compliance-matrix.md for this change (as mentioned in the PR template).
Thanks!

@github-actions github-actions bot removed the Stale label Mar 24, 2021
@janLo (Author) commented Mar 24, 2021:

@arminru I've just checked the jaeger.md content. I do not think there is anything to add there, as it does not affect the transformation at all. It only affects the chunk size (as in the count of spans) of batched span transmission, which is dynamically adjusted if the new setting is enabled. The data transmitted to Jaeger is exactly the same. In fact, the translation happens before the batching takes place.

As for spec-compliance-matrix.md, I cannot identify the right place to put this, as it does not speak of batching options at all.

@arminru arminru closed this Mar 25, 2021
@arminru arminru reopened this Mar 25, 2021
@arminru (Member) commented Mar 25, 2021:

Whoops, accidentally hit the close button - sorry!

@janLo:

> @arminru I've just checked the jaeger.md content. I do not think there is anything to add there, as it does not affect the transformation at all. It only affects the chunk size (as in the count of spans) of batched span transmission, which is dynamically adjusted if the new setting is enabled. The data transmitted to Jaeger is exactly the same. In fact, the translation happens before the batching takes place.

I'd suggest adding a new heading "Configuration options" after "Summary" and "Mappings" that would explain the config option. I think this is what @bogdandrutu was asking for, right?

> As for spec-compliance-matrix.md, I cannot identify the right place to put this, as it does not speak of batching options at all.

The env var would go into https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md#environment-variables.

@github-actions bot commented Apr 4, 2021:

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Apr 4, 2021
@github-actions bot:

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Apr 12, 2021