Per the discussion in pollinators #general, I'm opening an issue to track a feature request that would be valuable across beelines. I'm just more familiar with the Ruby beeline, so I'm opening it here.
Problem Statement
With deterministic sampling, you're generally either sending an entire trace to Honeycomb or no events at all.[1] So, to investigate usage details and fine-tune your sampling, it's helpful to know how big your traces usually are.
You could figure out the size of traces by writing a query such as `COUNT_DISTINCT(trace.span_id) GROUP BY trace.trace_id`. But because you can't then query those results (like a nested query or a `HAVING` clause), you can't do more sophisticated things such as generating a heatmap & using BubbleUp to identify traffic patterns that lead to big traces.
So, to get a better look at the trace sizes, we'd need a queryable field like `trace.size`. This would be the number of events that share the same trace ID - the cardinality of the whole tree, not just (say) the number of direct children.
Proof of Concept
Conceptually, every span that gets generated would increment the trace size. The simplest proof of concept could use the existing rollup fields feature, or a monkey-patch for those who might want to play with it in their own code despite the hackiness:
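To make the rollup mechanics concrete, here's a self-contained sketch. `ToyTrace` and `ToySpan` are illustrative stand-ins, not the beeline's real classes: each span adds 1 both to its own per-span rollup and to a trace-wide rollup that the root span merges in when it's sent.

```ruby
# Toy model of rollup-field behavior (illustrative classes only, not
# the beeline's internals).
class ToyTrace
  attr_reader :rollup_fields, :spans

  def initialize
    @rollup_fields = Hash.new(0)
    @spans = []
  end

  def new_span(parent: nil)
    span = ToySpan.new(self, parent)
    @spans << span
    span
  end
end

class ToySpan
  def initialize(trace, parent)
    @trace = trace
    @parent = parent
    @fields = {}
    add_rollup_field("trace.size", 1)  # every span counts itself
  end

  def root?
    @parent.nil?
  end

  def add_rollup_field(key, value)
    @fields[key] = @fields.fetch(key, 0) + value  # per-span rollup
    @trace.rollup_fields[key] += value            # trace-wide rollup
  end

  # The root span picks up the trace-wide totals when it's sent.
  def send_fields
    root? ? @fields.merge(@trace.rollup_fields) : @fields
  end
end

trace = ToyTrace.new
root = trace.new_span
2.times { trace.new_span(parent: root) }

root.send_fields["trace.size"]              # => 3
trace.spans.last.send_fields["trace.size"]  # => 1
```

The last line shows the catch: each non-root span carries its own rollup of 1, which is exactly the gotcha described next.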
But the way rollup fields work, this would give every non-root span a `trace.size` of 1. Then you'd have "gotcha" queries where you need to remember to specify `WHERE trace.parent_id does-not-exist`.
Still, you could easily imagine manually incrementing a counter on the trace as spans get generated, then dropping an `add_field "trace.size", trace.size if root?` in `Honeycomb::Span#add_additional_fields`.
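That counter approach can also be sketched self-contained. `CountingTrace` and `CountingSpan` below are toy analogues; only `add_additional_fields` mirrors a real beeline method name, everything else is hypothetical:

```ruby
# Toy sketch of a trace-wide counter emitted only on the root span.
class CountingTrace
  attr_accessor :size

  def initialize
    @size = 0
  end
end

class CountingSpan
  def initialize(trace, parent = nil)
    @trace = trace
    @parent = parent
    @fields = {}
    @trace.size += 1  # every span bumps the trace-wide counter
  end

  def root?
    @parent.nil?
  end

  def add_field(key, value)
    @fields[key] = value
  end

  # Analogue of Honeycomb::Span#add_additional_fields, called just
  # before a span is enqueued for sending.
  def add_additional_fields
    add_field "trace.size", @trace.size if root?
    @fields
  end
end

trace = CountingTrace.new
root = CountingSpan.new(trace)
3.times { CountingSpan.new(trace, root) }
root.add_additional_fields  # => {"trace.size" => 4}
```

This relies on the root span finishing, and therefore being sent, after all of its children (the common in-process case), so the counter is complete by the time the root adds the field.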
Concerns
Consistency across beelines: It isn't ideal to add a new "Honeycomb-owned" field like `trace.size` if it's only available in one language's beeline. You'd also want beelines' implementations to agree with each other with respect to the other concerns noted below.
Distributed tracing: Upstream services could propagate their running trace size to downstream services. But how does the downstream service propagate its subtree trace size back up to the upstream's root? Doesn't seem possible to me with how distributed tracing works right now.[2]
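For the downstream half, one could imagine carrying the running size in the trace context. The sketch below mimics the shape of the `X-Honeycomb-Trace` v1 header (a version prefix plus `trace_id`, `parent_id`, and a Base64-encoded JSON context); the `trace.size` context key is a hypothetical extension, and nothing here solves the problem of reporting a subtree's size back upstream:

```ruby
require "base64"
require "json"

# Hypothetical propagation of a running trace size downstream, using
# a header layout modeled on X-Honeycomb-Trace v1.
def encode_trace_header(trace_id:, parent_id:, size:)
  context = Base64.urlsafe_encode64(JSON.generate("trace.size" => size))
  "1;trace_id=#{trace_id},parent_id=#{parent_id},context=#{context}"
end

def decode_trace_header(header)
  _version, payload = header.split(";", 2)
  fields = payload.split(",").to_h { |kv| kv.split("=", 2) }
  fields["context"] = JSON.parse(Base64.urlsafe_decode64(fields["context"]))
  fields
end

header = encode_trace_header(trace_id: "abc", parent_id: "def", size: 7)
decode_trace_header(header)["context"]["trace.size"]  # => 7
```

The downstream service can keep counting from 7, but there's no return channel in this scheme: the upstream root has already been sent by the time the downstream subtree knows its own size.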
Sampling: The way I threw the rollup field into the initializer before doesn't account for whether we drop the span.[3] We could just as well increment the count right before we send a presampled event. But would we want to count the size irrespective of sampling or the size as actually sent?
Subtrees: Does the `trace.size` only apply to the root? Or would it be useful to track every subtree's size? This wouldn't be too hard to implement, but it'd probably make queries more finicky. I could see an argument for slicing & dicing, though. "How big are my traces nested under operation xyz?"
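Per-subtree sizes would just be the recursive version of the same count. A toy illustration (the structures and names here are purely hypothetical):

```ruby
# Each span's subtree size is itself plus all of its descendants, so
# the root's subtree size equals the whole trace.size.
Node = Struct.new(:name, :children) do
  def subtree_size
    1 + children.sum(&:subtree_size)
  end
end

xyz  = Node.new("xyz",  [Node.new("db", []), Node.new("cache", [])])
root = Node.new("root", [xyz, Node.new("render", [])])

root.subtree_size  # => 5
xyz.subtree_size   # => 3  ("how big are traces nested under xyz?")
```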
Span events & links: These aren't currently supported by the Ruby beeline (cf. Span Events #66 & Links #68), but they do increment your event count. If/when they do get supported, they should count toward the trace size calculation.
Footnotes
1. This isn't quite true, since you could set different sample rates for different events in one trace. E.g., this happens in the forem/forem sampler discussed in a recent HoneyByte.
2. It's kind of interesting to consider that distributed tracing headers work unidirectionally: upstream propagates to downstream. I wonder what other functionality a bidirectional protocol could open up?
3. Depending on the sample hook's implementation, we don't necessarily send every span of a trace. Even with deterministic sampling, we could have a case like footnote 1. Moreover, the sample hook is under no obligation to use the deterministic sampler.
We will be closing this issue as it is a low priority for us. It is unlikely that we'll ever get to it, and so we'd like to set expectations accordingly.
As we enter 2022 Q1, we are trimming our OSS backlog. This is so that we can focus better on areas that are more aligned with the OpenTelemetry-focused direction of telemetry ingest for Honeycomb.
If this issue is important to you, please feel free to ping here and we can discuss/re-open.