Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new components to allow for generating metrics from 100% of spans without impacting sampling #802

Merged
merged 2 commits into from
May 15, 2023

Conversation

thpierce
Copy link
Contributor

@thpierce thpierce commented Mar 30, 2023

Description:

We wish to generate key metrics from OTEL auto-instrumented spans. I had previously engaged the community to discuss this, and after some conversations, decided that the best approach would be to implement a Sampler and a Span Processor to the contrib repositories.

The major requirements for our project are as follows:

  • Metrics should be produced from 100% of Client/Server/Producer/Consumer spans produced by OTEL Auto-Instrumentation, regardless of sampling decision
    • Span sampling decisions should still be respected w.r.t. exporting spans to the collector and propagating sampling decisions to child spans
  • Metric generation must be done when the span is ended, as metric attributes are derived from some span attributes that are not always available when the span is started (e.g. http.route).
  • Metric attributes need to be added to the span attributes before export to Collector, to be used to correlate metrics and spans in a downstream process.

To meet these goals, we have developed components which we wish to contribute to the OTEL community:

  • AlwaysRecordSampler.
    • This is an aggregate Sampler that is initialized with a root Sampler (akin to ParentBasedSampler).
    • shouldSample() will call the root sampler and, if the SampleDecision is RECORD_AND_SAMPLE or RECORD_ONLY, it simply returns the result. However, if the SampleDecision is DROP, it will instead create a new result with SampleDecision set to RECORD_ONLY (all other result behaviour unchanged).
    • This Sampler will ensure that 100% of spans get send to the SpanProcessor, without changing the actual Sampling rate/behaviour.
  • MetricAttributeGenerator
    • A generic interface for components that consume spans and resources and export attributes for metrics generated from these spans.
    • Implemented by AwsMetricAttributeGenerator to produce metric attributes specific to our usecase.
  • AwsSpanMetricsProcessor.
    • This SpanProcessor will create desired metrics and embed attributes into processed spans before sending them to the downstream SpanProcessor (for additional processing/export).
    • The AwsSpanMetricsProcessor is initialized (via AwsSpanMetricsProcessorBuilder) with metering instruments, a MetricAttributeGenerator and a resource.
    • All APIs are uninteresting except onEnd(), which will:
      • Determine attributes for metrics (using the provided MetricAttributeGenerator)
      • Emit metrics with attributes (with the provided metering instruments)
  • AwsMetricAttributesSpanExporter.
    • This SpanExporter will embed desired metric attributes into processed spans before sending them to the downstream SpanExporter for export.
    • The AwsMetricAttributesSpanExporter is initialized (via AwsMetricAttributesSpanExporterBuilder) with a MetricAttributeGenerator, a Resource, and a delegate SpanExporter.
    • All APIs are uninteresting except export(), which will:
      • Determine attributes to embed(using the provided MetricAttributeGenerator)
      • Replace the span with a new SpanData identical to the original span, but with metric attributes embedded into span attributes.
      • Pass the spans to the delegate SpanExporter's export method.

Existing Issue(s):

#789

Testing:

  • Unit tests added and run successfully
  • ./gradlew assemble Passes
  • ./gradlew spotlessCheck Passes
  • E2E testing done to verify emitted metrics align with expectations.

Documentation:

These are independent components that will require configuration in the TracerProvider. We will create a document to describe the usage once it is ready for distribution.

Outstanding items:

  • We may change in the details of the metric attributes created by AwsMetricAttributeGenerator in future, but for now these are sufficient.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Mar 30, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: bjrara / name: Mengyi Zhou (bjrara) (241d1ab)
  • ✅ login: thpierce / name: Thomas Pierce (6a49d3f)

@thpierce thpierce marked this pull request as ready for review April 28, 2023 16:47
@thpierce thpierce requested review from a team and srprash April 28, 2023 16:47
@thpierce thpierce marked this pull request as draft May 2, 2023 21:22
… without impacting sampling

In this commit, we are adding several new components to the awsxray package:
* AlwaysRecordSampler - A simple aggregate sampler that always ensures spans are "recorded" (i.e. sent to span processors). This allows us to generate metrics from 100% of spans without impacting true sampling rate.
* AwsSpanMetricsProcessor - A span processor that will generate specific metrics pertaining to latency, faults, and errors. Relies on a MetricAttributeGenerator to build attributes for the metrics.
* AwsMetricAttributesSpanExporter - A span exporter that relies on MetricAttributeGenerator to wraps metric attributes around the span attributes before passing the span to a delegate span exporter for export.
* MetricAttributeGenerator - A generic interface for components that consumes spans and resources and produces attributes for metrics generated from these spans.
* AwsMetricAttributeGenerator - A specific implementation of MetricAttributeGenerator, used for generating AWS-specific attributes.
* AwsSpanMetricsProcessorBuilder - A builder class for AwsSpanMetricsProcessor.
* AwsMetricAttributesSpanExporterBuilder - A builder class for AwsMetricAttributesSpanExporter.

Related issue: open-telemetry#789
@thpierce thpierce marked this pull request as ready for review May 10, 2023 21:55
Copy link

@srprash srprash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of small comments. Otherwise LGTM.

@thpierce
Copy link
Contributor Author

@trask - PR is ready for your review.

Comment on lines +181 to +182
if (isKeyPresent(span, PEER_SERVICE) && !isKeyPresent(span, AWS_REMOTE_SERVICE)) {
setRemoteService(span, builder, PEER_SERVICE);
Copy link
Member

@mxiamxia mxiamxia May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a callout but not a blocker - is it expected that we have added the log entry by logUnknownAttribute(AWS_REMOTE_SERVICE, span); but the RemoteService is actually set properly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I see what you mean, it's a little bit odd. I think for now it's ok. We plan to revise a bit of this logic in future, and it's not totally unreasonable for the time being. We don't expect PEER_SERVICE to be frequently applied. Will make a note for the future.

Copy link
Member

@mxiamxia mxiamxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ty!

Copy link
Member

@mxiamxia mxiamxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ty!

Copy link
Member

@mxiamxia mxiamxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ty!

@mxiamxia
Copy link
Member

Hi @trask , just kindly check if you can help to take a look this PR.

@trask trask merged commit fbf8304 into open-telemetry:main May 15, 2023
dmitryax pushed a commit to open-telemetry/opentelemetry-collector-contrib that referenced this pull request May 31, 2023
… as Segment name (#22835)

XRayExporter will be extended to support the new AWS specific service name attributes changed in open-telemetry/opentelemetry-java-contrib#802 for X-Ray trace segment name.

The new Span attribute aws.local.service will be set as Server kind segment name, and aws.remote.service attribute will set as Client kind subsegment name if the attribute exists.
lisguo pushed a commit to lisguo/opentelemetry-collector-contrib that referenced this pull request Jun 19, 2023
… as Segment name (open-telemetry#22835)

XRayExporter will be extended to support the new AWS specific service name attributes changed in open-telemetry/opentelemetry-java-contrib#802 for X-Ray trace segment name.

The new Span attribute aws.local.service will be set as Server kind segment name, and aws.remote.service attribute will set as Client kind subsegment name if the attribute exists.
Caleb-Hurshman pushed a commit to observIQ/opentelemetry-collector-contrib that referenced this pull request Jul 6, 2023
… as Segment name (open-telemetry#22835)

XRayExporter will be extended to support the new AWS specific service name attributes changed in open-telemetry/opentelemetry-java-contrib#802 for X-Ray trace segment name.

The new Span attribute aws.local.service will be set as Server kind segment name, and aws.remote.service attribute will set as Client kind subsegment name if the attribute exists.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants