From c819f1795ebc23cf6da42481eccd320b9daab4ce Mon Sep 17 00:00:00 2001
From: Bogdan Drutu
Date: Mon, 18 May 2020 16:09:55 -0700
Subject: [PATCH] Add requirements for probability sampler

Signed-off-by: Bogdan Drutu
---
 specification/trace/sdk.md | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/specification/trace/sdk.md b/specification/trace/sdk.md
index 9531c50bd3f..a4290c77405 100644
--- a/specification/trace/sdk.md
+++ b/specification/trace/sdk.md
@@ -90,7 +90,7 @@ It produces an output called `SamplingResult` which contains:
 
 Returns the sampler name or short description with the configuration. This may
 be displayed on debug pages or in the logs. Example:
-`"ProbabilitySampler{0.000100}"`.
+`"TraceIdRatioBased{0.000100}"`.
 
 Description MUST NOT change over time and caller can cache the returned value.
 
@@ -109,16 +109,29 @@ The default sampler is `ParentBased(root=AlwaysOn)`.
 * Returns `NOT_RECORD` always.
 * Description MUST be `AlwaysOffSampler`.
 
-#### Probability
+#### TraceIdRatioBased
 
-* The `ProbabilitySampler` MUST ignore the parent. To respect the parent
-`SampledFlag`, the `ProbabilitySampler` should be used as a delegate of the
-`ParentBased` sampler specified below.
-* Description MUST be `ProbabilitySampler{0.000100}`.
+* The `TraceIdRatioBased` MUST ignore the parent `SampledFlag`. To respect the
+parent `SampledFlag`, the `TraceIdRatioBased` should be used as a delegate of
+the `ParentBased` sampler specified below.
+* Description MUST be `TraceIdRatioBased{0.000100}`.
 
-TODO: Add details about how the `ProbabilitySampler` is implemented as a function
+TODO: Add details about how the `TraceIdRatioBased` is implemented as a function
 of the `TraceID`.
 
+##### Requirements for `TraceIdRatioBased` sampler algorithm
+
+* The sampling algorithm MUST be deterministic. A trace identified by a given
+`TraceId` is sampled or not independent of language, time, etc. To achieve this,
+implementations MUST use a deterministic hash of the `TraceId` when computing
+the sampling decision. By ensuring this, running the sampler on any child `Span`
+will produce the same decision.
+* A `TraceIdRatioBased` sampler with a given sampling rate MUST also sample all
+traces that any `TraceIdRatioBased` sampler with a lower sampling rate would
+sample. This is important when a backend system may want to run with a higher
+sampling rate than the frontend system, this way all frontend traces will
+still be sampled and extra traces will be sampled on the backend only.
+
 #### ParentBased
 
 * This is a composite sampler. `ParentBased` helps distinguished between the