Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Exemplar support to Metrics proto #159

Merged
merged 5 commits into from
Aug 13, 2020
Merged
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 47 additions & 0 deletions opentelemetry/proto/metrics/v1/metrics.proto
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,37 @@ message InstrumentationLibraryMetrics {
repeated Metric metrics = 2;
}

// A representation of raw measurements, which can have statistical meaning based
// on how the measurement was sampled, as well as a span ID and trace ID of the
// active span when the measurement was recorded
message RawValue {
bogdandrutu marked this conversation as resolved.
Show resolved Hide resolved
// The set of labels that were dropped by the aggregator, but recorded
// alongside the original measurement. Only labels that were dropped by the aggregator should be included
repeated opentelemetry.proto.common.v1.StringKeyValue labels = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the relationship between these labels and the labels in the DataPoint:

  • Do we duplicate them?
  • Do we extract these labels and the actual set of labels is the combination of these + datapoint.lables?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For exemplars these labels are only the labels not included in the DataPoint's labels. I would change this field to be dropped_labels but if RawValue is used as a data point itself the labels would include all labels

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmacd I know you tried to share the messages, can you help define the behavior here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you recommended calling these "dropped_labels". This sounds good to me.


// time_unix_nano is the exact time when this RawValue was recorded
//
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
fixed64 time_unix_nano = 2;
bogdandrutu marked this conversation as resolved.
Show resolved Hide resolved

// Numerical value of the measurement that was recorded. Only one of these
// two fields is used for the data, based on MetricDescriptor.measurement_value_type
double double_value = 3;
int64 int64_value = 4;

// (Optional) Span ID of the current trace.
// span_id may be missing if the measurement is not recorded inside a trace or if the trace is not sampled.
bytes span_id = 5;
bogdandrutu marked this conversation as resolved.
Show resolved Hide resolved

// (Optional) Trace ID of the current trace.
// trace_id may be missing if the measurement is not recorded inside a trace or if the trace is not sampled.
bytes trace_id = 6;

// (Optional) When sample_count is non-zero, this exemplar has been chosen in a statistically
// unbiased way such that the exemplar is representative of `sample_count` individual events
double sample_count = 7;
}

// Defines a Metric which has one or more timeseries.
//
// The data model and relation between entities is shown in the
Expand Down Expand Up @@ -352,6 +383,10 @@ message Int64DataPoint {

// value itself.
int64 value = 4;

// (Optional) List of exemplars collected from
// measurements that were used to form the data point
repeated RawValue exemplars = 5;
}

// DoubleDataPoint is a single data point in a timeseries that describes the time-varying
Expand Down Expand Up @@ -382,6 +417,10 @@ message DoubleDataPoint {

// value itself.
double value = 4;

// (Optional) List of exemplars collected from
// measurements that were used to form the data point
repeated RawValue exemplars = 5;
}

// HistogramDataPoint is a single data point in a timeseries that describes the time-varying
Expand Down Expand Up @@ -452,6 +491,10 @@ message HistogramDataPoint {
// If we decide to also support (a, b] intervals we should add support for these by defining
// a boolean value which decides what type of intervals to use.
repeated double explicit_bounds = 7;

// (Optional) List of exemplars collected from
// measurements that were used to form the data point
repeated RawValue exemplars = 8;
cnnradams marked this conversation as resolved.
Show resolved Hide resolved
}

// SummaryDataPoint is a single data point in a timeseries that describes the time-varying
Expand Down Expand Up @@ -508,4 +551,8 @@ message SummaryDataPoint {
// A list of values at different quantiles of the distribution calculated
// from the current snapshot. The quantiles must be strictly increasing.
repeated ValueAtQuantile quantile_values = 6;

// (Optional) List of exemplars collected from
// measurements that were used to form the data point
repeated RawValue exemplars = 7;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this just be a list of random samples from the whole window? Open question in the OTEP:

We don’t have a strong grasp on how the sketch aggregator works in terms of implementation - so we don’t have enough information to design how exemplars should work properly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proto does not define how the exemplars were sampled, not sure your question?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm, I see now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the recent OTLP discussion meeting, we agreed to remove the sample_count field from the current proposal. We also agreed to move the Exemplars into the DataPoints so that they can refer to dropped labels, not include full label sets in each point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu does this sound right to you, for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also agreed to move the Exemplars into the DataPoints so that they can refer to dropped labels, not include full label sets in each point.

I think we agreed that you will evaluate what is better for performance/semantics:

  1. Having a repeated RawValue exemplars in the Metric that applies to all data points (user may need to do another remapping to every data point) vs repeated RawValue exemplars in every point (if we go with every point then "dropped_labels" is better name for that).

My point is that I don't have a strong opinion between the both, and was trying to make you investigate and decide which way. Here are my thoughts:

  • Having repeated RawValue exemplars in the Metrics:
    • Pros:
      • Saves some memory in the internal representation (have extra 24 bytes per data point).
      • Same message may be able to be re-used with raw-measurements because labels don't represent dropped.
    • Cons:
      • Duplicate labels on the wire.
      • User needs to re-map every exemplar to the data point by doing the labels matching.

I feel cons are more "significant" than pros, so personally I would go with exemplars in every DataPoint as you suggested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go with exemplars in every DataPoint I would say to rename the message to "Exemplar" :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This discussion makes me want a way to intern label sets to avoid the "re-map every exemplar" problem.

I don't feel inclined to invest time in this now, so we should probably choose "repeated RawValue exemplars in every point".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This discussion makes me want a way to intern label sets to avoid the "re-map every exemplar" problem.

Even with an "intern label" you still need to map every exemplar to a point (that mapping may be faster if we have "intern label" but still needs some work)

}