Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial Entity and Resource proposal. #264

Open
wants to merge 51 commits into
base: main
Choose a base branch
from

Conversation

jsuereth
Copy link
Contributor

@jsuereth jsuereth commented Aug 13, 2024

This is a proposal to address Resource and Entity data model interactions, including a path forward to address immediate friction and issues in the current resource specification.

The proposal includes all links and context needed to justify it, but duplicating a snapshot here:

Motivation

This proposal attempts to focus on the following problems within OpenTelemetry to unblock multiple working groups:

  • Allowing mutating attributes to participate in Resource (OTEP 208).
  • Allow Resource to handle entities whose lifetimes don't match the SDK's lifetime (OTEP 208).
  • Provide support for async resource lookup (spec#952).
  • Fix current Resource merge rules in the specification, which most implementations violate (oteps#208, spec#3382, spec#3710).
  • Allow semantic convention resource modeling to progress (spec#605, spec#559, etc).

@jsuereth jsuereth marked this pull request as ready for review August 13, 2024 22:20
@jsuereth jsuereth requested a review from a team August 13, 2024 22:20
Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the draft @jsuereth !

This is an important next step for entities. I left some comments, but they don't prevent us from making progress.

text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
@austinlparker
Copy link
Member

A question for @jsuereth (and perhaps this has been addressed in WG meetings, if so, feel free to point me to those discussions) but my read of this proposal is that there is some future state where resource attributes are all contained inside entities, rather than flat, thus changing the 'logical' top-level resource attributes into complex attributes rather than flat ones. This is not a pattern that existing tools really use; Is the intention that these attributes should be flattened on ingest in order to allow traditional group-by/analysis functions and the entity definition ingested as some sort of hash (or other platform-appropriate representation)?

@bboreham
Copy link

Drive-by comment: I expected the top-level description to reference https://github.com/open-telemetry/oteps/blob/main/text/entities/0256-entities-data-model.md

Copy link
Member

@jack-berg jack-berg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any prototype implementation of this? It sounds reasonable to me but is pretty abstract.

text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Show resolved Hide resolved
@jsuereth
Copy link
Contributor Author

A question for @jsuereth (and perhaps this has been addressed in WG meetings, if so, feel free to point me to those discussions) but my read of this proposal is that there is some future state where resource attributes are all contained inside entities, rather than flat, thus changing the 'logical' top-level resource attributes into complex attributes rather than flat ones. This is not a pattern that existing tools really use; Is the intention that these attributes should be flattened on ingest in order to allow traditional group-by/analysis functions and the entity definition ingested as some sort of hash (or other platform-appropriate representation)?

The goal of this is that tooling can continue to engage with the flattened resource attribtue model we provide today. optionally tooling can engage with the new bundled entity definitions (and OTEL itself, e.g. the collector, would do so).

This gives flexibility for users to engage or not engage with entities, depending on their need, and opens up further exploration of entities.

@tigrannajaryan
Copy link
Member

Is there any prototype implementation of this? It sounds reasonable to me but is pretty abstract.

@jack-berg There are early prototypes in Go and in Collector (core and contrib). The prototypes somewhat deviate from this proposal (e.g. the prototype only allows one associated EntityRef in the Resource, it uses key/value pairs, not key arrays, etc). Despite some deviation I think the prototypes demonstrate the idea (and we can update the prototypes to align with this proposal).

@aknuds1
Copy link

aknuds1 commented Sep 17, 2024

As proposed - we can create new entity-specific info metrics with descriptive attributes. I'd love to see @aknuds1 confirm, the last time I could check, these info metrics would work with the proposed info PromQL support exactly the way we want them to.

@jsuereth Sorry I'm late in reviewing this, but reading the PR now.

I should perhaps clarify the proposed MVP version of the info PromQL function only works with target_info (by default) or other info metrics (can be specified through a __name__ label matcher) that are identified through the instance and job labels. For info to be more intelligent about which info metrics exist, and which are their identifying labels, I think native metadata support first has to be added to Prometheus (that's something @codesome and I are trying to design).

@jsuereth
Copy link
Contributor Author

jsuereth commented Sep 17, 2024

Edit: adding CNCF thread on this: https://cloud-native.slack.com/archives/C01LSCJBXDZ/p1726584544475279

@aknuds1 - We should discuss more. My assumption on OTLP<->Prometheus here is that we can use identifying attributes to synthesize a job/instance label rather than forcing prometheus to understand all of these other labels.

Then, you can use info to interact with target_info (by default, has all the resource attributes) and/or a possible per-entity info metric.

We should continue this discussion somewhere higher bandwidth, cncf slack otel-prometheus channel?

Copy link

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see question regarding Prometheus compatibility.

text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved

Given our desired design and algorithms for detecting, merging and manipulating Entities, we need the ability to denote how entity and resource relate. These changes must not break existing usage of Resource, therefore:

- The Entity model must be *layered on top of* the Resource model. A system does not need to interact with entities for correct behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the opt-in nature here conflict with the goals of unblocking the pieces you mention at the start? I'm wondering specifically about some components in a pipeline using the entities while others are only using resource. In such a case, I would expect that two entities could appear in the pipeline because now we have two areas of reference for the same concept.

If entiies becomes a new scope, should this eventually seek to replace resource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think entities will eventually replace existing resource usage. The reason we're layering is to not break the entire ecosystem and effectively create a V2 of the protocol. Instead, we see this as an evolution. If you want to engage with Entities, you would NEED to ensure all components are entity aware. As soon as one component is not, you're back to today's behavior. So users who want to use entities MUST ensure all components leverage entities.

However - there's a higher level question. Do I need to understand entities to interact with Metrics/Logs/Spans/etc. from OpenTelemetry? The answer to that is no. You can blindly leverage Resource.attributes as important slices of information for the data, and you will always be able to continue doing that. That's addressing consumers / storage solutions.

Copy link

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay reviewing this! It's a great read and I think that a clear separation of identifying/descriptive attributes is a big step forward for better compatibility with Prometheus.

I'd like to propose something different though :)

text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
- If the entity identity is different: drop the new entity `d'`.
- Otherwise, add the entity `d'` to set `E`
- Construct a Resource from the set `E`.
- If all entities within `E` have the same `schema_url`, set the resource's
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the resource attributes? Are they also dependent on the entities set E?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

You construct a Resource using this entity set.

  • The Resource.attributes field contains alL identifying and descriptive attributes from the entity set.
  • The Resource.schema_url field is set to the schema_url shared by all entities (if they all have the same one) or unset if they do not have the same schema_url.
  • The Resource.entities field would contain the EntityRef proto, which just contains key names and the entity type for the entity set.

@carlosalberto
Copy link
Contributor

Overall LGTM, although I'd like to know about details going forward, to set expectations on how the SDKs will be updated:

  • Will Resource stay immutable (with the addition of the new entities field), and will be internally updated by the SDK with the latest instance tracked? If not, will attributes stay as read-only at least, in a best-effort attempt?
  • The merge algorithm mentions schema_url as a comparison point between entities of the same type, and then one of the examples mentions having the same schema_url with different schema version. Is this something to be tuned/clarified later on, regarding on how to merge values in such case?

Copy link

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks (see suggested change)!

text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
text/entities/0264-resource-and-entities.md Outdated Show resolved Hide resolved
Here's a list of requirements for the solution:

- Existing prometheus/OpenTelemetry users should be able to migrate from where they are today.
- Any solution MUST work with the [info-typed metrics](https://github.com/prometheus/proposals/blob/main/proposals/2024-04-10-native-support-for-info-metrics-metadata.md#goals) being added in prometheus.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aknuds1 since the proposal is being referenced in other places, like this OTEP, I wonder if we should update the proposal with what is being implemented here. 🤔

- Resource identifying attributes need more thought/design from OpenTelemetry semconv + entities WG.
- Note: Current `info()` design will only work with `target_info` metric, and `job/instance` labels for joins. This labels MUST be generated by the OTLP endpoint in prometheus.
- (desired) Users should be able to correlate metric timeseries to other signals via Resource attributes showing up as labels in prometheus.
- (desired) Conversion from `OTLP -> prometheus` can be reversed such that `OTLP -> Prometheus -> OTLP` is non-lossy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • (desired) Conversion from OTLP -> prometheus can be reversed such that OTLP -> Prometheus -> OTLP is non-lossy.

I'm bringing that up to the Prometheus team. We've recently decided that Prometheus won't support exporting OTLP directly, but we're working on Prometheus Remote Write Receiver in OTel collector and that's our plan to transform Prometheus metrics into OTLP format, would that be ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - to me it's the same domain modelling issue

Copy link

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see this idea moving forward. With clear distinction of identifying attributes I believe we can design a better UX for Prometheus as a OTel metrics backend :)

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
@jsuereth
Copy link
Contributor Author

Thanks @carlosalberto !

Will Resource stay immutable (with the addition of the new entities field), and will be internally updated by the SDK with the latest instance tracked? If not, will attributes stay as read-only at least, in a best-effort attempt?

I think a follow on OTEP will define the needs and design for "mutable" resource. This OTEP calls out that users want descriptive/mutable attributes on Resource. I expect the ResourceProvider concept to be expanded on to provide that (cc @tedsuo who I think is working on that proposal, or updating the existing one).

For now - this proposal has resource remain read-only and defines a merge algorithm for a ResourceProvider, but with identifying "descriptive" (or changing) attributes as a first step. This does not solve the needs of browser until the follow on OTEP is complete and approved.

The merge algorithm mentions schema_url as a comparison point between entities of the same type, and then one of the examples mentions having the same schema_url with different schema version. Is this something to be tuned/clarified later on, regarding on how to merge values in such case?

We clarify schema url versions in the specificaiton already - https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/schemas#schema-url

So we already have a way to understand version and whether or not they match. We do not have any notion of "compatibility" encoded in version but we can check for equality and order them (to know which is latest).

Copy link

@annabokhan annabokhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking, follow ups noted

@jsuereth
Copy link
Contributor Author

Thanks @annabokhan !

Co-authored-by: David Ashpole <dashpole@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.