This repository has been archived by the owner on Dec 20, 2021. It is now read-only.

Project is being sunset #36

Closed
vreynolds opened this issue Aug 13, 2021 · 18 comments
Labels
type: discussion Requests for comments, discussions about possible enhancements.

Comments

@vreynolds
Contributor

See OSS lifecycle and practices

This project has outlived its useful lifespan, and will be archived in the future. Honeycomb supports OTLP ingest directly, which renders this exporter obsolete.

If you are actively using this exporter, please comment on this issue with your use case.

@vreynolds vreynolds pinned this issue Aug 13, 2021
@vreynolds vreynolds added the type: discussion Requests for comments, discussions about possible enhancements. label Aug 13, 2021
@ajbouh

ajbouh commented Aug 14, 2021

From what I can tell, OTLP ingest assumes that all spans should go into the same dataset.

This is not what I want. I'd like to use a single CloudWatch subscription to upload all the events generated by my AWS Lambda functions, but I don't want all my Lambda functions to be emitting events to the same dataset.

@plinehan

Our cluster's egress proxy doesn't support gRPC / HTTP/2 so we've been using this exporter instead.

@MikeGoldsmith
Contributor

MikeGoldsmith commented Aug 17, 2021

From what I can tell, OTLP ingest assumes that all spans should go into the same dataset.

This is not what I want. I'd like to use a single CloudWatch subscription to upload all the events generated by my AWS Lambda functions, but I don't want all my Lambda functions to be emitting events to the same dataset.

Hi @ajbouh
You're correct: a single OTLP exporter can only export to a single dataset, as the dataset is set in the metadata headers. Similarly, this exporter only allows a single dataset, set via a constructor parameter or environment variable.

How are you separating your traffic into different datasets now?
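For illustration, here's a minimal sketch of routing traffic from different Lambda functions to different datasets by choosing the `x-honeycomb-dataset` metadata header per function. The function names, mapping, and helper below are made up for the example, not part of this exporter's API:

```python
# Hypothetical sketch: pick a Honeycomb dataset per Lambda function,
# then build the OTLP metadata headers Honeycomb's ingest expects.
# The mapping and function names here are illustrative only.

DATASET_BY_FUNCTION = {
    "checkout-handler": "checkout",
    "billing-handler": "billing",
}
DEFAULT_DATASET = "lambda-misc"

def headers_for(function_name: str, api_key: str) -> dict:
    """Return OTLP metadata headers routing this function's spans."""
    dataset = DATASET_BY_FUNCTION.get(function_name, DEFAULT_DATASET)
    return {
        "x-honeycomb-team": api_key,
        "x-honeycomb-dataset": dataset,
    }
```

Each exporter instance would then be constructed with the headers for its own function, which is why a single shared exporter can't fan out to multiple datasets.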

@MikeGoldsmith
Contributor

Our cluster's egress proxy doesn't support gRPC / HTTP/2 so we've been using this exporter instead.

Hi @plinehan
You're correct that the current version of the OTLP exporter doesn't support HTTP. However, this PR was recently merged to add support for OTLP over HTTP, and it will be included in the next release.

@plinehan

Sick! Thanks @MikeGoldsmith!

Looking forward to using that instead.

@ajbouh

ajbouh commented Aug 18, 2021 via email

@MikeGoldsmith
Contributor

I am using a slightly modified version of this exporter that adds the dataset and prints it to stdout

@ajbouh can you share what changes you've made to get this to work for you? Did you make the changes in a fork?

@MikeGoldsmith
Contributor

MikeGoldsmith commented Aug 18, 2021

Also, @ajbouh - we have alternative ways to export AWS CloudWatch logs.

@ajbouh

ajbouh commented Aug 18, 2021

I'm using the agentless CloudWatch integration to relay events that are printed to stdout in my Lambda.

But I still need an exporter to print events out in the proper format.
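A minimal sketch of that pattern, assuming the integration parses one flat JSON object per log line (check the agentless integration's docs for the exact shape it expects; the field names below are just examples):

```python
import json
import sys

def emit_event(fields, stream=sys.stdout):
    """Write one event as a single JSON line so CloudWatch captures it.
    The flat-JSON-per-line shape is an assumption about what the
    agentless integration parses, not a documented contract here."""
    stream.write(json.dumps(fields) + "\n")

# Example: a hand-rolled span event for a Lambda invocation.
emit_event({"name": "GET /orders", "duration_ms": 12.5, "trace.trace_id": "abc123"})
```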

@ajbouh

ajbouh commented Aug 18, 2021

I opened up a separate issue describing the changes I needed to make to get this exporter working with the agentless CloudWatch integration: #39

@JamieDanielson
Contributor

Hi @ajbouh,

We have heard from others as well about the need to route data to different datasets. We understand the request for using multiple datasets, though as you know this project does not handle that. If this is a change we are able to make, it will happen elsewhere, not in this project. If you have a working fork that achieves what you need today, that is probably the best thing to hold onto for now, as we do not intend to make any further changes to this repo.

Out of curiosity, could you explain more about your use case to help us better understand how we may consider these changes in a different context?

@ajbouh

ajbouh commented Aug 19, 2021

Understood, though I believe this project is the canonical source of the mapping between the OpenTelemetry span schema and the Honeycomb event schema.

That mapping is an important one to make sure is correct and up to date. Unofficial forks run some risk of being wrong and/or falling behind.

@JamieDanielson
Contributor

The canonical mapping is part of the API now as part of our OTLP ingest. Additional information that may be helpful for clarification can be found in our docs for Sending Trace Data. This does bring up a point that we may want to consider more specifically outlining schema mapping in our documentation. This project is just an implementation of that schema mapping though, not a source.

@ajbouh

ajbouh commented Aug 20, 2021

Beyond what's in those docs, this repository also seems to populate:

  • response.status_code (I believe the opentelemetry spec uses http.status_code)
  • status.message

Are these fields missing from the current docs?

@vreynolds vreynolds self-assigned this Aug 23, 2021
@JamieDanielson
Contributor

It sounds like there may be some confusion about the flexibility around what Honeycomb will ingest and include in your data. Honeycomb will take in any fields that are sent, even those that don’t exist in any spec but that are useful to you. For example, you can have a field “foo” with a value of “bar” and that will show up in your raw data and can be queried against.

Regarding the specific fields you mention, response.status_code and status.message: in your dataset settings, you can set definitions if a field is named differently from what is being sent in. Default field names are noted in Configuring Home.

This exporter specifically defines the response.status_code field not as http.status_code but as the span status_code from the trace proto definition. The status_message is the status description, which is used when there is an error status code, if a message is provided by the instrumentation library. Honeycomb will attempt to update the dataset definition based on what it receives, but the field can be named anything and set as the Field name in the Dataset settings. http.status_code is a semantic convention used by instrumentation libraries and users, and it will be passed through in OTLP ingest.
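A rough sketch of the mapping described above, with a hypothetical helper name (the exporter's actual source is the authority here): the span's proto status code lands in response.status_code, and the status description, when the instrumentation provides one, lands in status.message.

```python
from typing import Optional

def map_status(status_code: int, description: Optional[str]) -> dict:
    """Illustrative mapping from an OTLP span status to Honeycomb
    event fields, per the discussion above. Helper name is made up;
    it is not part of this exporter's API."""
    fields = {"response.status_code": status_code}
    if description:
        # Only present when the instrumentation library set a message,
        # typically alongside an error status code.
        fields["status.message"] = description
    return fields
```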

@ajbouh

ajbouh commented Aug 26, 2021

Thanks for laying all that out. I am glad that Honeycomb is generally agnostic to the choice of field names.

The "one configuration per dataset" approach that Honeycomb takes seems to work best when all the instrumentation has been done by a single team/org that uses a well-specified and homogeneous approach to event fields.

In our case, some spans are emitted by third-party instrumentation and others by our own instrumentation logic. This means that a single concept (like an HTTP response) can have multiple names, and it feels like there is no single configuration that can be applied to the entire dataset.

IMO, having more guidance from Honeycomb about which field names/meanings should be used would help us get the most from our instrumentation efforts.

@JamieDanielson
Contributor

It seems like this conversation has strayed from the original intent of the issue discussion.

Can we move this over to the Honeycomb Pollinators Slack, maybe in the #discuss-opentelemetry channel? That would also be a better place to get feedback from others in the community.

@MikeGoldsmith
Contributor

Closing as per #42
