Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot get HTTP Context Propagation to work #4247

Closed
tglanz opened this issue Nov 3, 2023 · 19 comments
Closed

Cannot get HTTP Context Propagation to work #4247

tglanz opened this issue Nov 3, 2023 · 19 comments
Assignees
Labels
bug Something isn't working triage

Comments

@tglanz
Copy link

tglanz commented Nov 3, 2023

What happened?

Hi, I have set up OTEL tracing but can't get context propagation to work.

In my setup I have a web app that interacts with an express service that calls another express service. The services export traces to using Grpc a local Collector and are then exported to Jaeger. I can see the spans are exported fine, contain information from the http and express instrumentation libraries but context hasn't been propagated - The relevant spans don't have a parent and the relevant HTTP requests don't contain the HTTP Headers that are needed.

The example project is here: https://github.com/tglanz/simple-tracing. The components are:

Web --> Die Roller --> Randomizer

The OTEL is being set up automatically by running node with some relevant OTEL env vars and specifically NODE_OPTIONS="--require @opentelemetry/auto-instrumentations-node/register".

Note that I also tried initializing OTEL manually with the same results.

In the example repository, running docker compose up will deploy the Web App at http://localhost:5173, the Die Roller at http://localhost:8000, the Randomizer service, a Collector with an OTLP Grpc receiver at http://localhost:4173 and a Jaeger FE at http://localhost:16686.

The collector logs (attached) don't show any error that looks relevant.

Is there any issue here or do I miss something basic?
Thanks!

OpenTelemetry Setup Code

Run the first service using node with the env variables below

export OTEL_TRACES_EXPORTER="otlp"
export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
export OTEL_EXPORTER_OTLP_COMPRESSION="gzip"
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://collector:4317"
export OTEL_EXPORTER_OTLP_HEADERS=""
export OTEL_EXPORTER_OTLP_TRACES_HEADERS=""
export OTEL_RESOURCE_ATTRIBUTES=""
export OTEL_NODE_RESOURCE_DETECTORS="env,host,os"
export OTEL_SERVICE_NAME="die-roller"
export OTEL_LOG_LEVEL="info"
export OTEL_PROPAGATORS=b3,tracecontext,baggage
export OTEL_TRACES_SAMPLES=always_on
export NODE_OPTIONS="--require @opentelemetry/auto-instrumentations-node/register"

Run the second service using similar env vars but only the service name is changed.

package.json

{
  "name": "service-a",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "type": "module",
  "scripts": {
    "start": "node server.js",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@opentelemetry/api": "^1.6.0",
    "@opentelemetry/auto-instrumentations-node": "^0.39.4",
    "cors": "^2.8.5",
    "express": "^4.18.2",
    "pino-http": "^8.5.0",
    "pino-pretty": "^10.2.3"
  }
}

Relevant log output

The collector log was to big for putting in the issue and can be found here: https://github.com/raw/tglanz/simple-tracing/main/service/collector.log

@tglanz tglanz added bug Something isn't working triage labels Nov 3, 2023
@david-luna
Copy link
Contributor

This issue comes from a thread in CNCF's slack
https://cloud-native.slack.com/archives/C01NL1GRPQR/p1699010105287119

In that conversation it was pointed out that the sample project uses ESM imports (kudos to @RichiCoder1) and there are some PRs for handling it on specific packages
open-telemetry/opentelemetry-js-contrib#1736
open-telemetry/opentelemetry-js-contrib#1701
open-telemetry/opentelemetry-js-contrib#1694

We should probably check this 1st.

@RichiCoder1
Copy link

I believe @opentelemetry/auto-instrumentations-node importantly is missing an ESM-compatible hook loader like described here: https://github.com/open-telemetry/opentelemetry-js/tree/main/experimental/packages/opentelemetry-instrumentation#instrumentation-for-es-modules-in-nodejs-experimental

@Flarna
Copy link
Member

Flarna commented Nov 7, 2023

There is no additional/different hook needed.
@opentelemetry/auto-instrumentations-node is just a metapackage collecting a "random" set of instrumentations. All of them depend on @opentelemetry/instrumentations.

Just use the hook from @opentelemetry/instrumentations if you use ESM and it should work independent if you use the meta package or individual, selected instrumentations.

@david-luna
Copy link
Contributor

@Flarna do you think we should add a note in the automatic instrumentation docs?

https://opentelemetry.io/docs/instrumentation/js/automatic/

@Flarna
Copy link
Member

Flarna commented Nov 7, 2023

Not sure.
It would be needed to add it for any instrumentation targeting node.js as auto-instrumentations-node is just one possible entry point.
Main problem is likely that we still have no good over all doc entry point which would be likely a better place to document this along with all the other hints how to setup/configure OTel.

@tglanz
Copy link
Author

tglanz commented Nov 7, 2023

@Flarna it looks like the context isn't being propagated still after adding the @opentelemetry/instrumentation hook

@RichiCoder1
Copy link

There is no additional/different hook needed.

That's not correct, no? It's true that @opentelemetry/auto-instrumentations-node is a meta-package, but the recommended way to use it is via --require @opentelemetry/auto-instrumentations-node/register which won't work with ESM by default since it doesn't add in the ESM hook. You'd need to (in theory) to do:

NODE_OPTIONS="--experimental-loader=@opentelemetry/instrumentation/hook.mjs" node --require @opentelemetry/auto-instrumentations-node/register index.js

Though I'm sure there's caveats with that.

@Flarna
Copy link
Member

Flarna commented Nov 8, 2023

As of now the ESM hooks needs to be specify additionally to the register hook to initialize the SDK.
ESM hooks are still in flux and not marked as stable on node.js side. The current recommended API module.register was added in 20.6.0 and the --import command line argument in 18.18.0 (still experimental). node.js issue tracer also has some open discussions in that regard.
Therefore it will likely take a while until stable monitoring of ESM apps is a thing.

Anyhow, maybe someone with more ESM experience can attach a debugger to take a look what exactly is not working in this sample if both hooks are specified.
I assume details depend a lot on the exact setup like node.js version used, other loader hooks used, sequence how these hooks are specified,...

@david-luna
Copy link
Contributor

david-luna commented Nov 14, 2023

Note to self: traces have spans for the express middlewares. Let's stat looking at that instrumentation

Screenshot 2023-11-14 at 12 06 01

@david-luna
Copy link
Contributor

david-luna commented Nov 14, 2023

Fetch instrumentation (which is used to call between Dice Roller & Randomizer services) is not part of @opentelemetry/auto-instrumentations-node according to its docs

https://www.npmjs.com/package/@opentelemetry/auto-instrumentations-node

Switching to use plain http.request to check if traceparent header is added to the request

@trentm
Copy link
Contributor

trentm commented Nov 14, 2023

Ah, good find.

A possible point of confusion is that there are a number of fetch implementations:

  1. The browser globalThis.fetch, which I think the @opentelemetry/instrumentation-fetch package (part of the opentelemetry-js repo) instruments.
  2. https://github.com/node-fetch/node-fetch, which I think the repro "simple-tracing.git" repo is using.
  3. undici.fetch() (https://undici.nodejs.org/#/?id=undicifetchinput-init-promise)
  4. Node.js's global fetch() (https://nodejs.org/api/all.html#all_globals_fetch), which uses a bundled undici under the hood.

The latter two, I believe, can be instrumented by the 3rd party https://github.com/gadget-inc/opentelemetry-instrumentations/tree/main/packages/opentelemetry-instrumentation-undici package.

I don't know about node-fetch instrumentation. I haven't looked what it is using for HTTP client communication.

@Flarna
Copy link
Member

Flarna commented Nov 14, 2023

As far as I know node-fetch uses node.js internal http module so @opentelemetry/instrumentation-http should work.
I'm not sure if undici instrumentation covers (4) because the undici version used in node is bundled into a builtin as far as I know.

There is open-telemetry/opentelemetry-js-contrib#1021 which has some more references.

@trentm
Copy link
Contributor

trentm commented Nov 14, 2023

2. https://github.com/node-fetch/node-fetch, which I think the repro "simple-tracing.git" repo is using.

I was wrong. node-fetch is in its package-lock.json, but it is actually using Node.js's global fetch(). My bad.

So that means it would need instrumentation for that fetch. There are these two 3rd-party instrumentations which theoretically should work, but I haven't used them:

@trentm
Copy link
Contributor

trentm commented Nov 15, 2023

@tglanz See the previous comment. IIUC, your code is using Node.js's core fetch() to make its HTTP calls between services:
https://github.com/tglanz/simple-tracing/blob/3be5b3a87622a603c23a18ba392fdcb2b70680dd/service/server.js#L24

The set of instrumentations included with "@opentelemetry/auto-instrumentations-node" do not instrument Node.js's core fetch(). There are a couple third-party packages that do. Could you try using one of those packages?

@tglanz
Copy link
Author

tglanz commented Nov 15, 2023

ofc, will try that soon @trentm

@tglanz
Copy link
Author

tglanz commented Nov 15, 2023

@trentm @david-luna

Switched to node's http and it works properly - context is propagated well. OTEL setup is one of the following

  • Using a custom setup with --import ./manual-otel.js as in this commit
  • Using standard setup with --require @opentelemetry/auto-instrumentations-node/register as in this commit

According to your suggestion, I tried to instrument fetch with 3rd party packages with the results:

  • Using opentelemetry-instrumentation-undici fetch wasn't instrumented and context propagation did not work. In the headers I did see "user-agent: undici". This setup can be found in this commit
  • Using opentelemetry-instrumentation-node-fetch fetch was instrumented but context propagation did not work. It is interesting to note that again the "user-agent: undici" header exist (like you said about it using undici) and also saw that the "traceparent" header existed - but with different trace id than it should (at least up to what I could figure). This setup can be found in this commit. b3 didn't exist.

@david-luna
Copy link
Contributor

david-luna commented Nov 21, 2023

@tglanz

That fetch instrumentation is targeted for web platform instead of node since it has @opentelemetry/sdk-trace-web as a dependency

There is this discussion with a custom instrumentation for fetch that may help you get the tracing working
#3346

I wonder if this should create a instrumentation for fetch in this repo. As a customer I'd like to have Node APIs like fetch instrumented 🤔

@david-luna
Copy link
Contributor

david-luna commented Nov 29, 2023

@tglanz

Since this is not a bug but a missing instrumentation I've created a new feature request to add a fetch instrumentation package. (#4333)

I guess we can close this one and continue the conversation there

cc @pichlermarc
cc @dyladan

@tglanz
Copy link
Author

tglanz commented Dec 11, 2023

thanks, closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

No branches or pull requests

5 participants