Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Use Rack::Events for instrumentation #342

Conversation

arielvalentin
Copy link
Collaborator

@arielvalentin arielvalentin commented Feb 18, 2023

This is an alternate implementation of the Rack instrumentation that leverages the Rack::Events API instead of a customRack::Middleware.

Why am I suggesting we change this instrumentation? At GitHub we leverage Rack::BodyProxy to write Rack Request logs after the request is complete, however Rack span is already finished and its related Context has already been detached. This means we are not able to correlate request logs to our traces.

The advantage of using Rack::Events is that handlers are triggered during difference stages of a request, including for deferred operations like Rack::BodyProxy, as opposed to middlewares who only are invoked inline.

The disadvantage of this API is that it makes managing the request more difficult and we have to track the Context positions to detach in the Rack::Env.

This implementation will be released along side the existing instrumtation to give users the option to use the middleware instead of the Rack::Events handler until we are able to run this in some of our heavy production workloads.

Other noteworthy things:

  1. The http.proxy span has been converted to a Span Event and ignores the record_frontend_span option
  2. There are no new events added that we brainstormed in this discussion: feat: Use Rack::Events for instrumentation #342 (comment)
  3. All configuration options for this middleware that required special processing/registration were moved into the Rack::Instrumentation.install
  4. The Rack event now includes the time it takes to write the response to the socket.
  5. Sinatra 1.4 depends on Rack 1 so we must support both Rack Middleware and Events unless we choose to drop support for it.

Fixes #341 Related DataDog/dd-trace-rb#1746

# @note does nothing if the span is a non-recording span
# @param [Rack::Request] The current HTTP request
# @param [Rack::Response] The current HTTP response
def on_finish(request, response)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be nice to add on_send and record a span event for "sent first byte" or something similar. It is really nice to know, from a perf analysis perspective, when the application has finalized the response and the remaining work is "out of our hands".

An interesting philosophical question: when should we consider the server span "finished"? When application processing is complete or when the response has been completely written to the socket? I think the current middleware implementation is effectively equivalent to "when application processing is complete", before anything has been written to the socket.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be nice to add on_send and record a span event for "sent first byte" or something similar. It is really nice to know, from a perf analysis perspective, when the application has finalized the response and the remaining work is "out of our hands".

I like this idea a lot! I will add it to my todo list.

An interesting philosophical question: when should we consider the server span "finished"? When application processing is complete or when the response has been completely written to the socket? I think the current middleware implementation is effectively equivalent to "when application processing is complete", before anything has been written to the socket.

  • 💭 I'm not sure. I lean towards after the middleware has completed writing the response but does that make it asymmetric from when it's processing the incoming request? I'm not entirely certain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to hold off adding other events until I get this initial implementation merged.

I am also going to stick with on_finish, i.e. after the response has been written to the socket and the Rack::BufferedResponse is closed.

This is an alternate implementation of the Rack instrumentation that leverages the [`Rack::Events` API](https://www.rubydoc.info/gems/rack/Rack/Events) instead of a custom`Rack::Middleware`.

Why am I suggesting we change this instrumentation? At GitHub we leverage `Rack::BodyProxy` to write Rack Request logs after the request is complete, however Rack span is already finished and its related `Context` has already been detached. This means we are not able to correlate request logs to our traces.

The advantage of using `Rack::Events` is that handlers are triggered during difference stages of a request, including for deferred operations like [`Rack::BodyProxy`](https://www.rubydoc.info/gems/rack/Rack/BodyProxy), as opposed to middlewares who _only_ are invoked inline.

The disadvantage of this API is that it makes managing the request more difficult and we have to track the Context positions to detach in the `Rack::Env`.

This implementation will be released along side the existing instrumtation to give users the option to use the middleware instead of the `Rack::Events` handler until we are able to run this in some of our heavy production workloads.

Fixes open-telemetry#341
Related DataDog/dd-trace-rb#1746
@arielvalentin arielvalentin force-pushed the use-events-instead-of-middleware branch from 2272df2 to 5261676 Compare March 2, 2023 01:59
@arielvalentin arielvalentin marked this pull request as ready for review March 6, 2023 14:25
@arielvalentin
Copy link
Collaborator Author

Tests are failing for combination of rack >= 2.2 and rack test 2.1

@arielvalentin
Copy link
Collaborator Author

I have opened an issue with rack-test rack/rack-test#335

Hopefully I will be able to dig in something soon and see what is going on.

@arielvalentin
Copy link
Collaborator Author

jeremyevans set me straight. The test case setup was wrong.

@arielvalentin arielvalentin enabled auto-merge (squash) March 19, 2023 22:59
# This option is only valid for applicaitons using Rack 2.0 or greater
option :use_rack_events, default: false, validate: :boolean

# Temporary Helper for Sinatra and ActionPack middleware to use during installation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the long term plan (you may have answered this elsewhere)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This configuration setting and helper will exist until...

  1. We drop support for Sinatra versions that require Rack 1.x
  2. A future release that makes the Event Handler the default way for users to instrument their applications

require_relative 'middlewares/tracer_middleware'
end

def config_options(user_config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertlaurin is config_options intended to be overridden? It looks OK to me, but I don't think we've explicitly called it out as OK for instrumentation to override.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fbogsany @robertlaurin Is this a blocker for merging and releasing the event-based implementation?

@@ -6,8 +6,6 @@

require 'opentelemetry/trace/status'

require_relative '../util/queue_time'

module OpenTelemetry
module Instrumentation
module Rack
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to update this to use the config[:allowed_rack_request_headers] and config[:allowed_rack_response_headers] precomputed in the Instrumentation class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about changing too many things at once in this PR.

I am certainly amenable to removing the duplication here.

Comment on lines +211 to +237
def record_frontend_span?
config[:record_frontend_span] == true
end

def untraced_endpoints
config[:untraced_endpoints]
end

def untraced_requests
config[:untraced_requests]
end

def url_quantization
config[:url_quantization]
end

def response_propagators
config[:response_propagators]
end

def allowed_request_headers
config[:allowed_rack_request_headers]
end

def allowed_response_headers
config[:allowed_rack_response_headers]
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to cache all these config options in instance variables?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing instrumentation supports being able to inject the middleware and then initializing the SDK afterwards.

Using instance variables made it so the instance in the middleware had stale values and did not reflect the values after the instrumentation was installed so I didn't have a choice but to look them up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that caching config option values in instance variables is not a blocker to adding the event-based implementation. 👍🏻

I think caching (I'm assuming "via memoization" here) is something we could consider doing in a future release if/when the middleware-based implementation is removed.

@robbkidd robbkidd self-assigned this Mar 31, 2023
Copy link
Member

@robbkidd robbkidd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It hangs together, new behavior is behind a feature flag config option, old behavior works in the handful of Racky test apps I have.

# in a Gemfile
gem 'opentelemetry-instrumentation-rack', git: 'https://github.com/open-telemetry/opentelemetry-ruby-contrib', glob: 'instrumentation/*/*.gemspec'

I'd like to see the instrumentation's options appear documented in the gem's README, but (1) of the existing options, only url_quantization is mentioned so this is not really a hard criteria for this new option and (2) I can be the change I want to see in the world (I'll open a PR updating the README).

@arielvalentin arielvalentin merged commit c179d3b into open-telemetry:main Apr 5, 2023
@arielvalentin arielvalentin deleted the use-events-instead-of-middleware branch April 5, 2023 19:23
@arielvalentin
Copy link
Collaborator Author

OOOOOOOOPPPPPPPPPPPPSSSSSS I had set this to auto-merge by accident. I hope that is not a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rack Span not available in Rack::BodyProxy
4 participants