Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Use Rack::Events for instrumentation #342

Merged
Show file tree
Hide file tree
Changes from 32 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
8698eb3
feat: Use Rack::Events for instrumentation
arielvalentin Feb 18, 2023
24ea563
squash: additional feature parity
arielvalentin Feb 18, 2023
06e4fe3
squash: add allowed response headers
arielvalentin Feb 20, 2023
725b722
squash: url quantization
arielvalentin Feb 20, 2023
8a2dd29
squash: Now with new Lemon Scented response headers
arielvalentin Feb 20, 2023
752a739
squash: we are now at parity
arielvalentin Feb 21, 2023
aba820e
squash: use instrumetation config
arielvalentin Feb 21, 2023
999594e
squash: Use declarative config options
arielvalentin Feb 22, 2023
5261676
squash: fix bad refactoring
arielvalentin Feb 25, 2023
e66a188
convert proxy span to an event
arielvalentin Mar 2, 2023
700f58f
refactor: move configurations to instrumentation install
arielvalentin Mar 2, 2023
7154ec7
squash: add test converage
arielvalentin Mar 2, 2023
69e60eb
squash: make response headers a little more resilient
arielvalentin Mar 2, 2023
006cea0
squash: Ensures event middleware will not cause the application to cr…
arielvalentin Mar 2, 2023
12c0e0f
squash: fix linter error
arielvalentin Mar 2, 2023
ee1c296
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 2, 2023
c37d534
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 2, 2023
ef52a8c
feat: Add middleware args helper for ActionPack and Sinatra
arielvalentin Mar 5, 2023
c53efcf
fix: test case
arielvalentin Mar 5, 2023
3aee1c3
fix: Rack Events is autoloaded so if the parent module is present so …
arielvalentin Mar 5, 2023
fb5e826
fix: More precise error handling
arielvalentin Mar 5, 2023
49f0bfe
fix: Ensure config is cleared/setup during installation
arielvalentin Mar 5, 2023
b7cad3f
fix: Sinatra 1.4 compatability
arielvalentin Mar 6, 2023
ce02dce
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 7, 2023
0e50bba
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 13, 2023
26c5a30
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 13, 2023
f134ec2
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 18, 2023
16a4896
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 19, 2023
205cd66
fix: bad merge
arielvalentin Mar 19, 2023
e9e837e
fix: invalid responses in test case
arielvalentin Mar 19, 2023
0a961a1
squash: Added a few more test cases
arielvalentin Mar 19, 2023
19d7c79
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Mar 31, 2023
e88b745
Merge branch 'main' into use-events-instead-of-middleware
arielvalentin Apr 5, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ Metrics/AbcSize:
Enabled: false
Metrics/BlockLength:
Enabled: false
Metrics/ClassLength:
Enabled: false
Metrics/MethodLength:
Max: 25
Naming/FileName:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,6 @@ module Rack
extend self

CURRENT_SPAN_KEY = Context.create_key('current-span')

private_constant :CURRENT_SPAN_KEY

# Returns the current span from the current or provided context
Expand Down Expand Up @@ -50,4 +49,5 @@ def with_span(span)
end

require_relative './rack/instrumentation'
require_relative './rack/util'
require_relative './rack/version'
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@ module Rack
# instrumentation
class Instrumentation < OpenTelemetry::Instrumentation::Base
install do |_config|
# TODO: move logic that configures allow lists here
require_dependencies
end

Expand All @@ -29,12 +28,56 @@ class Instrumentation < OpenTelemetry::Instrumentation::Base
option :url_quantization, default: nil, validate: :callable
option :untraced_requests, default: nil, validate: :callable
option :response_propagators, default: [], validate: :array
# This option is only valid for applicaitons using Rack 2.0 or greater
option :use_rack_events, default: false, validate: :boolean

# Temporary Helper for Sinatra and ActionPack middleware to use during installation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the long term plan (you may have answered this elsewhere)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This configuration setting and helper will exist until...

  1. We drop support for Sinatra versions that require Rack 1.x
  2. A future release that makes the Event Handler the default way for users to instrument their applications

#
# @example Default usage
# Rack::Builder.new do
# use *OpenTelemetry::Instrumentation::Rack::Instrumenation.instance.middleware_args
# run lambda { |_arg| [200, { 'Content-Type' => 'text/plain' }, body] }
# end
# @return [Array] consisting of a middleware and arguments used in rack builders
def middleware_args
if config.fetch(:use_rack_events, false) == true && defined?(::Rack::Events)
[::Rack::Events, [OpenTelemetry::Instrumentation::Rack::Middlewares::EventHandler.new]]
else
[OpenTelemetry::Instrumentation::Rack::Middlewares::TracerMiddleware]
end
end

private

def require_dependencies
require_relative 'middlewares/event_handler' if defined?(Rack::Events)
require_relative 'middlewares/tracer_middleware'
end

def config_options(user_config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertlaurin is config_options intended to be overridden? It looks OK to me, but I don't think we've explicitly called it out as OK for instrumentation to override.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fbogsany @robertlaurin Is this a blocker for merging and releasing the event-based implementation?

config = super(user_config)
config[:allowed_rack_request_headers] = config[:allowed_request_headers].compact.each_with_object({}) do |header, memo|
key = header.to_s.upcase.gsub(/[-\s]/, '_')
case key
when 'CONTENT_TYPE', 'CONTENT_LENGTH'
memo[key] = build_attribute_name('http.request.header.', header)
else
memo["HTTP_#{key}"] = build_attribute_name('http.request.header.', header)
end
end

config[:allowed_rack_response_headers] = config[:allowed_response_headers].each_with_object({}) do |header, memo|
memo[header] = build_attribute_name('http.response.header.', header)
memo[header.to_s.upcase] = build_attribute_name('http.response.header.', header)
end

config[:untraced_endpoints]&.compact!
config
end

def build_attribute_name(prefix, suffix)
prefix + suffix.to_s.downcase.gsub(/[-\s]/, '_')
end
end
end
end
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,271 @@
# frozen_string_literal: true

# Copyright The OpenTelemetry Authors
#
# SPDX-License-Identifier: Apache-2.0

require_relative '../util'
require 'opentelemetry/trace/status'

module OpenTelemetry
module Instrumentation
module Rack
module Middlewares
# OTel Rack Event Handler
#
# This seeds the root context for this service with the server span as the `current_span`
# allowing for callers later in the stack to reference it using {OpenTelemetry::Trace.current_span}
#
# It also registers the server span in a context dedicated to this instrumentation that users may look up
# using {OpenTelemetry::Instrumentation::Rack.current_span}, which makes it possible for users to mutate the span,
# e.g. add events or update the span name like in the {ActionPack} instrumentation.
#
# @example Rack App Using BodyProxy
# GLOBAL_LOGGER = Logger.new($stderr)
# APP_TRACER = OpenTelemetry.tracer_provider.tracer('my-app', '1.0.0')
#
# Rack::Builder.new do
# use Rack::Events, [OpenTelemetry::Instrumentation::Rack::Middlewares::EventHandler.new]
# run lambda { |_arg|
# APP_TRACER.in_span('hello-world') do |_span|
# body = Rack::BodyProxy.new(['hello world!']) do
# rack_span = OpenTelemetry::Instrumentation::Rack.current_span
# GLOBAL_LOGGER.info("otel.trace_id=#{rack_span.context.hex_trace_id} otel.span_id=#{rack_span.context.hex_span_id}")
# end
# [200, { 'Content-Type' => 'text/plain' }, body]
# end
# }
# end
#
# @see Rack::Events
# @see OpenTelemetry::Instrumentation::Rack.current_span
class EventHandler
include ::Rack::Events::Abstract

TOKENS_KEY = 'otel.context.tokens'
GOOD_HTTP_STATUSES = (100..499).freeze

# Creates a server span for this current request using the incoming parent context
# and registers them as the {current_span}
#
# @param [Rack::Request] The current HTTP request
# @param [Rack::Response] This is nil in practice
# @return [void]
def on_start(request, _)
return if untraced_request?(request.env)

parent_context = extract_remote_context(request)
span = create_span(parent_context, request)
request.env[TOKENS_KEY] = register_current_span(span)
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end

# Optionally adds debugging response headers injected from {response_propagators}
#
# @param [Rack::Request] The current HTTP request
# @param [Rack::Response] This current HTTP response
# @return [void]
def on_commit(request, response)
span = OpenTelemetry::Instrumentation::Rack.current_span
return unless span.recording?

response_propagators&.each do |propagator|
propagator.inject(response.headers)
rescue StandardError => e
OpenTelemetry.handle_error(message: 'Unable to inject response propagation headers', exception: e)
end
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end

# Records Unexpected Exceptions on the Rack span and set the Span Status to Error
#
# @note does nothing if the span is a non-recording span
# @param [Rack::Request] The current HTTP request
# @param [Rack::Response] The current HTTP response
# @param [Exception] An unxpected error raised by the application
def on_error(request, _, error)
span = OpenTelemetry::Instrumentation::Rack.current_span
return unless span.recording?

span.record_exception(error)
span.status = OpenTelemetry::Trace::Status.error
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end

# Finishes the span making it eligible to be exported and cleans up existing contexts
#
# @note does nothing if the span is a non-recording span
# @param [Rack::Request] The current HTTP request
# @param [Rack::Response] The current HTTP response
def on_finish(request, response)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be nice to add on_send and record a span event for "sent first byte" or something similar. It is really nice to know, from a perf analysis perspective, when the application has finalized the response and the remaining work is "out of our hands".

An interesting philosophical question: when should we consider the server span "finished"? When application processing is complete or when the response has been completely written to the socket? I think the current middleware implementation is effectively equivalent to "when application processing is complete", before anything has been written to the socket.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be nice to add on_send and record a span event for "sent first byte" or something similar. It is really nice to know, from a perf analysis perspective, when the application has finalized the response and the remaining work is "out of our hands".

I like this idea a lot! I will add it to my todo list.

An interesting philosophical question: when should we consider the server span "finished"? When application processing is complete or when the response has been completely written to the socket? I think the current middleware implementation is effectively equivalent to "when application processing is complete", before anything has been written to the socket.

  • 💭 I'm not sure. I lean towards after the middleware has completed writing the response but does that make it asymmetric from when it's processing the incoming request? I'm not entirely certain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to hold off adding other events until I get this initial implementation merged.

I am also going to stick with on_finish, i.e. after the response has been written to the socket and the Rack::BufferedResponse is closed.

span = OpenTelemetry::Instrumentation::Rack.current_span
return unless span.recording?

add_response_attributes(span, response) if response
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
ensure
detach_contexts(request)
end

private

EMPTY_HASH = {}.freeze
def extract_request_headers(env)
return EMPTY_HASH if allowed_request_headers.empty?

allowed_request_headers.each_with_object({}) do |(key, value), result|
result[value] = env[key] if env.key?(key)
end
end

def extract_response_attributes(response)
attributes = { 'http.status_code' => response.status.to_i }
attributes.merge!(extract_response_headers(response.headers))
attributes
end

def extract_response_headers(headers)
return EMPTY_HASH if allowed_response_headers.empty?

allowed_response_headers.each_with_object({}) do |(key, value), result|
if headers.key?(key)
result[value] = headers[key]
else
# do case-insensitive match:
headers.each do |k, v|
if k.upcase == key
result[value] = v
break
end
end
end
end
end

def untraced_request?(env)
return true if untraced_endpoints.include?(env['PATH_INFO'])
return true if untraced_requests&.call(env)

false
end

# https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md#name
#
# recommendation: span.name(s) should be low-cardinality (e.g.,
# strip off query param value, keep param name)
#
# see http://github.com/open-telemetry/opentelemetry-specification/pull/416/files
def create_request_span_name(request)
# NOTE: dd-trace-rb has implemented 'quantization' (which lowers url cardinality)
# see Datadog::Quantization::HTTP.url

if (implementation = url_quantization)
request_uri_or_path_info = request.env['REQUEST_URI'] || request.path_info
implementation.call(request_uri_or_path_info, request.env)
else
"HTTP #{request.request_method}"
end
end

def extract_remote_context(request)
OpenTelemetry.propagation.extract(
request.env,
getter: OpenTelemetry::Common::Propagation.rack_env_getter
)
end

def request_span_attributes(env)
attributes = {
'http.method' => env['REQUEST_METHOD'],
'http.host' => env['HTTP_HOST'] || 'unknown',
'http.scheme' => env['rack.url_scheme'],
'http.target' => env['QUERY_STRING'].empty? ? env['PATH_INFO'] : "#{env['PATH_INFO']}?#{env['QUERY_STRING']}"
}

attributes['http.user_agent'] = env['HTTP_USER_AGENT'] if env['HTTP_USER_AGENT']
attributes.merge!(extract_request_headers(env))
attributes
end

def detach_contexts(request)
request.env[TOKENS_KEY]&.reverse&.each do |token|
OpenTelemetry::Context.detach(token)
OpenTelemetry::Trace.current_span.finish
end
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end

def add_response_attributes(span, response)
span.status = OpenTelemetry::Trace::Status.error unless GOOD_HTTP_STATUSES.include?(response.status.to_i)
attributes = extract_response_attributes(response)
span.add_attributes(attributes)
rescue StandardError => e
OpenTelemetry.handle_error(exception: e)
end

def record_frontend_span?
config[:record_frontend_span] == true
end

def untraced_endpoints
config[:untraced_endpoints]
end

def untraced_requests
config[:untraced_requests]
end

def url_quantization
config[:url_quantization]
end

def response_propagators
config[:response_propagators]
end

def allowed_request_headers
arielvalentin marked this conversation as resolved.
Show resolved Hide resolved
config[:allowed_rack_request_headers]
end

def allowed_response_headers
config[:allowed_rack_response_headers]
end
Comment on lines +211 to +237
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to cache all these config options in instance variables?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing instrumentation supports being able to inject the middleware and then initializing the SDK afterwards.

Using instance variables made it so the instance in the middleware had stale values and did not reflect the values after the instrumentation was installed so I didn't have a choice but to look them up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that caching config option values in instance variables is not a blocker to adding the event-based implementation. 👍🏻

I think caching (I'm assuming "via memoization" here) is something we could consider doing in a future release if/when the middleware-based implementation is removed.


def tracer
OpenTelemetry::Instrumentation::Rack::Instrumentation.instance.tracer
end

def config
OpenTelemetry::Instrumentation::Rack::Instrumentation.instance.config
end

def register_current_span(span)
ctx = OpenTelemetry::Trace.context_with_span(span)
rack_ctx = OpenTelemetry::Instrumentation::Rack.context_with_span(span, parent_context: ctx)

contexts = [ctx, rack_ctx]
contexts.compact!
contexts.map { |context| OpenTelemetry::Context.attach(context) }
end

def create_span(parent_context, request)
span = tracer.start_span(
create_request_span_name(request),
with_parent: parent_context,
kind: :server,
attributes: request_span_attributes(request.env)
)
request_start_time = OpenTelemetry::Instrumentation::Rack::Util::QueueTime.get_request_start(request.env)
span.add_event('http.proxy.request.started', timestamp: request_start_time) unless request_start_time.nil?
span
end
end
end
end
end
end
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,6 @@

require 'opentelemetry/trace/status'

require_relative '../util/queue_time'

module OpenTelemetry
module Instrumentation
module Rack
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to update this to use the config[:allowed_rack_request_headers] and config[:allowed_rack_response_headers] precomputed in the Instrumentation class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about changing too many things at once in this PR.

I am certainly amenable to removing the duplication here.

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# frozen_string_literal: true

# Copyright The OpenTelemetry Authors
#
# SPDX-License-Identifier: Apache-2.0

module OpenTelemetry
module Instrumentation
module Rack
# Provides utilities methods for creating Rack spans
module Util
require_relative 'util/queue_time'
end
end
end
end
Loading