fix: move telemetry events sending out of the critical path to a background thread #3740

Merged 24 commits on Jun 26, 2024
70329a9
add signatures and enable type checking for Core::Telemetry module
anmarchenko Jun 12, 2024
890fc9b
rename Telemetry::Heartbeat to Telemetry::Worker
anmarchenko Jun 12, 2024
eb73019
Add queue to telemetry worker. Move sending heartbeat logic to the te…
anmarchenko Jun 13, 2024
aea0862
move AppStarted telemetry event out of the critical path
anmarchenko Jun 13, 2024
ca1d9ae
fix failing tests by waiting for worker startup
anmarchenko Jun 14, 2024
1e5b92d
debug logging, flushing events, attempt at fixing failing test for wo…
anmarchenko Jun 14, 2024
95142d9
don't send heartbeat event if started event wasn't successfully sent
anmarchenko Jun 14, 2024
9719706
fix client_spec
anmarchenko Jun 14, 2024
07f1356
enqueue events to be sent later by worker instead of sending them syn…
anmarchenko Jun 14, 2024
c053392
rename Telemetry::Client to Telemetry::Component to better reflect it…
anmarchenko Jun 14, 2024
0ad3d82
leftover of telemetry component rename
anmarchenko Jun 14, 2024
253fd4f
add Core::Utils::OnlyOnceSuccessful to execute code with only one suc…
anmarchenko Jun 17, 2024
7c8d154
ensure that app-started event is sent at most once, flush events befo…
anmarchenko Jun 17, 2024
9498ebc
remove Telemetry::Component.started! as right now it just contains de…
anmarchenko Jun 17, 2024
5347a09
change the wrong expectation in workers spec
anmarchenko Jun 17, 2024
32f1208
send app-dependencies-loaded event right after app-started event in t…
anmarchenko Jun 17, 2024
b9ed222
add limit option to OnlyOnceSuccessful util
anmarchenko Jun 21, 2024
929316a
limit telemetry app-started event retries
anmarchenko Jun 21, 2024
53f11ea
do not instantiate empty array every time when sending events
anmarchenko Jun 21, 2024
098d198
lower HTTP timeout for telemetry worker
anmarchenko Jun 25, 2024
df0f8f8
remove merge artifacts
anmarchenko Jun 26, 2024
3c42a26
minor fix for older rubies
anmarchenko Jun 26, 2024
e71b3a1
ignore ethon and httprb for jruby
anmarchenko Jun 26, 2024
fdfc830
positive? does not exist on older rubies
anmarchenko Jun 26, 2024
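
The commits above mention a `Core::Utils::OnlyOnceSuccessful` utility with a `limit` option, but its implementation is not part of this excerpt. As a rough illustration of the idea (run a block until it succeeds once, or until a retry limit is reached), here is a minimal sketch. The class name `RunUntilSuccess`, its method names, and the "truthy return value means success" convention are assumptions made for this example, not the library's API.

```ruby
# Hypothetical helper for illustration only; this is NOT the library's
# Core::Utils::OnlyOnceSuccessful.
class RunUntilSuccess
  # limit: maximum number of attempts; 0 means "no limit" (assumed convention).
  def initialize(limit = 0)
    @limit = limit
    @attempts = 0
    @done = false
    @mutex = Mutex.new
  end

  # Runs the block unless an earlier run succeeded or the limit is exhausted.
  # The block reports success by returning a truthy value.
  def run
    @mutex.synchronize do
      return if @done

      @attempts += 1
      result = yield
      # `> 0` instead of `positive?` keeps this runnable on older Rubies.
      @done = true if result || (@limit > 0 && @attempts >= @limit)
      result
    end
  end

  def done?
    @mutex.synchronize { @done }
  end
end
```

With a limit of 3, a block that fails once and then succeeds runs exactly twice; any later call to `run` is skipped.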
4 changes: 2 additions & 2 deletions Rakefile
@@ -76,7 +76,7 @@ TEST_METADATA = {
     'elasticsearch-8' => '❌ 2.1 / ❌ 2.2 / ❌ 2.3 / ❌ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ jruby'
   },
   'ethon' => {
-    'http' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / jruby'
+    'http' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / jruby'
   },
   'excon' => {
     'http' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ jruby'
@@ -111,7 +111,7 @@ TEST_METADATA = {
     'http' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ jruby'
   },
   'httprb' => {
-    'http' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / jruby'
+    'http' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / jruby'
   },
   'kafka' => {
     'activesupport' => '✅ 2.1 / ✅ 2.2 / ✅ 2.3 / ✅ 2.4 / ✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ jruby'
19 changes: 0 additions & 19 deletions Steepfile
@@ -93,25 +93,6 @@ target :ddtrace do
   ignore 'lib/datadog/core/pin.rb'
   ignore 'lib/datadog/core/runtime/ext.rb'
   ignore 'lib/datadog/core/runtime/metrics.rb'
-  ignore 'lib/datadog/core/telemetry/client.rb'
-  ignore 'lib/datadog/core/telemetry/collector.rb'
-  ignore 'lib/datadog/core/telemetry/emitter.rb'
-  ignore 'lib/datadog/core/telemetry/event.rb'
-  ignore 'lib/datadog/core/telemetry/ext.rb'
-  ignore 'lib/datadog/core/telemetry/heartbeat.rb'
-  ignore 'lib/datadog/core/telemetry/http/adapters/net.rb'
-  ignore 'lib/datadog/core/telemetry/http/env.rb'
-  ignore 'lib/datadog/core/telemetry/http/ext.rb'
-  ignore 'lib/datadog/core/telemetry/http/response.rb'
-  ignore 'lib/datadog/core/telemetry/http/transport.rb'
-  ignore 'lib/datadog/core/telemetry/v1/app_event.rb'
-  ignore 'lib/datadog/core/telemetry/v1/application.rb'
-  ignore 'lib/datadog/core/telemetry/v1/configuration.rb'
-  ignore 'lib/datadog/core/telemetry/v1/dependency.rb'
-  ignore 'lib/datadog/core/telemetry/v1/host.rb'
-  ignore 'lib/datadog/core/telemetry/v1/integration.rb'
-  ignore 'lib/datadog/core/telemetry/v1/product.rb'
-  ignore 'lib/datadog/core/telemetry/v1/telemetry_request.rb'
   ignore 'lib/datadog/core/transport/ext.rb'
   ignore 'lib/datadog/core/transport/http/adapters/net.rb'
   ignore 'lib/datadog/core/transport/http/adapters/registry.rb'
20 changes: 3 additions & 17 deletions lib/datadog/core/configuration.rb
@@ -81,23 +81,16 @@ def configure
   configuration = self.configuration
   yield(configuration)

-  built_components = false
-
-  components = safely_synchronize do |write_components|
+  safely_synchronize do |write_components|
     write_components.call(
       if components?
         replace_components!(configuration, @components)
       else
-        components = build_components(configuration)
-        built_components = true
-        components
+        build_components(configuration)
       end
     )
   end

-  # Should only be called the first time components are built
-  components.telemetry.started! if built_components
-
   configuration
 end

@@ -197,20 +190,13 @@ def components(allow_initialization: true)
   current_components = COMPONENTS_READ_LOCK.synchronize { defined?(@components) && @components }
   return current_components if current_components || !allow_initialization

-  built_components = false
-
-  components = safely_synchronize do |write_components|
+  safely_synchronize do |write_components|
     if defined?(@components) && @components
       @components
     else
-      built_components = true
       write_components.call(build_components(configuration))
     end
   end
-
-  # Should only be called the first time components are built
-  components.telemetry.started! if built_components && components && components.telemetry
-  components
 end

 private
7 changes: 4 additions & 3 deletions lib/datadog/core/configuration/components.rb
@@ -4,7 +4,7 @@
 require_relative '../diagnostics/health'
 require_relative '../logger'
 require_relative '../runtime/metrics'
-require_relative '../telemetry/client'
+require_relative '../telemetry/component'
 require_relative '../workers/runtime_metrics'

 require_relative '../remote/component'
@@ -60,7 +60,7 @@ def build_telemetry(settings, agent_settings, logger)
     logger.debug { "Telemetry disabled. Agent network adapter not supported: #{agent_settings.adapter}" }
   end

-  Telemetry::Client.new(
+  Telemetry::Component.new(
     enabled: enabled,
     heartbeat_interval_seconds: settings.telemetry.heartbeat_interval_seconds,
     dependency_collection: settings.telemetry.dependency_collection
@@ -165,8 +165,9 @@ def shutdown!(replacement = nil)
       unused_statsd = (old_statsd - (old_statsd & new_statsd))
       unused_statsd.each(&:close)

-      telemetry.stop!
+      # enqueue closing event before stopping telemetry so it will be sent out on shutdown
       telemetry.emit_closing! unless replacement
+      telemetry.stop!
     end
   end
 end
95 changes: 0 additions & 95 deletions lib/datadog/core/telemetry/client.rb

This file was deleted.

66 changes: 66 additions & 0 deletions lib/datadog/core/telemetry/component.rb
@@ -0,0 +1,66 @@
+# frozen_string_literal: true
+
+require_relative 'emitter'
+require_relative 'event'
+require_relative 'worker'
+require_relative '../utils/forking'
+
+module Datadog
+  module Core
+    module Telemetry
+      # Telemetry entrypoint, coordinates sending telemetry events at various points in app lifecycle.
+      class Component
+        attr_reader :enabled
+
+        include Core::Utils::Forking
+
+        # @param enabled [Boolean] Determines whether telemetry events should be sent to the API
+        # @param heartbeat_interval_seconds [Float] How frequently heartbeats will be reported, in seconds.
+        # @param [Boolean] dependency_collection Whether to send the `app-dependencies-loaded` event
+        def initialize(heartbeat_interval_seconds:, dependency_collection:, enabled: true)
+          @enabled = enabled
+          @stopped = false
+
+          @worker = Telemetry::Worker.new(
+            enabled: @enabled,
+            heartbeat_interval_seconds: heartbeat_interval_seconds,
+            emitter: Emitter.new,
+            dependency_collection: dependency_collection
+          )
+          @worker.start
+        end
+
+        def disable!
+          @enabled = false
+          @worker.enabled = false
+        end
+
+        def stop!
+          return if @stopped
+
+          @worker.stop(true)
+          @stopped = true
+        end
+
+        def emit_closing!
+          return if !@enabled || forked?
+
+          @worker.enqueue(Event::AppClosing.new)
+        end
+
+        def integrations_change!
+          return if !@enabled || forked?
+
+          @worker.enqueue(Event::AppIntegrationsChange.new)
+        end
+
+        # Report configuration changes caused by Remote Configuration.
+        def client_configuration_change!(changes)
+          return if !@enabled || forked?
+
+          @worker.enqueue(Event::AppClientConfigurationChange.new(changes, 'remote_config'))
+        end
+      end
+    end
+  end
+end
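
The component above only enqueues events; the worker that drains the queue (`worker.rb`) is not shown in this excerpt. To illustrate the general pattern of moving sends off the calling thread, here is a minimal sketch built on Ruby's standard `Queue` and `Thread`. The class `QueueWorker` and its methods are hypothetical and are not the library's `Telemetry::Worker`; the event names are plain strings used as stand-ins.

```ruby
# Hypothetical sketch for illustration; NOT the library's Telemetry::Worker.
class QueueWorker
  # The block plays the role of the emitter: it receives each event to send.
  def initialize(&sender)
    @queue = Queue.new
    @sender = sender
    @thread = nil
  end

  def start
    @thread ||= Thread.new do
      # Queue#pop blocks until an event arrives. It returns nil once the
      # queue has been closed and drained, which ends the loop.
      while (event = @queue.pop)
        @sender.call(event)
      end
    end
    self
  end

  # Called from the application thread: only appends to the queue,
  # so the caller never waits on network I/O.
  def enqueue(event)
    @queue.push(event) unless @queue.closed?
  end

  # Closes the queue and waits until already enqueued events are sent.
  def stop
    @queue.close
    @thread&.join
  end
end

sent = []
worker = QueueWorker.new { |event| sent << event }.start
worker.enqueue('app-integrations-change')
worker.enqueue('app-closing') # enqueue the closing event before stopping
worker.stop
```

Because `stop` drains the queue before returning, an event enqueued just before shutdown is still delivered, which is the same ordering concern addressed by calling `emit_closing!` before `stop!` in `components.rb`.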
1 change: 1 addition & 0 deletions lib/datadog/core/telemetry/event.rb
@@ -38,6 +38,7 @@ def payload(seq_id)
 private

 def products
+  # @type var products: Hash[Symbol, Hash[Symbol, Object]]
   products = {
     appsec: {
       enabled: Datadog::AppSec.enabled?,
33 changes: 0 additions & 33 deletions lib/datadog/core/telemetry/heartbeat.rb

This file was deleted.

2 changes: 1 addition & 1 deletion lib/datadog/core/telemetry/http/adapters/net.rb
@@ -13,7 +13,7 @@ class Net
   :timeout,
   :ssl

-DEFAULT_TIMEOUT = 30
+DEFAULT_TIMEOUT = 2

 def initialize(hostname:, port: nil, timeout: DEFAULT_TIMEOUT, ssl: true)
   @hostname = hostname
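
How the adapter applies `timeout` to the underlying connection is not visible in this hunk. As a sketch of what a 2-second default could look like with Ruby's standard `Net::HTTP`, the hypothetical helper below sets both the connect and read timeouts. The helper name `build_http`, its defaults, and the placeholder host are assumptions for illustration only.

```ruby
require 'net/http'

# Hypothetical helper for illustration; not the adapter's actual code.
def build_http(hostname, port: 443, timeout: 2, ssl: true)
  http = Net::HTTP.new(hostname, port) # no connection is opened here
  http.use_ssl = ssl
  http.open_timeout = timeout # seconds to wait for the connection to open
  http.read_timeout = timeout # seconds to wait for each read from the socket
  http
end

# Placeholder host name; nothing is sent until a request is made.
http = build_http('telemetry.example.invalid')
```

A short timeout bounds how long a background thread can stay blocked on a single request when the endpoint is unreachable.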