Collect Ruby VM internal event metrics (object allocation) #2164
Labels: feature-request (A request for a new feature or change to an existing one)
Comments
marcotc added the feature-request label on Jul 19, 2022
Nice write-up!
ivoanjo added a commit that referenced this issue on Feb 20, 2023:
**What does this PR do?**: This PR adds a new profiler public API: `Datadog::Profiling.allocation_count`. The public documentation for this API is as follows:

> Returns an ever-increasing counter of the number of allocations
> observed by the profiler in this thread.
>
> Note 1: This counter may not start from zero on new threads. It
> should only be used to measure how many allocations have happened
> between two calls to this API:
>
> ```ruby
> allocations_before = Datadog::Profiling.allocation_count
> do_some_work()
> allocations_after = Datadog::Profiling.allocation_count
> puts "Allocations during do_some_work: #{allocations_after - allocations_before}"
> ```
>
> (This is similar to some OS-based time representations.)
>
> Note 2: All fibers in the same thread will share the same counter
> values.
>
> Only available when the profiler is running, the new CPU Profiling
> 2.0 profiler is in use, and allocation-related features are not
> disabled via configuration.
> For instructions on enabling CPU Profiling 2.0 see the ddtrace
> release notes.

As long as CPU Profiling 2.0 is in use, this API is enabled by default. To disable it, this PR adds a new setting:

```ruby
Datadog.configure do |c|
  c.profiling.advanced.allocation_counting_enabled = # ...
end
```

**Motivation**: This feature has long been something we want to provide with ddtrace; see issues #2164 and #468, as well as PRs #1891, #1805, and #597. As part of the ongoing work of enabling allocation profiling, counting the number of allocations comes at a very cheap cost, since the profiler needs to have a `RUBY_INTERNAL_EVENT_NEWOBJ` tracepoint anyway -- it's just a matter of also incrementing a counter inside it.

**Additional Notes**: Note that this does not yet change any user-visible feature for ddtrace. I'm counting on @marcotc to pick up the task of using this API to make some tracing magic :)

**How to test the change?**: This change includes code coverage.

---

Fixes #2164
Motivation
We would like more visibility into the overhead that the Ruby VM's internals add to our application's execution.
It can be useful to know, for example, whether a web request created many more objects than all other requests in the same application.
How
Ruby has internal events related to the object lifecycle (`NEWOBJ`, `FREEOBJ`) that are broadcast to any registered internal tracepoint event listener. In Ruby 3.1.2 the internal events (declared in `ruby/debug.h`) include `RUBY_INTERNAL_EVENT_NEWOBJ`, `RUBY_INTERNAL_EVENT_FREEOBJ`, and the GC events (`RUBY_INTERNAL_EVENT_GC_START`, `RUBY_INTERNAL_EVENT_GC_END_MARK`, `RUBY_INTERNAL_EVENT_GC_END_SWEEP`, `RUBY_INTERNAL_EVENT_GC_ENTER`, `RUBY_INTERNAL_EVENT_GC_EXIT`). Internal tracepoint event listeners are registered through Ruby's C API (`rb_tracepoint_new`).
One detail to keep in mind is that, despite `TracePoint` being a public Ruby class, it is not allowed to trace `RUBY_INTERNAL_EVENT_*` events. Some of these events are too high-frequency (e.g. tracing `NEWOBJ` with Ruby code would be too slow) or happen at sensitive times (GC events don't play well with Ruby objects being allocated during GC), so they are available only for native tracing, not Ruby callback tracing.
What should be done
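To make the limitation concrete, here is a small sketch (not from the original issue): public events such as `:call` can be traced from plain Ruby, while the internal events have no Ruby-level symbol at all.

```ruby
# Public events such as :call can be traced from plain Ruby code.
calls = 0
trace = TracePoint.new(:call) { calls += 1 }

def visible_method; end

trace.enable { visible_method }
puts calls # => 1

# The internal events have no Ruby-level symbol: TracePoint.new
# rejects them, so the RUBY_INTERNAL_EVENT_* flags are only reachable
# from native (C) code via rb_tracepoint_new.
begin
  TracePoint.new(:newobj) {}
rescue ArgumentError => e
  puts e.message # e.g. "unknown event: newobj"
end
```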
We should create native tracepoints to collect object allocation (`RUBY_INTERNAL_EVENT_NEWOBJ`). These numbers should be collected on a per-thread basis, in order to reduce the noise and allow for measuring allocations on a single trace. The number of allocations should be stored on the top-level span.
Another option is to store this number on every span: if the overhead is not significant, this provides more value and makes it easier to pinpoint, for example, which span is creating an excessive number of objects.
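As a rough pure-Ruby baseline (a hypothetical helper, not part of this proposal), `GC.stat(:total_allocated_objects)` can approximate this today -- but that counter is process-global, so concurrent threads pollute the measurement, which is exactly the noise a per-thread native counter avoids:

```ruby
# Hypothetical helper: approximate allocations during a block using the
# process-global GC.stat counter. Under concurrency, other threads'
# allocations leak into the result -- hence the per-thread proposal above.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

n = allocations_during { Array.new(100) { Object.new } }
puts "at least #{n} objects allocated"
```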
Other considerations
As you can see above, we have GC internal events as well. As tempting as they may seem to collect, because GC runs are global, we already have enough information in the public `GC.stat` method to gather GC metrics. `GC.stat` should be used instead, unless we see value in collecting internal GC details such as mark and sweep events.
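For reference, a minimal sketch of the `GC.stat` metrics mentioned above (standard keys in modern Rubies; the exact key set varies by version):

```ruby
# GC.stat exposes process-global GC counters without native tracepoints.
stats = GC.stat
puts "total GC runs:            #{stats[:count]}"
puts "minor GC runs:            #{stats[:minor_gc_count]}"
puts "major GC runs:            #{stats[:major_gc_count]}"
puts "objects allocated so far: #{stats[:total_allocated_objects]}"

# Individual keys can also be fetched directly, which avoids building
# the full Hash:
puts GC.stat(:heap_live_slots)
```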