[PROF-9926] Fix rpath for linking to libdatadog when loading from extension dir (cherry-pick from 1.x-stable) #3706

Merged

@ivoanjo ivoanjo commented Jun 12, 2024

What does this PR do?

This is a cherry-pick of #3683 which was first applied to the 1.x-stable branch. See that PR for details.

The diff is slightly different in the spec file since 1.x-stable needed a few tweaks to support older Rubies, which 2.x doesn't need. Everything else is the same.

Motivation:

Fix the rpath issue.

Additional Notes:

N/A

How to test the change?

See #3683. An additional integration test for this was added in DataDog/prof-correctness#39.

…ension dir

**What does this PR do?**

This PR is a follow-up to #3582.

In that PR, we fixed loading the profiling native extension so that
it could be loaded from the Ruby extensions directory (see the original
PR for more details).

It turns out this was not enough! Specifically, the customer reported
that they saw the following error:

> Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling
> native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux
> due to libdatadog_profiling.so: cannot open shared object file: No such file or directory

What this message tells us is that we're finding the profiling native
extension BUT it's failing to load BECAUSE the dynamic loader is not
able to find its `libdatadog_profiling.so` dependency.

From debugging the issue with the customer, I suspect that what
we're seeing here is a repeat of #2067 / #2125: the paths where the
profiler is compiled are changed at deployment, and so we also need to
adjust the relative rpath to account for this.

I haven't yet confirmed with the customer that this is their issue,
BUT I was able to reproduce the exact problem if I moved the
installation of the library in the way I mention above (see "how to test
the change", below).
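
To make the "relative rpath" idea concrete, here is a rough sketch (not the code from this PR). It assumes the rpath entries use the dynamic loader's `$ORIGIN` token, which expands to the directory of the `.so` being loaded. The `relative_rpath` helper and the `root` prefix are hypothetical, made up for illustration; the directory names below `root` are the ones from the reproduction further down.

```ruby
require "pathname"

# Hypothetical helper (NOT the gem's actual API): builds an rpath entry that is
# relative to the directory of the .so being loaded, via the $ORIGIN token.
def relative_rpath(from_dir:, libdatadog_lib_dir:)
  relative = Pathname.new(libdatadog_lib_dir).relative_path_from(Pathname.new(from_dir))
  "$ORIGIN/#{relative}"
end

# Assumed example root; everything below it mirrors the layout in this PR.
root = "/working/rpathtest/vendor/bundle/ruby/3.2.0"
libdatadog_lib_dir =
  "#{root}/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/" \
  "x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib"

# The extension may be loaded from either of these two folders, and each one
# sits at a different depth relative to the libdatadog gem.
gem_lib_dir = "#{root}/gems/datadog-2.0.0.rc1/lib"
extensions_dir = "#{root}/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1"

rpaths = [gem_lib_dir, extensions_dir].map do |dir|
  relative_rpath(from_dir: dir, libdatadog_lib_dir: libdatadog_lib_dir)
end

puts rpaths
```

The two entries differ only in how many `..` segments they need, which is why an rpath computed for the gem's `lib` folder alone does not help when the extension is loaded from the extensions directory.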

**Motivation:**

Fix this weird corner case that made the profiler not load.

**Additional Notes:**

This is a really really weird corner case, so I'm happy to further
describe what the issue is if my description above + the comments in the
code are still too cryptic to understand.

**How to test the change?**

I've added test code for the helper, but actually validating the whole
rpath thing is a bit annoying.

Here's how I triggered the issue myself, and then used it to validate
the fix:

```
 # Build fixed gem into folder, will be used later
$ bundle exec rake build
datadog 2.0.0.rc1 built to pkg/datadog-2.0.0.rc1.gem.

 # Open a clean Ruby docker installation
$ docker run --network=host -ti -v `pwd`:/working ruby:3.2.2-bookworm /bin/bash

 # I've created a minimal test gemfile ahead of time
/working/rpathtest# cat gems.rb
source 'https://rubygems.org'

gem 'datadog'
 # Tell bundler to install the gem into a folder
/working/rpathtest# bundle config set --local path 'vendor/bundle'
/working/rpathtest# bundle install

 # Confirm profiler works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Now let's simulate the native extension being loaded from the
 # extensions directory:
/working/rpathtest# find | grep '\.so$' | grep datadog
./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_loader.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so
./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux-musl/libdatadog-x86_64-alpine-linux-musl/lib/libdatadog_profiling.so
./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so
/working/rpathtest# rm ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so  ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so

 # Confirm profiler still works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Now let's simulate the folders being moved (the issue being fixed):
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor/bundle"
 # Update this to vendor2...
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor2/bundle"
 # and move the folder
/working/rpathtest# mv vendor/ vendor2

 # Now we've triggered the exact same error message as reported by the
 # customer
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
W, [2024-06-05T15:51:12.488843 #517]  WARN -- datadog: [datadog] Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux due to libdatadog_profiling.so: cannot open shared object file: No such file or directory' at '/working/rpathtest/vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog/profiling/load_native_extension.rb:41:in `<top (required)>''

 # Now let's test the fix. Let's start by recreating the issue:
 # Put the fixed version into the bundler cache...
/working/rpathtest# cp /working/pkg/datadog-2.0.0.rc1.gem vendor2/bundle/ruby/3.2.0/cache/datadog-2.0.0.rc1.gem
 # force bundler to reinstall...
/working/rpathtest# rm -rf vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/
/working/rpathtest# bundle install
 # Force gem to be loaded from extension directory
/working/rpathtest# rm ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so  ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so
 # Confirm it works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Let's now change the vendor folder again:
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor3/bundle"
/working/rpathtest# mv vendor2/ vendor3

 # And it now doesn't fail:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # And extra confirmation that the relative paths are working:
/working/rpathtest# ldd ./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
	libdatadog_profiling.so => /working/rpathtest/./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/../../../../gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so (0x00007ff127c00000)
```
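
For readers without a Docker setup at hand, the effect of moving the folder can be approximated with a simplified model. This is only a sketch: the `resolves?` helper, the shortened directory layout, and the `ext.so` file name are all hypothetical, and it mimics `$ORIGIN` expansion with plain string substitution rather than asking the real dynamic loader.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical check (NOT part of the gem): expands $ORIGIN to the directory of
# the extension and reports whether the library exists at the resulting path.
def resolves?(extension_so, rpath, library = "libdatadog_profiling.so")
  dir = rpath.sub("$ORIGIN", File.dirname(extension_so))
  File.exist?(File.join(dir, library))
end

results = Dir.mktmpdir do |tmp|
  # Shortened stand-in for the vendor/bundle layout shown in the transcript.
  ext_dir = File.join(tmp, "vendor/bundle/extensions/datadog")
  lib_dir = File.join(tmp, "vendor/bundle/gems/libdatadog/lib")
  FileUtils.mkdir_p([ext_dir, lib_dir])
  FileUtils.touch(File.join(lib_dir, "libdatadog_profiling.so"))

  # Both entries are recorded at "build time", before the folder moves.
  absolute_rpath = lib_dir
  relative_rpath = "$ORIGIN/../../gems/libdatadog/lib"

  # Simulate the deployment step that relocates the installation.
  FileUtils.mv(File.join(tmp, "vendor"), File.join(tmp, "vendor2"))
  moved_extension = File.join(tmp, "vendor2/bundle/extensions/datadog/ext.so")

  {
    absolute: resolves?(moved_extension, absolute_rpath),
    relative: resolves?(moved_extension, relative_rpath)
  }
end

p results
```

After the move, the absolute entry no longer resolves while the `$ORIGIN`-relative one still does, which matches the behaviour seen in the transcript once the fix is installed.
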
@ivoanjo ivoanjo requested a review from a team as a code owner June 12, 2024 10:03
@github-actions github-actions bot added the profiling Involves Datadog profiling label Jun 12, 2024
@codecov-commenter

Codecov Report

Attention: Patch coverage is 96.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.11%. Comparing base (bd08222) to head (3979b55).

| Files | Patch % | Lines |
|---|---|---|
| ...datadog/profiling/native_extension_helpers_spec.rb | 95.65% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3706      +/-   ##
==========================================
- Coverage   98.11%   98.11%   -0.01%     
==========================================
  Files        1225     1225              
  Lines       72806    72836      +30     
  Branches     3482     3485       +3     
==========================================
+ Hits        71431    71460      +29     
- Misses       1375     1376       +1     


@ivoanjo ivoanjo merged commit be93f85 into master Jun 12, 2024
167 checks passed
@ivoanjo ivoanjo deleted the ivoanjo/extend-relative-rpath-extensions-folder-master branch June 12, 2024 13:31
@github-actions github-actions bot added this to the 2.2.0 milestone Jun 12, 2024
@TonyCTHsu TonyCTHsu mentioned this pull request Jul 11, 2024