Profiling disabled due to missing extension on Heroku and Elastic Beanstalk (ddtrace 1.1.0) #2067

Closed
theocodes opened this issue Jun 6, 2022 · 16 comments · Fixed by #2125

@theocodes

Hi 👋

Came across the following while attempting to set up profiling for a Rack app:

W, [2022-06-06T10:16:18.000528 #130]  WARN -- ddtrace: [ddtrace] Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load ddtrace_profiling_native_extension.2.7.6_x86_64-linux due to libddprof_ffi.so: cannot open shared object file: No such file or directory' at '/app/vendor/bundle/ruby/2.7.0/gems/ddtrace-1.1.0/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>''

After some digging I can see this (at least the error message) may be caused by the 1.1.0 release, which added libddprof as a dependency. However, it's not clear to me whether this is intended or a bug.

The PR seems to suggest x86_64 Linux is supported, which is exactly what I am using, and yet I get the error.

Would you folks be able to shed some light on it?

Thanks!

@ivoanjo ivoanjo self-assigned this Jun 6, 2022
@ivoanjo ivoanjo added the profiling Involves Datadog profiling label Jun 6, 2022
@ivoanjo
Member

ivoanjo commented Jun 7, 2022

Hey @theocodes thanks for the report and interest in the Profiler!

This is definitely not expected, as indeed x86_64 Linux is fully supported. Ever since the PR you mentioned (#2028) the profiler gets compiled and linked to the libddprof library, which is shipped in the libddprof gem.

The weird part is that the error you shared seems to indicate that the profiler did get successfully compiled and linked during installation (otherwise the error message would be different), so whatever went wrong, it went wrong after ddtrace was installed.

In particular, it looks like the libddprof gem was present at installation time, but seemed to not be found anymore at execution time, hence the libddprof_ffi.so: cannot open shared object file: No such file or directory that you reported.

  • 1️⃣ Could you share the output of running the following command:
    ldd `gem which ddtrace_profiling_native_extension.2.7.6_x86_64-linux.so` ?

    This should help us figure out where the libddprof_ffi.so was expected to be found in your installation.

  • 2️⃣ Could you share a few more details about how your app and its dependencies get installed and deployed? Hopefully I can reproduce the issue on my side as well.

@theocodes
Author

Hi @ivoanjo

Thanks for looking into this!

I'm not sure what went wrong there, but I'm unable to reproduce it now. Everything seems to be working. 🤔

Maybe let's close this issue for now and I'll let you folks know if it happens again?

@ivoanjo
Member

ivoanjo commented Jun 7, 2022

Great! Feel free to reach out if there's anything we can do to help, and happy profiling :)

@Baron-burton

Hey @ivoanjo, I've hit this error myself today.

I get the following output from the command you suggested to run:

linux-vdso.so.1 (0x00007fff69f3e000)
libruby.so.3.1 => not found
libddprof_ffi.so => not found
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1ff6939000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1ff6747000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1ff6a97000)

Currently running Ruby 3.1.2
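
For anyone who wants to check a deployment for the same symptom, here is a minimal Ruby sketch that extracts the unresolved libraries from `ldd` output like the one above. The `missing_libraries` helper is hypothetical (it is not part of ddtrace), and the sample text is copied from this comment.

```ruby
# Hypothetical helper: return the names of the libraries that `ldd`
# reported as "not found".
def missing_libraries(ldd_output)
  ldd_output.each_line.filter_map do |line|
    # Lines look like "name => target"; some lines have no " => " at all.
    name, target = line.strip.split(' => ', 2)
    name if target&.start_with?('not found')
  end
end

ldd_output = <<~OUT
  linux-vdso.so.1 (0x00007fff69f3e000)
  libruby.so.3.1 => not found
  libddprof_ffi.so => not found
  libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1ff6939000)
  libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1ff6747000)
  /lib64/ld-linux-x86-64.so.2 (0x00007f1ff6a97000)
OUT

puts missing_libraries(ldd_output)
```

On a real system the input would presumably come from running `ldd` against the native extension file, as in the command suggested earlier in the thread.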

@theocodes theocodes reopened this Jun 8, 2022
@Baron-burton

I've tested reverting ddtrace back to 1.0.0 and this gets rid of the errors.

I mentioned that we're running Ruby 3.1.2 but we're also running Rails 7.0.3

@ivoanjo
Member

ivoanjo commented Jun 9, 2022

Thanks @Baron-burton for the extra info. Yes, this issue was the result of a new change in 1.1, but obviously not intended to break ;)

The output you shared is quite interesting/helpful, since it does confirm that whatever paths were observed during installation to reach libruby.so.3.1 and libddprof_ffi.so don't seem to be there anymore.

So I can investigate further, could you provide the output of the following:

  • 1️⃣ ldd `which ruby`
    Usually Ruby is installed so that the ruby command is just a tiny executable that is linked to libruby.so, so this should print where it can be found

  • 2️⃣ gem contents libddprof
    This will show where the libddprof gem (a dependency of ddtrace) files are installed in your system

  • 3️⃣ strings `gem which ddtrace_profiling_native_extension.3.1.2_x86_64-linux.so` | grep -e "lib\(ruby\|ddprof\)"
    This command may need to be tweaked to your exact Ruby version; it will print the paths that were used when installing the profiling native extension.

And a couple of questions:

  • 4️⃣ Do you happen to be using a public Ruby build or docker image that you could share? That would help a lot in reproducing this issue

  • 5️⃣ If the above isn't possible, could you share some details on how you're installing your dependencies and deploying your application?

Thanks again, and I'll get to the bottom of this issue soon :)

@Baron-burton

Hey @ivoanjo, thanks for getting back to me

1️⃣ ldd `which ruby`

linux-vdso.so.1 (0x00007fff9ebfe000)
libruby.so.3.1 => not found
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fb4d524d000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb4d522a000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb4d5220000)
libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007fb4d519c000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb4d5194000)
libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007fb4d5159000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb4d500a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb4d4e18000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb4d5276000)

2️⃣ gem contents libddprof

/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/lib/libddprof.rb
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/lib/libddprof/version.rb
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux-musl/libddprof-x86_64-alpine-linux-musl/LICENSE
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux-musl/libddprof-x86_64-alpine-linux-musl/LICENSE-3rdparty.yml
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux-musl/libddprof-x86_64-alpine-linux-musl/NOTICE
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux-musl/libddprof-x86_64-alpine-linux-musl/include/ddprof/ffi.h
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux-musl/libddprof-x86_64-alpine-linux-musl/lib/libddprof_ffi.so
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux-musl/libddprof-x86_64-alpine-linux-musl/lib/pkgconfig/ddprof_ffi_with_rpath.pc
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/LICENSE
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/LICENSE-3rdparty.yml
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/NOTICE
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/include/ddprof/ffi.h
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/lib/libddprof_ffi.so
/app/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/lib/pkgconfig/ddprof_ffi_with_rpath.pc

3️⃣ strings `gem which ddtrace_profiling_native_extension.3.1.2_x86_64-linux.so` | grep -e "lib\(ruby\|ddprof\)"

libruby.so.3.1
libddprof_ffi.so
/tmp/build_9b486d62/vendor/ruby-3.1.2/lib:/tmp/build_9b486d62/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/lib/pkgconfig/../../lib:${ORIGIN}/../lib

4️⃣ Ruby installation

We rely on Heroku to install our Ruby version. This is done by simply specifying the version in our Gemfile; Heroku picks it up from there and installs the required version.

@ivoanjo
Member

ivoanjo commented Jun 9, 2022

This is great! Clearly the gems are in a different folder during installation:

  • libruby.so.3.1 seems to be in /tmp/build_9b486d62/vendor/ruby-3.1.2/lib
  • libddprof_ffi.so is in /tmp/build_9b486d62/vendor/bundle/ruby/3.1.0/gems/libddprof-0.6.0.1.0-x86_64-linux/vendor/libddprof-0.6.0/x86_64-linux/libddprof-x86_64-unknown-linux-gnu/lib/pkgconfig/../../lib

But afterwards things shift around. I think I have enough to work on a fix and report back, please hang tight :)
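
To illustrate why the move breaks loading, here is a rough Ruby sketch that expands each entry of an rpath string (like the one printed in 3️⃣) and reports whether the directory still exists. The `rpath_status` and `simulate_relocation` helpers and all directory names are hypothetical stand-ins. The simulation only mimics a build folder being renamed after installation and is not how Heroku actually works internally.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: expand every entry of an rpath string and report
# whether the directory it points to exists. `origin` stands in for the
# directory holding the native extension, which is what the dynamic loader
# substitutes for $ORIGIN / ${ORIGIN}.
def rpath_status(rpath, origin)
  rpath.split(':').map do |entry|
    expanded = File.expand_path(entry.gsub(/\$\{ORIGIN\}|\$ORIGIN/) { origin })
    [entry, Dir.exist?(expanded)]
  end
end

# Simulate a build directory that is renamed after installation.
# The directory names are made up for the demo.
def simulate_relocation
  Dir.mktmpdir do |root|
    build_dir = File.join(root, 'build')
    app_dir = File.join(root, 'app')
    FileUtils.mkdir_p(File.join(build_dir, 'deps/lib'))
    FileUtils.mkdir_p(File.join(build_dir, 'ext'))

    # One absolute entry recorded at build time, one relative to the extension.
    rpath = "#{File.join(build_dir, 'deps/lib')}:${ORIGIN}/../deps/lib"

    FileUtils.mv(build_dir, app_dir)
    rpath_status(rpath, File.join(app_dir, 'ext'))
  end
end

simulate_relocation.each do |entry, found|
  puts "#{found ? 'ok     ' : 'missing'} #{entry}"
end
```

In this simulation the absolute entry stops resolving once the folder is renamed, while the entry expressed relative to the extension's own location keeps working.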

@Baron-burton

Thanks so much @ivoanjo 💪

@ivoanjo ivoanjo changed the title from "Profiling disabled due to missing extension" to "Profiling disabled due to missing extension on Heroku (ddtrace 1.1.0)" Jun 13, 2022
@ivoanjo
Member

ivoanjo commented Jun 13, 2022

Just a quick update that I was able to reproduce this issue. It happens due to the way Heroku builds gems during deployment and then moves them to a different folder afterwards.

For now I recommend going back to ddtrace 1.0.0 if you need profiling. I am working on a fix for this issue, which I expect will be in the next ddtrace release (1.2.0).

@seuros
Contributor

seuros commented Jun 30, 2022

Do we have a timeline for when 1.2.0 will be released?

Btw: this issue happens on AWS Elastic Beanstalk too.

@ivoanjo
Member

ivoanjo commented Jun 30, 2022

Thanks for the patience! We expect to have it out within the next two weeks. Fixing this and getting 1.2.0 released is at the top of the priority list.

> Btw: this issue happens on AWS Elastic Beanstalk too.

Interesting, I did not know that! I've been able to reproduce the issue on Heroku, and I'll additionally validate that the fix also works on Elastic Beanstalk.

@ivoanjo ivoanjo changed the title from "Profiling disabled due to missing extension on Heroku (ddtrace 1.1.0)" to "Profiling disabled due to missing extension on Heroku and Elastic Beanstalk (ddtrace 1.1.0)" Jun 30, 2022
ivoanjo added a commit that referenced this issue Jul 5, 2022
…anstalk

As reported in #2067, in these environments ddtrace (and libddprof)
are moved after installation, which broke linking from the
profiling native extension to libddprof.

As a fix/"workaround", we additionally add the relative path
between both gems while linking; see the comments on the
`.libddprof_folder_relative_to_native_lib_folder` helper
for more details and how this works.

Note that the key word above is **additionally** -- i.e., we're
adding more paths in which to find libddprof, and keeping
the existing absolute path, so this should not impact
any setups where things were already working fine.

Big and special thanks to @sanchda for brainstorming with me
on this issue.

Fixes #2067
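
As a rough illustration of the idea in the commit message above (a sketch, not the actual `.libddprof_folder_relative_to_native_lib_folder` helper), the path between the two gems can be computed with `Pathname#relative_path_from` and prefixed with `$ORIGIN`. The `origin_relative_rpath` name and the example directories are assumptions made for the demo.

```ruby
require 'pathname'

# Hypothetical helper, not ddtrace's implementation: build an rpath entry
# that locates `library_dir` relative to the directory containing the native
# extension (`extension_dir`). The dynamic loader replaces $ORIGIN with the
# directory of the object being loaded.
def origin_relative_rpath(extension_dir, library_dir)
  relative = Pathname.new(library_dir).relative_path_from(Pathname.new(extension_dir))
  File.join('$ORIGIN', relative.to_s)
end

# Illustrative layouts: the same two gems under a build-time prefix and
# under a run-time prefix.
during_build = origin_relative_rpath(
  '/tmp/build/vendor/bundle/gems/ddtrace/lib',
  '/tmp/build/vendor/bundle/gems/libddprof/lib'
)
at_runtime = origin_relative_rpath(
  '/app/vendor/bundle/gems/ddtrace/lib',
  '/app/vendor/bundle/gems/libddprof/lib'
)

puts during_build
puts at_runtime
```

Both calls produce the same entry, which is why a relative path keeps working when both gems are moved together to a new parent folder.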
@ivoanjo
Member

ivoanjo commented Jul 7, 2022

Thanks folks for the patience so far! I've merged in a fix for this, and the plan is to release it as part of ddtrace 1.2.0 next week.

@ivoanjo
Member

ivoanjo commented Jul 11, 2022

👋 @Baron-burton @seuros the fix for this issue has been released in v1.2.0.
Thanks for the help with the investigation! 🙇

@seuros
Contributor

seuros commented Jul 13, 2022

@ivoanjo: now with the update I get Unable to report profile. Cause: wrong argument type nil (expected String) Location: /var/cache/bundle/ruby/3.0.0/gems/ddtrace-1.2.0/lib/datadog/profiling/http_transport.rb:115:in `_native_do_export'

The same settings worked in 1.0.0.

@ivoanjo
Member

ivoanjo commented Jul 13, 2022

@seuros damn, big apologies for disappointing you twice! I've been reviewing the code and have a few suspicions; I've opened #2151 to track this issue.

ivoanjo added a commit that referenced this issue Jun 5, 2024
…ension dir

**What does this PR do?**

This PR is a follow-up to #3582.

In that PR, we fixed loading the profiling native extension so that
it could be loaded from the Ruby extensions directory (see the original
PR for more details).

It turns out this was not enough! Specifically, the customer reported
that they saw the following error

> Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling
> native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux
> due to libdatadog_profiling.so: cannot open shared object file: No such file or directory

Specifically, what this message tells us is that we're finding the
profiling native extension BUT it's failing to load BECAUSE the dynamic
loader is not able to find its `libdatadog_profiling.so` dependency.

From debugging the issue with the customer, I suspect that what
we're seeing here is a repeat of #2067 / #2125, that is, the
paths where the profiler is compiled are changed at deployment, and
so we also need to adjust the relative rpath to account for this.

I haven't yet confirmed with the customer that this is their issue,
BUT I was able to reproduce the exact problem if I moved the
installation of the library in the way I mention above (see "how to test
the change", below).

**Motivation:**

Fix this weird corner case that made the profiler not load.

**Additional Notes:**

This is a really really weird corner case, so I'm happy to further
describe what the issue is if my description above + the comments in the
code are still too cryptic to understand.

**How to test the change?**

I've added test code for the helper, but actually validating the whole
rpath thing is a bit annoying.

Here's how I triggered the issue myself, and then used it to validate
the fix:

```
 # Build fixed gem into folder, will be used later
$ bundle exec rake build
datadog 2.0.0.rc1 built to pkg/datadog-2.0.0.rc1.gem.

 # Open a clean Ruby docker installation
$ docker run --network=host -ti -v `pwd`:/working ruby:3.2.2-bookworm /bin/bash

 # I've created a minimal test gemfile ahead of time
/working/rpathtest# cat gems.rb
source 'https://rubygems.org'

gem 'datadog'
 # Tell bundler to install the gem into a folder
/working/rpathtest# bundle config set --local path 'vendor/bundle'
/working/rpathtest# bundle install

 # Confirm profiler works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Now let's simulate the native extension being loaded from the
 # extensions directory:
/working/rpathtest# find | grep \.so$ | grep datadog
./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_loader.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so
./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux-musl/libdatadog-x86_64-alpine-linux-musl/lib/libdatadog_profiling.so
./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so
/working/rpathtest# rm ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so  ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so

 # Confirm profiler still works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Now let's simulate the folders being moved (the issue being fixed):
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor/bundle"
 # Update this to vendor2...
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor2/bundle"
 # and move the folder
/working/rpathtest# mv vendor/ vendor2

 # Now we've triggered the exact same error message as reported by the
 # customer
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
W, [2024-06-05T15:51:12.488843 #517]  WARN -- datadog: [datadog] Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux due to libdatadog_profiling.so: cannot open shared object file: No such file or directory' at '/working/rpathtest/vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog/profiling/load_native_extension.rb:41:in `<top (required)>''

 # Now let's test the fix. Let's start by recreating the issue:
 # Put the fixed version into the bundler cache...
/working/rpathtest# cp /working/pkg/datadog-2.0.0.rc1.gem vendor2/bundle/ruby/3.2.0/cache/datadog-2.0.0.rc1.gem
 # force bundler to reinstall...
/working/rpathtest# rm -rf vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/
/working/rpathtest# bundle install
 # Force gem to be loaded from extension directory
/working/rpathtest# rm ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so  ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so
 # Confirm it works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Let's now change the vendor folder again:
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor3/bundle"
/working/rpathtest# mv vendor2/ vendor3

 # And it now doesn't fail:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # And extra confirmation that the relative paths are working:
/working/rpathtest# ldd ./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
	libdatadog_profiling.so => /working/rpathtest/./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/../../../../gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so (0x00007ff127c00000)
```
ivoanjo added a commit that referenced this issue Jun 12, 2024
…ension dir