Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use library paths instead of -Lnative/-lstatic when linking native dependencies #841

Merged
merged 26 commits into from
Jul 22, 2021

Conversation

krasimirgg
Copy link
Collaborator

@krasimirgg krasimirgg commented Jul 16, 2021

This fixes #840 by updating the rules to use library paths instead of -Lnative/-lstatic when linking native dependencies.

krasimirgg and others added 17 commits July 9, 2021 14:48
Fixes static library naming issues that cause the build of a rust_binary that
depends on a cc_library that itself depends on a rust_library to fail.

Consider a rust_binary that depends on a cc_library that itself depends
on a rust_library:

```
.
├── bin.rs
├── BUILD
└── lib.rs
```

```rust
// bin.rs
fn main() {
    println!("hi");
}
```

```rust
// lib.rs: empty
```

```bazel
load("@rules_rust//rust:defs.bzl", "rust_benchmark", "rust_binary", "rust_library")

rust_library(
    name = "rust_lib",
    srcs = ["lib.rs"],
)

cc_library(
    name = "cc_lib",
    srcs = [],
    deps = [":rust_lib"],
)

rust_binary(
    name = "bin",
    srcs = ["bin.rs"],
    deps = [":cc_lib"],
)
```

Running `bazel build //project:bin` runs into linker errors like:

```
/usr/bin/ld.gold: error: cannot find -lrust_lib
/usr/bin/ld.gold: error: cannot find -lrustc_std_workspace_alloc-1ff59d4f23b10626
```

This is because the linker expects the static library corresponding to `-lname`
to be named `libname.a`. Also Bazel CC link actions get confused by static
library names that have multiple dots, e.g., end in "libcrate.rlib.a", so
updated the scripts to produce them as "libcrate-rlib.a" instead.
Fixes an issue that causes the build of a rust_binary that depends on a
cc_library that depends on a rust_library to fail.

Consider a rust_binary that depends on a cc_library that itself depends
on a rust_library:

```
.
├── bin.rs
├── BUILD
└── lib.rs
```

```rust
// bin.rs
fn main() {
    println!("hi");
}
```

```rust
// lib.rs: empty
```

```bazel
load("@rules_rust//rust:defs.bzl", "rust_benchmark", "rust_binary", "rust_library")

rust_library(
    name = "rust_lib",
    srcs = ["lib.rs"],
)

cc_library(
    name = "cc_lib",
    srcs = [],
    deps = [":rust_lib"],
)

rust_binary(
    name = "bin",
    srcs = ["bin.rs"],
    deps = [":cc_lib"],
)
```

Running `bazel build //project:bin` runs into linker errors like:

/usr/bin/ld.gold: error: bazel-out/k8-fastbuild/bin/external/rust_linux_x86_64/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-5f5ff14665a8d5c5-rlib.a(panic_unwind-5f5ff14665a8d5c5.panic_unwind.p9ngf95o-cgu.0.rcgu.o): multiple definition of '__rust_panic_cleanup'
          /usr/bin/ld.gold: bazel-out/k8-fastbuild/bin/external/rust_linux_x86_64/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_abort-c6166278e71d9daf-rlib.a(panic_abort-c6166278e71d9daf.panic_abort.4p8k2zko-cgu.0.rcgu.o): previous definition here

That's because we're passing both panic_unwind and panic_abort to the linker.
The standard library by default depends on panic_unwind. This updates the rules
to prefer panic_unwind. It would be nice to have an option for the user to
pick the panic runtime similarly to how it could be done upstream:
https://rustc-dev-guide.rust-lang.org/panic-implementation.html#step-2-the-panic-runtime
I'll add a additional test case there.
This should address the Windows buildbot failure.
@google-cla google-cla bot added the cla: yes label Jul 16, 2021
@krasimirgg krasimirgg changed the title linking a rust_binary that depends on two cc_libraries with the same name Use library paths instead of -Lnative/-lstatic when linking native dependencies Jul 20, 2021
@krasimirgg krasimirgg marked this pull request as ready for review July 20, 2021 10:49
@krasimirgg krasimirgg requested a review from hlopko July 20, 2021 10:49
Copy link
Member

@hlopko hlopko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One question and I'll merge :)

rust/private/rustc.bzl Show resolved Hide resolved
Copy link
Member

@hlopko hlopko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@hlopko
Copy link
Member

hlopko commented Jul 21, 2021

Oh, CI doesn't like the dylibs change :/

Trying out to build them with -Clink-arg produces test binaries that
fail to find the dynamic library at runtime, e.g;
https://buildkite.com/bazel/rules-rust-rustlang/builds/3850#d19409bc-a30a-4944-9f81-e5da5e9d06a6

/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/ec321eb2cc2d0f8f91b676b6d4c66c29/sandbox/linux-sandbox/1458/execroot/rules_rust/bazel-out/k8-fastbuild/bin/external/examples/ffi/rust_calling_c/matrix_dylib_test.launcher.runfiles/rules_rust/external/examples/ffi/rust_calling_c/matrix_dylib_test:
error while loading shared libraries:
bazel-out/k8-fastbuild/bin/_solib_k8/_U@examples_S_Sffi_Srust_Ucalling_Uc_Sc_Cnative_Umatrix_Uso___Uexternal_Sexamples_Sffi_Srust_Ucalling_Uc_Sc/libnative_matrix_so.so:
cannot open shared object file: No such file or directory

Added a TODO to investigate.
@krasimirgg
Copy link
Collaborator Author

Oh, CI doesn't like the dylibs change :/

I'm sorry, I tried it out, noticed the failure and then left the PR in a broken state for a while.
So, as shown by the CI, trying to use this pattern for dylibs causes tests at runtime to fail to find the dynamic libs with errors like this one:

/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/ec321eb2cc2d0f8f91b676b6d4c66c29/sandbox/linux-sandbox/1458/execroot/rules_rust/bazel-out/k8-fastbuild/bin/external/examples/ffi/rust_calling_c/matrix_dylib_test.launcher.runfiles/rules_rust/external/examples/ffi/rust_calling_c/matrix_dylib_test: error while loading shared libraries: bazel-out/k8-fastbuild/bin/_solib_k8/_U@examples_S_Sffi_Srust_Ucalling_Uc_Sc_Cnative_Umatrix_Uso___Uexternal_Sexamples_Sffi_Srust_Ucalling_Uc_Sc/libnative_matrix_so.so: cannot open shared object file: No such file or directory

I'm still interested in pursuing this change for dylibs, but I'll need to dig deeper in the dynamic linking process to figure this out. I left a TODO and will try resolve it in a follow-up PR.

@hlopko hlopko merged commit 4da6858 into bazelbuild:main Jul 22, 2021
@krasimirgg krasimirgg deleted the same_cc_library_names branch July 22, 2021 12:35
@aaliddell
Copy link

Has this changed the order that args get passed to the linker to produce the final executable?

When using rules_rust after this commit, I get link failures when using a rust_binary that transitively depends on a cc_library which itself has dependencies. The cc_library has dependencies on standard libraries, with linker args like -lpthread etc, but rather than the link args being interleaved between objects I instead get -lpthread repeated consecutively many times in the args, followed by the object .a paths all at the end. This causes linking to fail, as the objects from the cc_library need to come before their deps to resolve the symbols, since the order of linking args matters.

It appears that the link opts and the objects they come from are losing order relative to each other. e.g rather than -lobj_a -lobj_a_dep -lobj_b -lobj_b_dep -lobj_c -lobj_c_dep we instead get -lobj_a_dep -lobj_b_dep -lobj_c_dep path/to/obj_a path/to/obj_b path/to/obj_c, which breaks the required linking order.

@krasimirgg
Copy link
Collaborator Author

I did not intend to change the order that args get passed to the linker, just how each native dep is passed.

In your example, how are the dependencies of the cc_library on standard libraries declared? If you can share a sample failing example that would be great. In my setup stuff like -lpthread is handled via a rust_toolchain.stdlib_linkflags declaration, like in this example:

stdlib_linkflags = ["-lpthread", "-ldl"],

Also which linker are you using? I might have instructed my linker to ignore backreferences to the standard libraries via –warn-backrefs-exclude and hence missed this.

@aaliddell
Copy link

So the example I found this with was pretty complex, using the grpc-sys crate, which wraps grpc C/C++ project which depends on pthread etc. The error I was getting was that it couldn't even find the symbol for stat.

A build log is linked below, from this build. The rust specific failure is right at the bottom.

However, since this is such a weird setup, I'll see if I can create a small reproduction that doesn't depend on such big projects.

Build log: rules-proto-grpc-rules-proto-grpc_build_1159_build-and-test-all-on-ubuntu-18-dot-04-openjdk-11.log

@bsilver8192
Copy link
Contributor

+1 for this breaking builds using ld. My situation has -pthread and -lstdc++ coming from the C++ toolchain, and I'm using cxx via autocxx. That means I have this dependency order:

  1. My Rust code
  2. Generated C++ wrappers
  3. C++ code from cxx (this might be redundant if my Rust->C++ API was more complicated and causes the former item to rely directly on the latter item)
  4. Rust code from cxx

This means C++ code must be sandwiched between Rust code in the link order, or the whole thing has to be wrapped in --start-lib/--end-lib or --start-group/--end-group. I think this is what rustc effectively does with this PR reverted, which I confirmed makes my example build.

I don't see a way to fix it with the approach used by this PR: any way of passing a generic linker argument through rustc is going to put the files either before or after all the Rust libraries. If the .a filenames are a problem, maybe generating a folder with them all symlinked/copied with sanitized filenames would work?

You can see my problematic toolchain at https://github.com/frc971/971-Robot-Code/blob/c34b3ea019461766d70b43b41f509faa6b8001f3/tools/cpp/BUILD#L122. I can work on pulling out a smaller example if that would help, but I think the CI build referenced in dtolnay/cxx#947 is a more convenient example.

For reference, I've got some other lld-based toolchain which do similar things with linker arguments at https://github.com/frc971/971-Robot-Code/tree/1e7c897dc01b85ce773c5334a728057621a486b5/third_party/bazel-toolchain (based on https://github.com/grailbio/bazel-toolchain with some changes I haven't upstreamed yet) which do work with this PR.

@krasimirgg
Copy link
Collaborator Author

+1 for this breaking builds using ld

Thank you for the detailed report! I'm looking into this.

krasimirgg added a commit to krasimirgg/rules_rust that referenced this pull request Feb 11, 2022
krasimirgg added a commit to krasimirgg/rules_rust that referenced this pull request Feb 21, 2022
This reverts rules_rust to use `-Lnative/-lstatic` instead of library paths
per
bazelbuild#840 (comment).
hlopko added a commit that referenced this pull request Feb 22, 2022
Co-authored-by: Marcel Hlopko <hlopko@google.com>
@bsilver8192
Copy link
Contributor

Adding a note here for anybody from the future: #1148 (and related PRs) fix this for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

linking a rust_binary that depends on two cc_libraries with the same name fails
5 participants