Would it be possible for nix to use bindgen instead of depend on libc ? #978

Closed
gnzlbg opened this issue Nov 22, 2018 · 14 comments

@gnzlbg

gnzlbg commented Nov 22, 2018

The values of constants in the system's C headers, the layout of types like structs, etc. vary across kernel versions. The only way to get these always right is to parse the C system headers, and generate Rust bindings for them automatically.

This is something that libc does not do. In libc, the value of a constant is hardcoded to whatever its value is in the kernel version that its Docker containers use.

Would it be possible for nix to do better than libc here?
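
To make the hardcoding concrete, here is a minimal sketch (not code taken from libc) of a value being selected by target alone. The two `O_NONBLOCK` values are the ones in current x86_64 Linux and macOS headers; the function name `o_nonblock` and the `Option` return type are assumptions made for this illustration. Note that nothing in the selection can see which OS version the binary will eventually run on.

```rust
/// Hand-written, per-target constant in the style described above.
/// Returns `None` for targets this sketch does not cover.
pub const fn o_nonblock() -> Option<i32> {
    if cfg!(all(target_os = "linux", target_arch = "x86_64")) {
        // Value from the x86_64 Linux headers.
        Some(0o4000)
    } else if cfg!(target_os = "macos") {
        // Value from the macOS headers.
        Some(0x0004)
    } else {
        // Not covered here; a real crate would need an entry per target.
        None
    }
}
```

The choice is made once, at compile time, from the target triple. If a platform ever changed such a value between releases, this code would have no way to notice.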

@asomers
Member

asomers commented Nov 22, 2018

Nope. This is indeed a big problem. It's even worse in the BSD world than it is in Linux. Unfortunately, Nix can't simply switch to rust-bindgen, because that would break cross-compilation in all of Nix's consumers. It could also theoretically cause runtime breakage in certain situations. If you've found a new example of a Linux structure that changed between two Rust-supported kernel versions, please raise it in libc's issue tracker here: rust-lang/libc#570

@asomers asomers closed this as completed Nov 22, 2018
@posborne
Member

@gnzlbg In many cases, you are compiling once for a variety of systems so you won't know what features are supported until runtime anyway. I think the right way to handle this is to get all system calls exposed and have them bubble up errors from the OS if a syscall/ioctl/whatever isn't supported at runtime.

If you have a specific example of incompatibility between kernel versions as a case study, I would be interested to take a look.
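
A minimal sketch of the "bubble up errors at runtime" idea, assuming the caller has a fallback path available. The helper name `call_with_fallback` is hypothetical, and the two closures stand in for real syscall wrappers. It also assumes that an unsupported call surfaces as `io::ErrorKind::Unsupported`, which is how recent versions of std classify `ENOSYS` on Unix.

```rust
use std::io;

/// Try `primary`; if the OS reports that the operation is unsupported,
/// run `fallback` instead. Any other error is returned unchanged.
pub fn call_with_fallback<T>(
    primary: impl FnOnce() -> io::Result<T>,
    fallback: impl FnOnce() -> io::Result<T>,
) -> io::Result<T> {
    match primary() {
        // The feature is missing on this kernel: decide at runtime.
        Err(e) if e.kind() == io::ErrorKind::Unsupported => fallback(),
        // Success, or an unrelated error that the caller should see.
        other => other,
    }
}
```

With this shape, a binary compiled once can run on several kernel versions and only takes the fallback path where the newer call is actually missing.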

@gnzlbg
Author

gnzlbg commented Nov 26, 2018

On MacOSX the values of some consts change from version to version :/

@asomers
Member

asomers commented Nov 26, 2018

Then submit a PR to remove them from libc. libc can't generate correct code for bindings that change value between versions.

@gnzlbg
Author

gnzlbg commented Nov 26, 2018

We can't remove stuff from libc, we can at best deprecate it, and all platforms do backwards incompatible changes when they do "major" version bumps, which happens every couple of years :/

@asomers
Member

asomers commented Nov 26, 2018

> We can't remove stuff from libc, we can at best deprecate it,

No, libc can remove stuff if it's broken. Buildtime failures are better than runtime failures. But we should discuss it on libc's issue tracker, not Nix's.

> and all platforms do backwards incompatible changes when they do "major" version bumps, which happens every couple of years :/

Not true for well-behaved C libraries. A well-behaved C library never makes backwards incompatible changes. Backwards compatibility is maintained by bumping the .so version (like ncurses did for 6.0), using ELF symbol versioning (like FreeBSD does frequently), or never changing public symbols (like glibc does).

@gnzlbg
Author

gnzlbg commented Nov 26, 2018

The problem is it isn't broken, on some platform version, and people rely on it working on that platform.

@asomers
Member

asomers commented Nov 26, 2018

But libc is used on more than one platform version. And libc is designed to be always cross-compilable. So such a symbol can't be used correctly.

@gnzlbg
Author

gnzlbg commented Nov 26, 2018

The problem remains: they work correctly for the platform they were added to, so removing them is a backwards incompatible change :/ I'm not arguing that this is a nice situation to be in, it isn't, but some constants are very basic :/ platforms just shouldn't break them, but they do :/

@asomers
Member

asomers commented Nov 26, 2018

No, they don't work correctly, because those symbols don't match the environment where the application runs. For example, if the system headers expose a symbol like `#define _POSIX_VERSION 201512` and an application uses that symbol to determine what version of the POSIX standard is supported by its operating system, then it will be wrong whenever it's run on a different version of the OS. Such a symbol simply can't be correctly used by a cross-compiled program.
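
A minimal sketch of that mismatch. The host is modeled as a plain struct rather than a real runtime query such as `sysconf(_SC_VERSION)`, so `Host` and `baked_in_matches` are hypothetical names used only for illustration, and 201512 is simply the figure from the example above.

```rust
/// Value copied from the build machine's headers at compile time.
const BUILD_TIME_POSIX_VERSION: i64 = 201512;

/// Stand-in for the machine the binary actually runs on.
/// A real program would obtain this value from the OS at runtime.
pub struct Host {
    pub posix_version: i64,
}

/// True only when the baked-in constant happens to describe the host.
pub fn baked_in_matches(host: &Host) -> bool {
    BUILD_TIME_POSIX_VERSION == host.posix_version
}
```

For a host reporting 200809 the function returns `false`: the constant still compiles and links, but it describes the build machine, not the machine the program is running on.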

@gnzlbg
Author

gnzlbg commented Nov 27, 2018

You seem to be trying really hard to not understand what I am saying: yes, the symbols are broken on all OS versions except for one, but the people who added them are using them on precisely that OS version, they do work there, and that OS version is the one that libc Docker containers use on CI which is the only OS version that libc officially supports. All other OS versions are not tested.

The logic of your argument, "libc works for me on other OSes and OS versions, therefore libc supports those", is flawed. libc does not support those, libc is not verified to work on those, and if it happens to work for you there, that's pure luck, not design. If libc is designed for anything, the const approach shows that it is designed to work for a single OS version.

That is, removing these symbols from libc would break libc for its users on the single OS version that it supports. Those users would be right to complain about breakage. Any other user complaining that the const is incorrect on some different OS version is pretty much on their own because libc, as designed and implemented, can only support a single OS version, and the versions where the consts are broken are just not supported, at least right now (and they can't easily be supported because libc lacks the tools for that).

Does this mean that the design of libc could be better? Of course, and I agree with you that supporting multiple OS versions is worth it. However, libc 0.2 cannot break existing users for which libc is correct, so its hands are pretty tied. Should we have paid more attention to which APIs we actually allow in libc? Probably, but we didn't, and we have to live with that.

I opened this issue here because libc is just a library, other libraries can choose different better designs, and libstd can already use libraries from crates.io, so it could start using something different than libc in some platforms, or it could switch from libc to something else in the future.

There are a couple of things that, with tied hands, libc actually can do to improve things, and I've put my thoughts on that here. One thing we could do in a backwards compatible way is make libstd actually work on all versions by being extremely careful about which parts of libc libstd uses. This goes in the direction that you proposed in the const dragonfly PR about trying to only export what works everywhere. We can't do that for libc without releasing a breaking version, but we could move such a core to a different library that libc re-exports, and make sure that libstd only uses that. This way, at least libstd would be version independent. People could write their own portable libcs that can interoperate with libc if we move the ctypes somewhere else that is reusable.

@Susurrus
Contributor

Observing this conversation, it does seem like you two are talking past each other. However, I think that both of your points are valid: libc is broken at a fundamental level, and we don't want to use bindgen so we can support cross-compilation.

However, I think there is room for a libc alternative that uses bindgen instead of hard-coded user-added constants like the current libc. This wouldn't support cross-compilation, but should provide the most compatibility with whatever platform it's being built for. I think nix could support that library as well instead of libc, but I don't think any bindgen functionality should exist within nix itself.

@asomers
Member

asomers commented Nov 27, 2018

> You seem to be trying really hard to not understand what I am saying: yes, the symbols are broken on all OS versions except for one, but the people who added them are using them on precisely that OS version, they do work there, and that OS version is the one that libc Docker containers use on CI which is the only OS version that libc officially supports. All other OS versions are not tested.

Actually, libc makes no claim about which OS versions it supports. But since it's included in libstd, it must support at least as many versions as libstd does. Right now libstd supports OSX 10.7+, Linux 2.6.18+, Windows 7+, and unspecified versions of other OSes. https://forge.rust-lang.org/platform-support.html . Nor is libc written for one specific version. Since different symbols are added by different users, it ends up getting a mix: rust-lang/libc#570 (comment)

@gnzlbg
Author

gnzlbg commented Nov 27, 2018

> Actually, libc makes no claim about which OS versions it supports.

Yes, this is a big oversight in the libc docs. I'm of the opinion that whatever is not tested, is not really supported, since we can't guarantee that we don't break it, and chances are that we have broken things multiple times in untested platforms.

> Right now libstd supports OSX 10.7+, Linux 2.6.18+, Windows 7+, and unspecified versions of other OSes. https://forge.rust-lang.org/platform-support.html .

Note that just because this is written down somewhere doesn't mean it's true. As you have realized, it isn't. In particular, the claim that all OSX versions >= 10.7 are supported is extremely suspicious. Some commonly used const values have changed since OSX 10.7, so upholding this claim would mean that libstd would have to know which OSX version is actually being targeted.

What happens in practice is that libc ends up receiving breaking changes, e.g., ones that change the values of consts to those of the latest OSX versions, and since OSX users upgrade often, nobody ends up noticing these (e.g. currently libc OSX build bots all use xcode10 travis images and are tested against OSX 10.13-10.14 only =/).

> libc is broken at a fundamental level and we don't want to use bindgen so we can support cross-compilation.

What we could do instead is guarantee that the "default" targets, e.g., x86_64-unknown-linux-gnu, target the latest version of the Linux kernel (or some LTS of the platform), and add support for a cfg(target_env_version_major, .._minor, .._patch) so that Rust can also target e.g. x86_64-unknown-linux-gnu-3.15.0 (this would require an RFC). Then we'd have to add CI to test libc and libstd against the different versions of each platform.

So essentially what would happen is that different ABI incompatible versions of each platform become different targets, so we can still use libc's approach without requiring rust-bindgen.
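
A rough sketch of what libc-style constants could look like under that scheme. The `target_env_version_major` key does not exist in rustc today; it is the hypothetical cfg from the proposal above. `EXAMPLE_FLAG` and both of its values are made up for illustration and do not correspond to any real header.

```rust
// `unexpected_cfgs` is allowed because the cfg key below is hypothetical
// and no current compiler knows about it.
#[allow(unexpected_cfgs)]
pub mod consts {
    /// Made-up value for a platform whose major version is 3.
    #[cfg(target_env_version_major = "3")]
    pub const EXAMPLE_FLAG: i32 = 0x10;

    /// Made-up value for the "default" target (latest version).
    #[cfg(not(target_env_version_major = "3"))]
    pub const EXAMPLE_FLAG: i32 = 0x20;
}
```

Since no current compiler sets this cfg, only the second definition is compiled. Under the proposal, building for a versioned target would select the first one instead, still without parsing any C headers.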

This would be a lot of work, for an unpredictable amount of gain. Arguably, if this was a real problem, people would already be working on it.

I think the effort would, at least initially, be better spent making libstd as version-independent as possible by ensuring that the tiny part of libc that libstd uses either never changes or by precisely identifying the changes.

> Observing this conversation it does seem like you two are talking past each other.

I'd like to apologize to @asomers , it's my fault that the discussion got heated. I understand the points that you are making, and I do think that they are real problems worth solving.

The reason things got heated on my side is that I got a bit frustrated with how the discussion was going. We have many open and approved PRs to libc with good technical solutions to different problems, but which might never land because they contain breaking changes (most have been open for years). I really want to avoid a solution that ends up just like that.
