archive on Darwin embed date #240014

Open
Et7f3 opened this issue Jun 26, 2023 · 12 comments

Comments

@Et7f3
Contributor

Et7f3 commented Jun 26, 2023

Building this package twice does not produce a bit-for-bit identical result each time, which makes it harder to detect CI breaches. You can read more about this at https://reproducible-builds.org/ .

Fixing bit-by-bit reproducibility also has additional advantages, such as avoiding hard-to-reproduce bugs, making content-addressed storage more effective and reducing rebuilds in such systems.

Steps To Reproduce

% nix repl --file nixpkgs
Welcome to Nix 2.15.1. Type :? for help.

Loading installable ''...
Added 18853 variables.
nix-repl> :b runCommandCC "test" {} "echo 'void a(void){}' > a.c; echo 'void b(void){}' > b.c; echo 'void c(void){}' > c.c; echo 'libbug.a(a.o b.o c.o):' > Makefile; make; cp libbug.a $out"

This derivation produced the following outputs:
  out -> /nix/store/yg3d29db02r1klr3cscw8qadrzkxj4hw-test

nix-repl>
% cat /nix/store/yg3d29db02r1klr3cscw8qadrzkxj4hw-test
!<arch>
#1/20           1687814346  301   30000 100644  44        `
__.SYMDEF SORTE_a#1/12           1687814341  301   30000 100644  612       `
[binary Mach-O object data omitted]

You can see 1687814346 (the time at which I ran the command) appear in the resulting archive. IIRC, I tried a similar example on x86_64-linux and it produced identical archives. So the Darwin ar tool should also be patched either to not save the timestamp or to respect SOURCE_DATE_EPOCH.
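To make it easier to see where that number lives, here is a minimal Python sketch that walks the member headers of an ar archive and reports the timestamp, uid, gid and mode of each member. It assumes the common 60-byte ASCII header layout and the BSD-style `#1/N` long-name convention visible in the output above; the names `ArMember` and `iter_ar_members` are made up for this illustration and are not part of any existing tool.

```python
from typing import Iterator, NamedTuple

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # fixed-size ASCII header in front of every member


class ArMember(NamedTuple):
    name: str
    mtime: int
    uid: int
    gid: int
    mode: str
    size: int


def _num(field: bytes) -> int:
    """Parse a space-padded decimal header field; empty fields count as 0."""
    text = field.decode("ascii").strip()
    return int(text) if text else 0


def iter_ar_members(data: bytes) -> Iterator[ArMember]:
    """Yield the header fields of every member in an ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {offset}")
        name = header[0:16].decode("ascii").rstrip()
        size = _num(header[48:58])
        body = offset + HEADER_SIZE
        if name.startswith("#1/"):
            # BSD long name: the real name is stored in the first N bytes
            # of the member data and is counted in the size field.
            name_len = int(name[3:])
            raw = data[body:body + name_len].rstrip(b"\0")
            name = raw.decode("utf-8", "replace")
        yield ArMember(
            name=name,
            mtime=_num(header[16:28]),
            uid=_num(header[28:34]),
            gid=_num(header[34:40]),
            mode=header[40:48].decode("ascii").strip(),
            size=size,
        )
        # Member data is padded to an even offset.
        offset = body + size + (size % 2)
```

Reading the store path as bytes and passing it to `iter_ar_members` should list one entry per member; any non-zero `mtime`, `uid` or `gid` is a candidate source of non-reproducibility.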

@emilytrau
Member

cc @NixOS/darwin-maintainers

@uri-canva
Contributor

uri-canva commented Jul 6, 2023

Nice to know code I wrote 8 years ago is still relevant: https://github.com/facebook/buck/blob/9c7c421e49f4d92d67321f18c6d1cd90974c77c4/src/com/facebook/buck/cxx/toolchain/objectfile/ObjectFileScrubbers.java.

The logic is pretty simple. Do we have a pattern in nixpkgs for hooks that postprocess files to make them deterministic?
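As a rough illustration of what such a postprocessing step could look like, here is a minimal Python sketch that overwrites the timestamp, uid and gid fields of every ar member header with zeros. It is not Buck's implementation and not an existing nixpkgs hook; the function name `scrub_ar` is hypothetical, and the sketch assumes the standard 60-byte ASCII member header and leaves the mode field and member contents untouched.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # fixed-size ASCII header in front of every member


def scrub_ar(data: bytes) -> bytes:
    """Return a copy of an ar archive with mtime, uid and gid set to 0."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    out = bytearray(data)
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(out):
        if out[offset + 58:offset + 60] != b"`\n":
            raise ValueError(f"bad member header at offset {offset}")
        # Header fields are space-padded decimal strings of fixed width,
        # so each replacement keeps the header exactly 60 bytes long.
        out[offset + 16:offset + 28] = b"0".ljust(12)  # mtime
        out[offset + 28:offset + 34] = b"0".ljust(6)   # uid
        out[offset + 34:offset + 40] = b"0".ljust(6)   # gid
        size = int(bytes(out[offset + 48:offset + 58]).decode("ascii").strip())
        # Skip the member data, which is padded to an even offset.
        offset += HEADER_SIZE + size + (size % 2)
    return bytes(out)
```

Because only fixed-width header fields are rewritten, the archive length and all member offsets stay the same, so the symbol table does not need to be regenerated for this particular change.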

@uri-canva
Contributor

I mean something more specific than just regular hooks.

@uri-canva
Contributor

If not, I guess we can add it as a hook on the Darwin stdenv.

@uri-canva
Contributor

By the way, I wrote a whole bunch of code to make the outputs of Darwin toolchains deterministic as part of Buck, so feel free to cc me on any other relevant issues. I might not have time to do it myself, but I can point you to the relevant parts of the Buck source code / Apple toolchain source code / various docs.

@Et7f3
Contributor Author

Et7f3 commented Jul 6, 2023

I was thinking of patching ld so that it stores a null timestamp directly, instead of postprocessing the output (and maybe forgetting a place where the hook has to be called). Cool to see Buck take care of such details.

@reckenrode
Contributor

Darwin uses llvm-ar since #240433 was merged, which supports zeroing out timestamps and should do so by default. When I execute the test on current staging, the result comes back with zeroed out timestamps as expected:

$ nix repl -f .
Welcome to Nix 2.11.1. Type :? for help.

Loading installable ''...
Added 19591 variables.
nix-repl> :b runCommandCC "test" {} "echo 'void a(void){}' > a.c; echo 'void b(void){}' > b.c; echo 'void c(void){}' > c.c; echo 'libbug.a(a.o b.o c.o):' > Makefile; make; cp libbug.a $out"


This derivation produced the following outputs:
  out -> /nix/store/hbr16f8clm72fd49bms0knzkpznck59c-test

nix-repl>

$  cat /nix/store/hbr16f8clm72fd49bms0knzkpznck59c-test
!<arch>
#1/12           0           0     0     0       36        `
__.SYMDEh_a#1/4            0           0     0     644     508       `
[binary Mach-O object data omitted]

@uri-canva
Contributor

If zeroing the timestamp is all you want, you can set the ZERO_AR_DATE environment variable: https://github.com/opensource-apple/cctools/blob/fdb4825f303fd5c0751be524babd32958181b3ed/ar/archive.c#L323C15-L323C27.

But it's still possible to have non-determinism because of the uid and gid, especially when using the Nix daemon. I don't think non-determinism because of the file mode is an issue, since we don't support a Nix store on weird filesystems that don't return sensible modes.

Also, if we use an old version of ar, we might have to work around a bug that was fixed back in Xcode 8, but I doubt even our oldest Apple SDKs / stdenvs are that old: facebook/buck@55d4678.

@reckenrode
Contributor

In my nix repl -f . invocation, . is my GHC testing branch, which is staging plus #241692 and some patches to fix GHC with the LLVM 16 stdenv.

@reckenrode
Contributor

But it's still possible to have non determinism because of the uid and gid, especially when using the nix daemon. I don't think non-determinism because of the file mode is an issue since we don't support nix store on weird filesystems that don't return sensible modes.

I can’t speak to GNU ar, but llvm-ar zeroes these out as well.

@uri-canva
Contributor

@reckenrode
Contributor

Note that Darwin does not use llvm-ranlib because it’s not a drop-in replacement. It should work, but some build processes (like Qt’s) would have to be patched not to assume Apple’s ranlib.
