binary cache 404 in dependency fetch causes loop #245

Closed
roberth opened this issue Aug 26, 2020 · 6 comments

@roberth
Member

roberth commented Aug 26, 2020

Description

The log shows the same two dependencies being fetched over and over: specifically, the subtree of a dependent of the missing path.

P -> A -> B

x -> y: x needs y
P: what the agent is trying to build
A: a path that is in a binary cache
B: a path that is not in any binary cache

It will keep trying to fetch A and B.

I'm investigating whether this could be related to broken C++ exception handling (see https://gitlab.haskell.org/ghc/ghc/-/issues/11829).

Agent 0.6 may be unaffected, but other fixes have not been backported to it and it does not have live logs.

To Reproduce

  1. Have a path that is missing from the binary cache.
  2. Build something that depends on it, on a darwin agent.

Expected behavior

The exception is caught and the dependency derivation is built as a fallback.

Logs

querying info about '/nix/store/fv3bh16qqbh78j1m11dgirlnilaizxvh-bimap-0.3.3' on 'https://cache.nixos.org'
downloading 'https://cache.nixos.org/fv3bh16qqbh78j1m11dgirlnilaizxvh.narinfo'
querying info about '/nix/store/fmd681801i64b5869b7rrsqyd1l76kc9-bimap-0.3.3-doc' on 'https://cache.nixos.org'
downloading 'https://cache.nixos.org/fmd681801i64b5869b7rrsqyd1l76kc9.narinfo'
querying info about '/nix/store/fmd681801i64b5869b7rrsqyd1l76kc9-bimap-0.3.3-doc' on 'some-private-cache'

Only fv3... exists, in the private cache.
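
For anyone who wants to check by hand which paths a cache actually has, the log lines above suggest that the narinfo URL is the cache URL plus the hash part of the store path. Here is a minimal sketch of that mapping, with an optional lookup. The helper names `narinfo_url` and `has_narinfo` are made up for illustration, and it assumes a cache served over plain HTTP(S) that answers 404 for a missing narinfo; private caches that need authentication are not handled.

```python
import urllib.error
import urllib.request


def narinfo_url(store_path: str, cache_url: str = "https://cache.nixos.org") -> str:
    """Build the narinfo URL for a store path, following the pattern in the log."""
    # "/nix/store/<hash>-<name>" -> "<hash>-<name>" -> "<hash>"
    base_name = store_path.rstrip("/").rsplit("/", 1)[-1]
    store_hash = base_name.split("-", 1)[0]
    return f"{cache_url.rstrip('/')}/{store_hash}.narinfo"


def has_narinfo(store_path: str, cache_url: str = "https://cache.nixos.org") -> bool:
    """Return True if the cache serves a narinfo for the path, False on a 404.

    Hypothetical helper: it performs a real network request when called.
    """
    try:
        with urllib.request.urlopen(narinfo_url(store_path, cache_url), timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other HTTP errors are not evidence of a missing path


if __name__ == "__main__":
    print(narinfo_url("/nix/store/fv3bh16qqbh78j1m11dgirlnilaizxvh-bimap-0.3.3"))
```

Running `has_narinfo` on every output of the affected derivation should show which one is the missing path that triggers the loop.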

Platform / Version

darwin, 0.7.4

@roberth roberth added the bug Something isn't working label Aug 26, 2020
@roberth roberth self-assigned this Aug 26, 2020
@roberth roberth changed the title from "darwin: 404 in dependency fetch causes loop" to "binary cache 404 in dependency fetch causes loop" Aug 26, 2020
@roberth
Member Author

roberth commented Aug 26, 2020

This can be reproduced with Nix alone, so it's probably a bug in Nix's goal state machine in build.cc, triggered by an incomplete output.
Take a derivation with two outputs, out and doc, where out references doc.
Both have been built, but doc has been removed from the cache or never needed to be uploaded (for example, because the cache is only used for binaries).

  1. Something needs out to be valid.
  2. The out narinfo is fetched; out depends on doc.
  3. The doc narinfo is fetched; it is missing.
  4. doc needs to be built, so build the drv.
  5. The drv may be substitutable, so fetch out. Go to step 2.
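
To make the cycle easier to see, here is a toy model of the steps above. It is not Nix's code and does not reflect how build.cc is structured; `simulate`, `cache` and `references` are hypothetical names, and the round limit exists only so the sketch terminates.

```python
from typing import Dict, List, Set


def simulate(cache: Set[str], references: Dict[str, List[str]],
             wanted: str, max_rounds: int = 3) -> List[str]:
    """Replay steps 2-5 for `wanted` and return a trace of what happened.

    Toy model only: `cache` is the set of outputs that have a narinfo, and
    `references` maps an output to the outputs it depends on.
    """
    trace: List[str] = []
    for _ in range(max_rounds):
        # Step 2: the narinfo of the wanted output is fetched.
        trace.append(f"fetch narinfo: {wanted}")
        if wanted not in cache:
            # Nothing to substitute at all, so a plain build is the only option.
            trace.append(f"build derivation for: {wanted}")
            return trace
        missing = [dep for dep in references.get(wanted, []) if dep not in cache]
        if not missing:
            # Every reference is available, so substitution succeeds.
            trace.append(f"substitute: {wanted}")
            return trace
        # Step 3: a referenced output has no narinfo.
        for dep in missing:
            trace.append(f"fetch narinfo: {dep} (missing)")
        # Steps 4 and 5: fall back to the derivation, which first tries to
        # substitute its outputs again, leading back to step 2.
        trace.append(f"derivation may be substitutable, retry: {wanted}")
    trace.append(f"stopped after {max_rounds} rounds without progress")
    return trace


if __name__ == "__main__":
    for line in simulate({"out"}, {"out": ["doc"]}, "out"):
        print(line)
```

With only out in the cache, every round repeats the same two fetches, which matches the log. Adding doc to `cache` makes the first round succeed.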

@jappeace

jappeace commented Sep 25, 2020

Did you report it to Nix? We saw this behavior in our CI as well, on agent version 0.7.3.

Also, how do we work around it?

@roberth
Member Author

roberth commented Sep 25, 2020

Here's the Nix issue: NixOS/nix#3964. So far I've assumed it was a one-off, but that doesn't seem to be the case.

To work around the issue, you could build it manually without the "broken" cache:

nix-store -r --option substituters https://cache.nixos.org /nix/store/....drv

If you have a single agent per architecture, you could run it there and have the agent pick up the output when you click Rebuild.
Alternatively, you could upload the outputs to the cache with cachix push (or nix-copy-closure for those who don't use Cachix).

@roberth
Member Author

roberth commented Sep 25, 2020

@jappeace Could you provide the derivation path? You can send it to support@hercules-ci.com if you prefer.

@jappeace

jappeace commented Sep 25, 2020

I sent an email 👌

@roberth
Member Author

roberth commented Sep 29, 2020

Update: Domen has improved Cachix to better avoid this bug.

I'm closing this in favor of NixOS/nix#3964, but feel free to comment or contact support if this recurs.

@roberth roberth closed this as completed Sep 29, 2020