Create content hashes of upstream crates and detect when they change #10207

Closed
brson opened this issue Oct 31, 2013 · 4 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries
Comments

@brson
Contributor

brson commented Oct 31, 2013

This proposal is a vast simplification of our (non-operational) binary versioning scheme, with the premise that it is just too difficult for Rust binaries to be upgraded in place, and that any upstream changes to binaries must force a downstream rebuild. The goal is to eliminate all cases of 'def-id' drift and similar problems relating to incompatible binaries.

Our current name mangling and type hashing scheme is an (unsuccessful) attempt to allow libraries to be upgraded without rebuilding downstream libraries and executables. Given the nature and prevalence of Rust's generics, which invalidate downstream code, I suggest that such binary forward compatibility is so difficult and applicable to so few code-bases that it is not even worth attempting, at least in the short term.

At the same time, we have frequent problems in Servo where crates get out of sync in some way, possibly due to incorrect makefile rules, and cause mysterious metadata errors.

I suggest we add code directly into our tooling to detect when any change is made to upstream crates and invalidate the build. This will guarantee that, at any point in the build process, all binaries are known to be the exact version they were built against - there is no 'def id drift' or drift of any kind.

A vague scheme for doing this:

  • The Strict Version Hash (SVH) uniquely identifies a build of a crate
  • The SVH of a crate is the hash of its AST + the SVHs of all its upstream crates
  • A crate stores its own SVH + the SVH of its upstream crates
  • During crate resolution the SVH of crates must unify in some undetermined way. This prevents rustc from using incompatible, expired def_ids.
  • Every time rustpkg does a build it traverses the DAG, resolving crates, and if it finds that upstream binaries' SVH has changed it forces a rebuild. This will allow, e.g. all rust packages to be automatically rebuilt when std changes, which is important for upgrading.
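
To make the scheme above concrete, here is a minimal sketch, not an actual rustc or rustpkg implementation. It assumes the SVH combines a hash of the crate's own AST (stood in for by a source string) with the SVHs of its direct upstream crates, and that the dependency graph is acyclic. `CrateSource`, `BuildRecord`, `svh`, `record`, and `needs_rebuild` are hypothetical names invented for illustration. `DefaultHasher` is used only for brevity; its output is not guaranteed stable across Rust releases, so a real scheme would need a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a crate: its source text (in place of the AST)
/// and the names of its direct upstream crates.
pub struct CrateSource {
    pub ast: String,
    pub deps: Vec<String>,
}

/// Hypothetical metadata stored with a built crate: its own SVH plus the
/// SVH of each upstream crate it was built against.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BuildRecord {
    pub svh: u64,
    pub upstream: BTreeMap<String, u64>,
}

/// Compute the SVH of `name` by hashing its AST together with the SVHs of
/// its upstream crates. Recurses through the graph, so it assumes a DAG.
pub fn svh(name: &str, graph: &BTreeMap<String, CrateSource>) -> u64 {
    let krate = &graph[name];
    let mut hasher = DefaultHasher::new();
    krate.ast.hash(&mut hasher);
    // Sort so the result does not depend on the order deps were listed in.
    let mut deps = krate.deps.clone();
    deps.sort();
    for dep in &deps {
        dep.hash(&mut hasher);
        svh(dep, graph).hash(&mut hasher);
    }
    hasher.finish()
}

/// Build the record that would be stored alongside the compiled crate.
pub fn record(name: &str, graph: &BTreeMap<String, CrateSource>) -> BuildRecord {
    let upstream = graph[name]
        .deps
        .iter()
        .map(|dep| (dep.clone(), svh(dep, graph)))
        .collect();
    BuildRecord { svh: svh(name, graph), upstream }
}

/// A rebuild is needed when the crate's own SVH or any upstream SVH no
/// longer matches what was recorded at build time.
pub fn needs_rebuild(
    name: &str,
    graph: &BTreeMap<String, CrateSource>,
    stored: &BuildRecord,
) -> bool {
    record(name, graph) != *stored
}
```

Because each SVH folds in the SVHs below it, a change anywhere upstream (for example in `std`) changes the SVH of every crate that depends on it, which is what forces the downstream rebuild.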

cc #10188 #2166 #9878

@alexcrichton
Member

In a super-ideal world, the SVH would be based only on what external crates can touch. If a truly internal implementation detail changes, the SVH shouldn't necessarily change (and likewise, changing documentation shouldn't change the SVH). This would mean that the SVH depends only on the items reachable from outside the crate.

This is very difficult to generate reliably, however, and I think that for now we should literally call crate.hash() to get the hash value of the crate. Using reachability for the hash is a very far-future thing that would be nice to have but is not necessary now; in theory it would truly allow for in-place binary upgrades.
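
The difference between the two hashing strategies can be sketched as follows. This is an illustration only: `Item`, `whole_crate_hash`, and `reachable_hash` are hypothetical names, the `reachable` flag is assumed to be computed elsewhere, and the reachable hash here covers only names and signatures. That simplification ignores generic items, whose bodies are compiled into downstream crates and would therefore have to be included.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, heavily simplified view of one item in a crate.
#[derive(Hash)]
pub struct Item {
    pub name: String,
    pub signature: String,
    pub body: String,
    pub doc: String,
    /// Assumed to be precomputed: can an external crate touch this item?
    pub reachable: bool,
}

/// Conservative approach: hash everything, so any edit changes the hash.
pub fn whole_crate_hash(items: &[Item]) -> u64 {
    let mut hasher = DefaultHasher::new();
    items.hash(&mut hasher);
    hasher.finish()
}

/// Idealized approach: hash only what external crates can observe, here
/// approximated as the name and signature of each reachable item.
pub fn reachable_hash(items: &[Item]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for item in items.iter().filter(|item| item.reachable) {
        item.name.hash(&mut hasher);
        item.signature.hash(&mut hasher);
    }
    hasher.finish()
}
```

Under these assumptions, editing the body of a private helper or a doc comment changes `whole_crate_hash` but leaves `reachable_hash` untouched, so only the former would force a downstream rebuild.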

@pnkfelix
Member

pnkfelix commented Nov 7, 2013

Accepted for P-backcompat-lang. We need to figure out whether to do this for 1.0.

@lambda-fairy
Contributor

In GHC Haskell, the crate hash is derived from the ABI -- this includes inlinable definitions, as well as the signatures of exported functions and types.

Is this what we're aiming for in Rust?

@alexcrichton
Member

Yes, we are not there yet though.
