Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Freshness from DependencyQueue #6832

Merged
merged 8 commits into from
Apr 11, 2019
Merged

Conversation

alexcrichton
Copy link
Member

Ever since the inception of Cargo and the advent of incremental
compilation at the crate level via Cargo, Cargo has tracked whether it
needs to recompile something at a unit level in its "dependency queue"
which manages when items are ready for execution. Over time we've fixed
lots and lots of bugs related to incremental compilation, and perhaps
one of the most impactful realizations was that the model Cargo started
with fundamentally doesn't handle interrupting Cargo halfway through and
resuming the build later.

The previous model relied upon implicitly propagating "dirtiness" based
on whether the one of the dependencies of a build was rebuilt or not.
This information is not available, however, if Cargo is interrupted and
resumed (or performs a subset of steps and then later performs more).
We've fixed this in a number of places historically but the purpose of
this commit is to put a nail in this coffin once and for all.

Implicit propagation of whether a unit is fresh or dirty is no longer
present at all. Instead Cargo should always know, irrespective of it's
in-memory state, whether a unit needs to be recompiled or not. This
commit actually turns up a few bugs in the test suite, so later commits
will be targeted at fixing this.

Note that this required a good deal of work on the fingerprint module
to fix some longstanding bugs (like #6780) and some serious hoops had to
be jumped through for others (like #6779). While these were fallout from
this change they weren't necessarily the primary motivation, but rather
to help make fingerprints a bit more straightforward in what's an
already confusing system!

Closes #6780

@rust-highfive
Copy link

r? @nrc

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 8, 2019
@alexcrichton
Copy link
Member Author

r? @ehuss

Note that this is still WIP a bit due to a failure in simulated_docker_deps_stay_cached, and it definitely looks like a legit bug. I'm still digging into see what's going on there. If there's some early feedback on this though I'm all ears!

@rust-highfive rust-highfive assigned ehuss and unassigned nrc Apr 8, 2019
@ehuss
Copy link
Contributor

ehuss commented Apr 9, 2019

(This looks great, btw.)

I think I see why simulated_docker_deps_stay_cached is failing.

Previously it was only keeping a LocalFingerprint::mtime of the invoked.timestamp file of the RunCustomBuild unit (instead of tracking in Fingerprint::deps). This worked because the invoked.timestamp could only go backwards in time on Docker, so the hash would fail, but the compare() function for LocalFingerprint::mtime would see it as Fresh (since it checks on_disk > previously_built). There's a comment at the bottom of compare to that effect.

With this, the RunCustomBuild is included in deps as a hash value, and there's no nuance to the hash value, it's either correct or it isn't. With the nanosecond change, the hash fails, but there's no compare fallback for Fingerprint::deps so it ends up Dirty.

I'm not sure what a good fix would be. Maybe DepFingerprint serializer could include the entire Fingerprint instead of just the hash, and have Fingerprint::compare recurse into those? Might be a little more expensive, though. Would probably also break when the memoized hash is updated (since the local list doesn't get updated for build scripts).

Could also maybe disable the LocalFingerprint::mtime for rerun-if-changed for non-path dependencies, although I'd be concerned about rerun-if entries for files outside of the package root. I'm not sure how common that is.

@alexcrichton
Copy link
Member Author

Ah right yeah that makes sense. So this is where we relied on dirty propagation through the dependency queue to correctly recompile libraries after the build script ran. With that removed we regained that feature by tracking hashes, but the hashes are too brittle here. Both before and after we correctly deduce that the hash of a build script run has changed but it doesn't need to be recompiled after compare. After though the differing hash propagates upwards and wreaks havoc.

TBH I found it very difficult to grapple with how the compare function had a fall-through. I would ideally like to remove that and return a more bland error at the end of the function, that feels more right to me at least.

I think a better solution to this might be to completely remove mtime information from hashes. In theory all fingerprints should be 'this file relative to that file', and I wonder if we can do dirty propagation in the initial computation of the Fingerprint like inputs_missing for now. I'll play around with this...

@alexcrichton
Copy link
Member Author

Oh I meant to say as well, IIRC when I first added all the JSON business if we did a deep serialization of fingerprints it actually resulted in gigabytes of json for servo because so many dependencies were repeated so many times!

@alexcrichton
Copy link
Member Author

Alright well it's not a patch to fingerprints in Cargo unless it touches more than 1kloc!

I believe all tests should be passing now. The general strategy I settled on was to basically ban FileTime being hashed in Fingerprint. We already did this for absolute paths some time ago, and similar treatment is now given for mtimes. Instead of storing/hashing actual FileTime values we instead store paths that are supposed to be looked at, and then each time we dynamically look at those paths to learn about the relationship between them and whether something is stale or not.

This should neatly solve the docker case and did indeed naturally fix it quickly! It did result in some more invasive refactorings though which are all here now.

We do have one final instance of FileTime being hashed, and that's here where we take the FileTime, stringify it, and later hash it. I've separated this out to its own issue though.

So this should now be good to go!

r? @ehuss

tests/testsuite/freshness.rs Outdated Show resolved Hide resolved
Ever since the inception of Cargo and the advent of incremental
compilation at the crate level via Cargo, Cargo has tracked whether it
needs to recompile something at a unit level in its "dependency queue"
which manages when items are ready for execution. Over time we've fixed
lots and lots of bugs related to incremental compilation, and perhaps
one of the most impactful realizations was that the model Cargo started
with fundamentally doesn't handle interrupting Cargo halfway through and
resuming the build later.

The previous model relied upon implicitly propagating "dirtiness" based
on whether the one of the dependencies of a build was rebuilt or not.
This information is not available, however, if Cargo is interrupted and
resumed (or performs a subset of steps and then later performs more).
We've fixed this in a number of places historically but the purpose of
this commit is to put a nail in this coffin once and for all.

Implicit propagation of whether a unit is fresh or dirty is no longer
present at all. Instead Cargo should always know, irrespective of it's
in-memory state, whether a unit needs to be recompiled or not. This
commit actually turns up a few bugs in the test suite, so later commits
will be targeted at fixing this.

Note that this required a good deal of work on the `fingerprint` module
to fix some longstanding bugs (like rust-lang#6780) and some serious hoops had to
be jumped through for others (like rust-lang#6779). While these were fallout from
this change they weren't necessarily the primary motivation, but rather
to help make `fingerprints` a bit more straightforward in what's an
already confusing system!

Closes rust-lang#6780
This has proven to be a very unreliable piece of information to hash, so
let's not! Instead we track what files are supposed to be relative to,
and we check both mtimes when necessary.
src/cargo/core/compiler/fingerprint.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/fingerprint.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/fingerprint.rs Show resolved Hide resolved
src/cargo/core/compiler/fingerprint.rs Show resolved Hide resolved
src/cargo/core/compiler/fingerprint.rs Outdated Show resolved Hide resolved
@alexcrichton
Copy link
Member Author

Should be updated!

@ehuss
Copy link
Contributor

ehuss commented Apr 11, 2019

Nice!
@bors r+

@bors
Copy link
Collaborator

bors commented Apr 11, 2019

📌 Commit 22691b9 has been approved by ehuss

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 11, 2019
@bors
Copy link
Collaborator

bors commented Apr 11, 2019

⌛ Testing commit 22691b9 with merge 811a4f0...

bors added a commit that referenced this pull request Apr 11, 2019
Remove `Freshness` from `DependencyQueue`

Ever since the inception of Cargo and the advent of incremental
compilation at the crate level via Cargo, Cargo has tracked whether it
needs to recompile something at a unit level in its "dependency queue"
which manages when items are ready for execution. Over time we've fixed
lots and lots of bugs related to incremental compilation, and perhaps
one of the most impactful realizations was that the model Cargo started
with fundamentally doesn't handle interrupting Cargo halfway through and
resuming the build later.

The previous model relied upon implicitly propagating "dirtiness" based
on whether the one of the dependencies of a build was rebuilt or not.
This information is not available, however, if Cargo is interrupted and
resumed (or performs a subset of steps and then later performs more).
We've fixed this in a number of places historically but the purpose of
this commit is to put a nail in this coffin once and for all.

Implicit propagation of whether a unit is fresh or dirty is no longer
present at all. Instead Cargo should always know, irrespective of it's
in-memory state, whether a unit needs to be recompiled or not. This
commit actually turns up a few bugs in the test suite, so later commits
will be targeted at fixing this.

Note that this required a good deal of work on the `fingerprint` module
to fix some longstanding bugs (like #6780) and some serious hoops had to
be jumped through for others (like #6779). While these were fallout from
this change they weren't necessarily the primary motivation, but rather
to help make `fingerprints` a bit more straightforward in what's an
already confusing system!

Closes #6780
@bors
Copy link
Collaborator

bors commented Apr 11, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: ehuss
Pushing 811a4f0 to master...

@bors bors merged commit 22691b9 into rust-lang:master Apr 11, 2019
Centril added a commit to Centril/rust that referenced this pull request Apr 15, 2019
Update cargo

12 commits in 6f3e9c367abb497c64f360c3839dab5e74928d5c..ef0223f12597b5e0d9d2feed1b92c41306b1fc05
2019-04-04 14:11:33 +0000 to 2019-04-15 14:36:55 +0000
- Fix test include_overrides_gitignore. (rust-lang/cargo#6850)
- Clarify optional registry key behaviour (rust-lang/cargo#6851)
- Ensure Summary::checksum works for registry crates (rust-lang/cargo#6842)
- Better error if PathSource::walk can't access something. (rust-lang/cargo#6841)
- Improve warning in `cargo new` with parse error. (rust-lang/cargo#6839)
- Improve error message for `publish` key restriction. (rust-lang/cargo#6838)
- Remove `Freshness` from `DependencyQueue` (rust-lang/cargo#6832)
- testsuite: cleanup for `alternative-registries` (rust-lang/cargo#6837)
- Improve error message to rerun a test in a workspace. (rust-lang/cargo#6824)
- Fix mutable_borrow_reservation_conflict warning. (rust-lang/cargo#6829)
- Add install-upgrade. (rust-lang/cargo#6798)
- Clarify docs of install without <crate> (rust-lang/cargo#6823)
Centril added a commit to Centril/rust that referenced this pull request Apr 16, 2019
Update cargo

12 commits in 6f3e9c367abb497c64f360c3839dab5e74928d5c..ef0223f12597b5e0d9d2feed1b92c41306b1fc05
2019-04-04 14:11:33 +0000 to 2019-04-15 14:36:55 +0000
- Fix test include_overrides_gitignore. (rust-lang/cargo#6850)
- Clarify optional registry key behaviour (rust-lang/cargo#6851)
- Ensure Summary::checksum works for registry crates (rust-lang/cargo#6842)
- Better error if PathSource::walk can't access something. (rust-lang/cargo#6841)
- Improve warning in `cargo new` with parse error. (rust-lang/cargo#6839)
- Improve error message for `publish` key restriction. (rust-lang/cargo#6838)
- Remove `Freshness` from `DependencyQueue` (rust-lang/cargo#6832)
- testsuite: cleanup for `alternative-registries` (rust-lang/cargo#6837)
- Improve error message to rerun a test in a workspace. (rust-lang/cargo#6824)
- Fix mutable_borrow_reservation_conflict warning. (rust-lang/cargo#6829)
- Add install-upgrade. (rust-lang/cargo#6798)
- Clarify docs of install without <crate> (rust-lang/cargo#6823)
bors added a commit to rust-lang/rust that referenced this pull request Apr 16, 2019
Update cargo

16 commits in 6f3e9c367abb497c64f360c3839dab5e74928d5c..b6581d383ed596b133e330011658c6f83cf85c2f
2019-04-04 14:11:33 +0000 to 2019-04-16 16:02:11 +0000
- Fix new_warning_with_corrupt_ws missing "USER". (rust-lang/cargo#6857)
- Ignore Clippy redundant_closure (rust-lang/cargo#6855)
- Pass OsStr/OsString args through to the process spawned by cargo run. (rust-lang/cargo#6849)
- Bump to 0.37.0 (rust-lang/cargo#6852)
- Fix test include_overrides_gitignore. (rust-lang/cargo#6850)
- Clarify optional registry key behaviour (rust-lang/cargo#6851)
- Ensure Summary::checksum works for registry crates (rust-lang/cargo#6842)
- Better error if PathSource::walk can't access something. (rust-lang/cargo#6841)
- Improve warning in `cargo new` with parse error. (rust-lang/cargo#6839)
- Improve error message for `publish` key restriction. (rust-lang/cargo#6838)
- Remove `Freshness` from `DependencyQueue` (rust-lang/cargo#6832)
- testsuite: cleanup for `alternative-registries` (rust-lang/cargo#6837)
- Improve error message to rerun a test in a workspace. (rust-lang/cargo#6824)
- Fix mutable_borrow_reservation_conflict warning. (rust-lang/cargo#6829)
- Add install-upgrade. (rust-lang/cargo#6798)
- Clarify docs of install without <crate> (rust-lang/cargo#6823)
@alexcrichton alexcrichton deleted the freshness branch May 6, 2019 18:12
@ehuss ehuss added this to the 1.36.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Build script fingerprints do not track dependencies to links build scripts.
5 participants