Better strict version hash (SVH) computation #14145

pnkfelix · 2014-05-12T17:13:34Z

Teach SVH computation to ignore more implementation artifacts.

In particular, this version of strict version hash (SVH) works much
like the deriving(Hash)-based implementation did, except that it
deliberately:

1. skips over content known not to affect the generated crates, and
2. uses a content-based hash for names instead of using the value of
   the Name index itself, which can differ depending on the order
   in which strings are interned (which in turn is affected by
   e.g. the presence of --cfg options on the command line).
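
As an illustration of the name-hashing point, here is a minimal sketch in present-day Rust (it is not the code in this PR). `Interner`, `index_of`, and `hash_of` are hypothetical stand-ins, and the extra "stage1" string plays the role of a name that gets interned early because of a --cfg option.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical toy interner: each new string gets the next free index.
struct Interner {
    map: HashMap<String, u32>,
}

impl Interner {
    fn new() -> Self {
        Interner { map: HashMap::new() }
    }

    fn intern(&mut self, s: &str) -> u32 {
        let next = self.map.len() as u32;
        *self.map.entry(s.to_string()).or_insert(next)
    }
}

/// Hash any value with the standard library's default hasher.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut st = DefaultHasher::new();
    value.hash(&mut st);
    st.finish()
}

/// Intern everything in `earlier` first, then return the index given to `name`.
fn index_of(name: &str, earlier: &[&str]) -> u32 {
    let mut interner = Interner::new();
    for s in earlier {
        interner.intern(s);
    }
    interner.intern(name)
}

fn main() {
    // Same name, different interning history: the index moves.
    let plain = index_of("foo", &[]);
    let with_cfg = index_of("foo", &["stage1"]);
    println!("index of foo: {} vs {}", plain, with_cfg);

    // Hashing the string content does not depend on that history.
    println!("content hash of foo: {:x}", hash_of("foo"));
}
```

Hashing the index would make the result depend on what happened to be interned first; hashing the string itself does not.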

Fix #14132.

pnkfelix · 2014-05-12T17:13:43Z

r? @alexcrichton

alexcrichton · 2014-05-12T17:19:36Z

src/librustc/back/svh.rs

+            if i.vis == ast::Public {
+                SawItem.hash(self.st); walk_item(self, i, e);
+            }
+        }


This should use the results of the reachability pass instead of ast::Public due to things like reexports and generic functions and such.

I might just do the hash unconditionally in visit_item; e.g. a non-pub function may still get inlined in a downstream crate, and thus we should include it in the SVH, right? (I suppose the answer to that question depends on what I can learn via the reachability pass; I still need to look at that.)

The reachability pass will return all possibly user-facing functions, which should be what you need here. For example, we use the results of reachability to determine what symbols should be visible in the object file (others can link to), which is, in essence, the public ABI.

alexcrichton · 2014-05-12T17:24:58Z

I'm a little worried about this approach because it seems like it's very easy to sweep ABI-breaking changes under the rug by accident. For example, if a new method is added to the Visitor, then there's no guarantee that an implementation will be present for this Visitor.

I do think that it's the best approach, however, as hashing the AST just doesn't cut it. I think we'll just have to let time tell how this approach works.

Could you add some tests along the same lines of changing-crates.rs? The portions that are explicitly ignored in terms of hashing would be nice to keep track of.

alexcrichton · 2014-05-12T17:25:44Z

src/librustc/back/svh.rs

+        }
+
+        fn visit_ty(&mut self, t: &Ty, e: E) {
+            SawTy.hash(self.st); walk_ty(self, t, e);


I'm worried that this doesn't actually hash the type itself, what if the type changes from u16 to u8?

Hmm, true. Let me experiment and see if I can fix that.
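
A minimal sketch of the concern, in present-day Rust and with a hypothetical toy `Ty` AST (none of these names come from the PR): if the traversal hashes only a "saw a type" marker, u16 and u8 collide; they are told apart only if the identifiers reached by the walk are hashed as well.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, heavily simplified type AST.
enum Ty {
    Named(&'static str), // e.g. `u8`, `u16`
    Ptr(Box<Ty>),
}

/// Markers recording which kind of node the traversal passed through.
#[derive(Hash)]
enum Saw {
    Ty,
    Ident,
}

/// Walk `t`, always hashing a marker per type node. The identifier text is
/// hashed only when `hash_idents` is true.
fn hash_ty(t: &Ty, hash_idents: bool, st: &mut DefaultHasher) {
    Saw::Ty.hash(st);
    match t {
        Ty::Named(name) => {
            if hash_idents {
                Saw::Ident.hash(st);
                name.hash(st);
            }
        }
        Ty::Ptr(inner) => hash_ty(inner, hash_idents, st),
    }
}

fn toy_hash(t: &Ty, hash_idents: bool) -> u64 {
    let mut st = DefaultHasher::new();
    hash_ty(t, hash_idents, &mut st);
    st.finish()
}

fn main() {
    let a = Ty::Named("u16");
    let b = Ty::Named("u8");
    let p = Ty::Ptr(Box::new(Ty::Named("u8")));

    // Marker only: the two types are indistinguishable.
    println!("marker only, equal: {}", toy_hash(&a, false) == toy_hash(&b, false));
    // Marker plus identifier content: the change is visible.
    println!("with idents, equal: {}", toy_hash(&a, true) == toy_hash(&b, true));
    // The shape of the type is visible either way.
    println!("shape differs: {}", toy_hash(&p, false) != toy_hash(&b, false));
}
```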

alexcrichton · 2014-05-12T17:36:00Z

I've pointed out a few places where I think the hashing isn't necessarily sufficient. I'm worried about missing cases (I'm certainly no exhaustive check). Maybe we want to be conservative on this visitor approach...

pnkfelix · 2014-05-12T20:44:50Z

I just realized, some of your comments seem to indicate that you are worried about certain substructure being skipped (e.g. local definition, the patterns on match arms, etc). But I think all of those cases are covered by the recursive call to visit::walk_foo in each of those cases, which should descend down the relevant substructure.

In any case, I will make sure to add tests for those examples.

pnkfelix · 2014-05-13T01:47:52Z

(the previous version was pretty buggy in various nasty ways and probably collapsed too many distinct texts to the same hash, largely due to the mistaken way I incorporated visibility that alex pointed out. I will have a new PR up tomorrow that does a better job.)

pnkfelix · 2014-05-13T11:26:14Z

(added plus: writing tests forced me to discover that the PR as written here was still hashing on spans in the Lit case, and perhaps others. The effect of this, AFAICT, was that this PR as posted only fixed the --cfg stage1 issue; as soon as you made changes to the text, including just whitespace ones, the svh differed. yay testing!)
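
A minimal sketch of the span problem, in present-day Rust with hypothetical `Span` and `Lit` types (not the compiler's own): a derived hash covers the source position, so moving a literal by a few columns changes the hash, while a hand-written hash over the value alone does not.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical source position: byte offsets into the file.
#[derive(Hash)]
struct Span {
    lo: u32,
    hi: u32,
}

/// Hypothetical literal node: its text plus where it appeared.
#[derive(Hash)]
struct Lit {
    value: &'static str,
    span: Span,
}

/// Derived hashing: includes the span, so whitespace edits change the result.
fn hash_with_span(l: &Lit) -> u64 {
    let mut st = DefaultHasher::new();
    l.hash(&mut st);
    st.finish()
}

/// Content-only hashing: deliberately skips the span.
fn hash_without_span(l: &Lit) -> u64 {
    let mut st = DefaultHasher::new();
    l.value.hash(&mut st);
    st.finish()
}

fn main() {
    // The same literal before and after inserting four spaces in front of it.
    let before = Lit { value: "42", span: Span { lo: 8, hi: 10 } };
    let after = Lit { value: "42", span: Span { lo: 12, hi: 14 } };

    println!("with span, equal:    {}", hash_with_span(&before) == hash_with_span(&after));
    println!("without span, equal: {}", hash_without_span(&before) == hash_without_span(&after));
}
```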

pnkfelix · 2014-05-13T11:29:38Z

I will reopen this when it is more fully baked.

pnkfelix · 2014-05-14T13:24:47Z

more fully baked now. :)

pnkfelix · 2014-05-14T13:25:46Z

r? @alexcrichton or @huonw (do note if you grab it so that we don't have needless duplicate review effort)

huonw · 2014-05-14T13:29:08Z

I'll glance over it quickly now, but I've only got 15 minutes or so.

huonw · 2014-05-14T13:30:42Z

src/libsyntax/visit.rs

+    fn got_path(&mut self, path: &Path, id: ast::NodeId, e: &E) -> After;
+}
+
+pub struct ForcedVisitor<HookState /*:ForcedVisitorHooks*/>{


Should this have a fixme? or is it just for documentation purposes?

The commented out bound? It's for documentation for now.

I guess with planned future changes to the language then we could uncomment it. I guess that's your point, that I should add a FIXME comment with a pointer to the RFC and/or tracking issue for the change to add trait-bounds to struct's type params?

Yeah, it just seems a little weird to have a commented-out piece of code with no explanation.

I do not see a tracking issue in the Rust repo for RFC 0011. I would consider adding a FIXME pointing to such a tracking issue, but without the tracking issue I'm not sure if I'd bother.

huonw · 2014-05-14T13:50:34Z

From my quick glance it seems fine to me, but I won't be able to read over it properly for 18+ hours (or more).

alexcrichton · 2014-05-14T15:14:50Z

src/libsyntax/visit.rs

@@ -61,7 +61,37 @@ pub fn generics_of_fn(fk: &FnKind) -> Generics {
    }
 }

+/// Each method of the Visitor trait is a hook to be potentially
+/// overriden.  Each method's default implementation recursvely visits


recursively

alexcrichton · 2014-05-14T15:21:57Z

I'm curious what you think about the introduction of a whole new visitor, but other than that this looks fantastic to me!

pnkfelix · 2014-05-14T16:03:49Z

@alexcrichton Yeah I have gone back and forth about whether the new visitor is worth it.

(The third option, assuming the first is Visitor and the second is ForcedVisitor, would be to not use a visitor at all, and instead write the code as a set of recursive functions that match the various enums directly. I briefly started on a rewrite to that pattern this morning, and quickly abandoned it: avoiding the boilerplate for the recursive traversals really is worth it.)

Anyway, as for this PR:

I am pretty happy with how the SVH client code looks in the end with the new ForcedVisitorHooks. And the other thing is that the ForcedVisitor does duplicate some code in "visit.rs", but it does manage to reuse all of the existing visit::walk machinery, so it's not quite as bad as if I had had to cut-and-paste all of "visit.rs".

I guess the right thing for me to do now would be to survey the existing Visitor implementations and check if any of them would look better as ForcedVisitorHooks instead. If I can find something else that would benefit, then I'll keep ForcedVisitor; otherwise, I'll drop it (and rewrite this code to use Visitor again). Sound good?

pnkfelix · 2014-05-14T16:04:21Z

@nikomatsakis You may have an opinion on the discussion regarding the Visitor API; we discussed it briefly during our meeting yesterday.

alexcrichton · 2014-05-14T16:24:59Z

That sounds good to me.

alexcrichton · 2014-05-14T16:25:18Z

r=me when you've completed your survey

pnkfelix · 2014-05-14T16:57:05Z

The summary is: I don't think I can easily apply ForcedVisitor to any of our existing visitors and have it actually pay off. So I'll take it out.

According to my survey:

There are 60 Visitor impls that override fewer than 6 of the default methods. Obviously ForcedVisitor, with its 26 required methods, would not be good there.
Then there are 7 Visitor impls that override 6--9 of the default methods. I think the same logic applies here (but I at least skimmed the code for these cases to double-check that assumption).
There are two Visitors left:
- The rustc::middle::lint Visitor for Context<'a> overrides 12 methods. I could believe the linter could one day grow to override more of the methods, to the point where ForcedVisitor would be worth considering. However, the architecture of the ForcedVisitorHooks is that the hook runs, and then returns a flag saying whether to recur or not. The problem then is that the lint visitor establishes state that it wants to clean up after the recursion returns, and the ForcedVisitor does not support that pattern of usage.
- The syntax::ast_util Visitor for IdVisitor<'a, O> overrides 16 methods. Again, skimming over this I started to think that maybe it could make use of the ForcedVisitor -- I was not able to convince myself that the existing code is not overlooking cases. However, the same problem as the previous bullet arises: there is local state that it wants to set up during the recursion and clean up afterwards, a pattern that is not supported by the ForcedVisitor API I posted.
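
The limitation in the last two bullets can be sketched as follows, in present-day Rust over a hypothetical toy `Expr` tree. The `Hooks` trait and the variants of `After` are assumptions made for illustration; only the idea "the hook runs, then returns a flag saying whether to recur" is taken from the description above.

```rust
/// Hypothetical toy AST.
enum Expr {
    Leaf,
    Block(Vec<Expr>),
}

/// Visitor style: the implementation decides when to recurse.
trait Visitor {
    fn visit_expr(&mut self, e: &Expr) {
        walk_expr(self, e)
    }
}

fn walk_expr<V: Visitor + ?Sized>(v: &mut V, e: &Expr) {
    if let Expr::Block(items) = e {
        for item in items {
            v.visit_expr(item);
        }
    }
}

/// Hook style: the hook runs first and only reports whether to recurse.
enum After {
    Recur,
    #[allow(dead_code)]
    Skip,
}

trait Hooks {
    fn got_expr(&mut self, e: &Expr) -> After;
}

fn forced_walk<H: Hooks>(h: &mut H, e: &Expr) {
    if let After::Recur = h.got_expr(e) {
        if let Expr::Block(items) = e {
            for item in items {
                forced_walk(h, item);
            }
        }
    }
}

/// Tracks block nesting depth; shared by both styles.
struct Depth {
    current: usize,
    max: usize,
}

impl Visitor for Depth {
    fn visit_expr(&mut self, e: &Expr) {
        let is_block = matches!(e, Expr::Block(_));
        if is_block {
            self.current += 1;
            self.max = self.max.max(self.current);
        }
        walk_expr(self, e);
        if is_block {
            self.current -= 1; // cleanup after the recursion returns
        }
    }
}

impl Hooks for Depth {
    fn got_expr(&mut self, e: &Expr) -> After {
        if let Expr::Block(_) = e {
            self.current += 1; // no later point at which to undo this
            self.max = self.max.max(self.current);
        }
        After::Recur
    }
}

fn max_depth_visitor(e: &Expr) -> usize {
    let mut d = Depth { current: 0, max: 0 };
    d.visit_expr(e);
    d.max
}

fn max_depth_hooks(e: &Expr) -> usize {
    let mut d = Depth { current: 0, max: 0 };
    forced_walk(&mut d, e);
    d.max
}

fn main() {
    // Two sibling blocks inside one outer block: the true nesting depth is 2.
    let tree = Expr::Block(vec![
        Expr::Block(vec![Expr::Leaf]),
        Expr::Block(vec![Expr::Leaf]),
    ]);
    println!("visitor: {}", max_depth_visitor(&tree));
    println!("hooks:   {}", max_depth_hooks(&tree));
}
```

With the visitor the depth counter is restored after each block; with the flag-returning hook it only ever grows, so sibling blocks are miscounted as nested.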

So, I will revise the PR to take out ForcedVisitor, make sure the tests still pass, and then mark as reviewed.

Closes rust-lang#14210 (Make Vec.truncate() resilient against failure in Drop)
Closes rust-lang#14206 (Register new snapshots)
Closes rust-lang#14205 (use sched_yield on linux and freebsd)
Closes rust-lang#14204 (Add a crate for missing stubs from libcore)
Closes rust-lang#14201 (Render not_found with an absolute path to the rust stylesheet)
Closes rust-lang#14198 (update valgrind headers)
Closes rust-lang#14174 (Optimize common path of Once::doit)
Closes rust-lang#14162 (Print 'rustc' and 'rustdoc' as the command name for --version)
Closes rust-lang#14145 (Better strict version hash (SVH) computation)

pnkfelix · 2014-05-15T08:54:34Z

(rebasing)

In particular, this version of strict version hash (SVH) works much like the deriving(Hash)-based implementation did, except that it uses a content-based hash that filters rustc implementation artifacts and surface syntax artifacts. Fix rust-lang#14132.

Namely: non-pub `use` declarations *are* significant to the SVH computation, since they can change which traits are part of the method resolution step, and thus affect which methods get called from the (potentially inlined) code.
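
A minimal sketch of why a private `use` can matter, written in present-day Rust rather than the 2014 dialect. The traits `A` and `B`, the type `S`, and the two modules are hypothetical. The two `call` functions have identical bodies and differ only in which traits a non-pub `use` brings into scope, yet they resolve `name` to different methods.

```rust
pub struct S;

pub trait A {
    fn name(self) -> &'static str;
}

pub trait B {
    fn name(self) -> &'static str;
}

// `A` is implemented for `S` itself, so it matches the by-value receiver.
impl A for S {
    fn name(self) -> &'static str {
        "A"
    }
}

// `B` is implemented for `&S`, so it is only found after auto-referencing.
impl<'a> B for &'a S {
    fn name(self) -> &'static str {
        "B"
    }
}

mod only_b {
    use super::{B, S};

    pub fn call() -> &'static str {
        S.name() // only `B` is in scope: resolves to `<&S as B>::name`
    }
}

mod both {
    #[allow(unused_imports)]
    use super::{A, B, S};

    pub fn call() -> &'static str {
        S.name() // `A` is now in scope and wins: resolves to `<S as A>::name`
    }
}

fn main() {
    println!("only_b::call() = {}", only_b::call());
    println!("both::call()   = {}", both::call());
}
```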

(Only after adding the tests did I realize that this is not really a special case at the AST level; as far as the visitor is concerned, `int` and `i32` and `i64` are just idents.)

petrochenkov · 2015-10-03T07:51:22Z

@pnkfelix @alexcrichton
I've noticed some parts of HIR are not hashed by SVH, some parts are hashed twice and some Idents (leaked through ExprInlineAsm) are still hashed as unstable numbers. Are these things worth fixing given the probabilistic nature of SVH, or is it good enough without extra polishing?

In particular, non-crate attributes seem to be intentionally not hashed at all, but they can easily affect ABI (repr, packed etc.).

alexcrichton · 2015-10-04T18:41:41Z

It's probably "good enough" today to get the job done, but I also wouldn't be confident in saying that the SVH reflects the "true ABI" of a crate either, so in that sense I'm sure there's a few bugs here and there :)

pnkfelix mentioned this pull request May 12, 2014

Use a regex filter for the test runner & assorted fixes #13948

Merged

alexcrichton reviewed May 12, 2014

pnkfelix closed this May 13, 2014

pnkfelix reopened this May 14, 2014

huonw reviewed May 14, 2014

alexcrichton reviewed May 14, 2014

pnkfelix mentioned this pull request May 15, 2014

add attribute on impl and associated lint for "no default methods" #14220

Closed

pnkfelix added 5 commits May 15, 2014 10:55

syntax::visit: pub walk_explicit_self so impls can call it as defau…

d6cf0df

…lts do. drive-by: added some doc.

Some basic acceptance tests for better SVH.

a92d162

Added tests checking that changes in type sig are recognized in SVH.

5236af8

bors closed this May 15, 2014
