Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incorporate liveness computation into this crate #104

Closed
nikomatsakis opened this issue Mar 21, 2019 · 13 comments
Closed

incorporate liveness computation into this crate #104

nikomatsakis opened this issue Mar 21, 2019 · 13 comments
Assignees

Comments

@nikomatsakis
Copy link
Contributor

This is part of a general move to incorporate more of the borrow check computations into polonius itself. The overall goal here is that rustc will supply polonius with facts that say "this variable is defined here, this variable is used there" as well as which regions appear within the variable's type, and polonius will compute the region_live_at relation based on that (right now, the region_live_at relation is an input to polonius).

We covered the overall plan in this YouTube video.

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Mar 21, 2019

Part 1: Computing which variables are live where in polonius.

Step listing

  • Extend AllFacts with a new type parameter V: Atom -- V stands for "local variable"
  • Extend AllFacts with a field var_used: Vec<(V, P)> -- meaning "variable V is used at the point P"
  • Extend AllFacts with a field var_defined: Vec<(V, P)> -- meaning that the value in V is overwritten by the point P
  • Extend the polonius parser Fact type and grammar definition to permit var_used and var_defined facts at each point. This parser is used when writing unit tests.
  • Extend the Output type to include a var_live_at field; this will be used only for unit testing.
  • Get this to compile :)
  • Extend the naive analysis with datafrog versions of the datalog rules below to compute var_live_at.
  • Extend the "dump-enabled" code to write out the var_live_at results we compute below into that field we added to Output
  • Write some unit tests similar in style to this one

Datafrog rules

First, a variable is live on entry to P if it is used there:

var_live(V, P) :- var_used(V, P)

Next, a variable is live on entry to P if it is used in a successor Q and P does not reassign it:

var_live(V, P) :-
    var_live(V, Q),
    cfg_edge(P, Q),
    !var_defined(V, P).

At this point, we should be able to write some unit tests where we supply some base facts and a control-flow graph and then test var_live at those points.

@nikomatsakis
Copy link
Contributor Author

Once this is done, the next step will be to extend rustc to produce the var_used and var_defined facts.

@nikomatsakis
Copy link
Contributor Author

Part 2

Extending rustc to produce the var_used and var_defined facts.

First off, you will want to modify rustc to work with a local copy of polonius, so that you can edit the two in tandem. I think @AlbinS you've already figured this part out so I'll just assume you have done so.

We're going to want to instantiate the polonius output with the "variable id" mapped to the mir::Local type.

The existing NLL liveness computation code for rustc lives in liveness/mod.rs -- note that there are other files named liveness.rs that are totally independent. =) We won't really be re-using much of it, but it has some bits and pieces that are useful.

In fact, the LocalUseMap code seems pretty close to exactly what we want. This is a map that maps, from each local, to the points in the IR where it is defined, used, etc. It is constructed in a single walk over the MIR using the mir visitor. This has a callback that is invoked for each use of a local variable, and we use the context provided by that callback to figure out how to categorize the use.

For our purposes, a var_used fact results from each "use", a var_defined fact from each "def". I'm ignoring drops for now but those are a distinct kind of use.

One twist of the liveness code is that it can sometimes only compute liveness for a subset of the variables -- those that might possible be interesting. This is what is going on in this part of the code. We presently disable that optimization when in polonius-fact mode, however, and I imagine we should just continue doing that for now.

So basically what we would want to do I think is to generate the var-defined and var-used facts by doing the same sort of walk as the map does. But instead of inserting those facts into the LocalUseMap data structure, we would push new facts onto the vectors. We should probably just generate them for all the locals, as I said.

The liveness code has acecss to the all_facts array via cx.borrowck_context.as_mut() (the facts are in a field called all_facts of the borrow-check context).

So I think what I would do is to extend the liveness::generate function to make it also fill in these new facts. I would add a new module liveness/polonius.rs that would contain the code needed to do so -- it will be similar to the visitor from the local_use_map.rs that I referenced earlier. We would call this function from generate.

@nikomatsakis
Copy link
Contributor Author

So I think that the next step is to compute "region-live-at" based on the liveness information. The basic idea here is this:

  • If a variable V is "use live" at some point P
  • That variable has a type T
  • That type contains some regions R
  • Then for each region R, R is live at P

The rules for drop live regions values are the same, except that we don't use the full set of regions R in that case.

To model this in polonius, then, I think we need another few relations. There are a few ways we could do it. Here is one way:

  • var_uses_region(V, R) -- true if the type of V includes the region R
  • var_drops_region(V, R) -- true if the type of V includes the region R and we will use it when dropping

Then we would want to make some rules like:

region_live_at(R, P) :- 
  var_live(V, P),
  var_uses_region(V, R).

The work then would be to:

  • modify polonius to add the var_uses_region relation and new region_live_at rules
  • modify rustc to product the new var_uses_region relation

@nikomatsakis
Copy link
Contributor Author

I think the way I would structure the previous steps is as follows:

  • extend polonius with var_uses_region
  • extend polonius with the new rule for region_live_at, but also keep the existing input, so that people can either supply the region_live_at relations themselves or supply the var-live and var-uses-region tuples (this means existing tests will keep passing)
  • modify rustc to start producing the var_uses_region tuples, and to stop producing region_live_at

At this point, the results will no longer be correct for drops, though, so we have to start producing var_drops_region, too.

@nikomatsakis
Copy link
Contributor Author

The var_drops_region needs to integrate with a notion of "drop-live", so maybe it actually makes sense to do that work first. Basically we need to:

  • add something analagous to the "use" input that we added before, but for "drop-uses"
  • add a drop-live computation analogous to the var_live, but using "drop-uses"

@amandasystems
Copy link
Contributor

The var_drops_region needs to integrate with a notion of "drop-live", so maybe it actually makes sense to do that work first. Basically we need to:

* add something analagous to the "use" input that we added before, but for "drop-uses"

* add a drop-live computation analogous to the `var_live`, but using "drop-uses"

I see; and that should be straightforward because it's the same process as for uses and definitions, just another arm of the match statement in rustc. I'll start there!

@amandasystems
Copy link
Contributor

* `var_drops_region(V, R)` -- true if the type of `V` includes the region `R` and we will use it when dropping

And this is an input field like var_used_region? I am not entirely sure about when the dropping happens; is this relation fulfilled at any time when the region is dropped? Does this mean that we have something like this:

let x: &'region = something_that_gives_a_reference();
// ... code
drop(x); // this <=> var_drops_region(x, 'region)

@amandasystems
Copy link
Contributor

The var_drops_region needs to integrate with a notion of "drop-live", so maybe it actually makes sense to do that work first. Basically we need to:

* add something analagous to the "use" input that we added before, but for "drop-uses"

* add a drop-live computation analogous to the `var_live`, but using "drop-uses"

I see; and that should be straightforward because it's the same process as for uses and definitions, just another arm of the match statement in rustc. I'll start there!

This is now implemented!

@nikomatsakis
Copy link
Contributor Author

@AlbinS

Regarding var_drops_region -- note that this relation does not include any "points". This is based purely on the type of the variable.

The idea is: when this variable V is dropped, then the var_drop_region(V, R) relation enumerates the regions R that might be dereferenced during the drop.

However, I am also remembering now that the actual logic implemented by the NLL checker is a bit more complex than we originally envisioned in the RFC. I'll have to review what precisely we are doing there, but iirc we (at minimum) take into account initialization of variables and things. That said, I suspect we can ignore this for the first iteration, and come back to it after we get some other pieces (notably, the initialization analysis!) in place.

@nikomatsakis
Copy link
Contributor Author

So it seems like the next step, @AlbinS, is that we need to generate the var_uses_region and var_drops_region fact sets.

The current code that generates those in rustc is here:

  • the add_use_live_facts_for function computes the regions for which a variable is use-live; it is invoked with the type of the variable value;
  • the add_drop_live_facts_for function does the same for drop-live facts; as I suggested on Zulip, I think we can ignore this for now and start by generating an over-approximation

Both of these make use of the make_all_regions_live helper function. That function iterates over all the regions that are live in value, which is either the type of a variable or (in the case of drop-live) part of the variable's type.

You can see that we also emit the region_live_at facts for polonius in that same helper.

One difference is that, in the existing code, these functions also take a set of points live_at where the regions are live, but in the new analysis, we won't have that information. Instead, we'll generate tuples that are not connected to any point in particular, and let polonius handle the rest.

So I think what you want to do in your branch is to extend the "popular var liveness function" so that it iterates over the variables declared in mir (the mir.local_decls field). You can do something like:

for (local, local_decl) in mir.local_decls.iter_enumerated() {
    ...
}

then, in the loop, get the type of each local from the local_decl.ty field and iterate over all its regions:

for ... {
    add_regions(local, local_decl.ty, &mut all_facts.var_uses_regions);
    add_regions(local, local_decl.ty, &mut all_facts.var_drops_regions); // FIXME(polonius#104) overapproximation
}

and then we would have to define add_regions to use the for_each_free_region function:

fn add_regions(tcx: TyCtxt<'_, 'tcx, '_>, local: Local, ty: Ty<'tcx>, output_relation: &mut Vec<(...)>) {
    tcx.for_each_free_region(..., |r| output_relation.push((local, r));
}

Something like that, anyway.

@amandasystems
Copy link
Contributor

amandasystems commented May 14, 2019

The following is left to do (for me):

  • Verify that the drop-related compile-tests produce the same region_live_at as rustc
  • Make unit tests out of those examples
  • Remove the temporary region-live-at compare from the Polonius CLI and verify it now works again for multiple inputs
  • Remove the logic to generate region_live_at from rustc
  • Verify that the previous step doesn't break anything in the compile-tests that was not broken before
  • Move the liveness calculations to happen before the call to each algorithm's compute()method as a pre-computation step
  • Extend polonius-parser with var_(uses|drops)_region

Bonus tasks:

  • Investigate why the comparisons for DatafrogOpt fails when regenerating new facts for the issue-4768 inputs

bors added a commit to rust-lang/rust that referenced this issue Jul 13, 2019
Fact generation for liveness calculations in Polonius

This PR tracks ongoing work to extend `rustc` with support for generating variable use, definition, and later also drop output for the Polonius solver, the whole of which is being tracked in [Polonius Issue #104](rust-lang/polonius#104).
@amandasystems
Copy link
Contributor

This has now been merged. Closing!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants