Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use smaller def span for functions #75465

Merged
merged 1 commit into from
Aug 23, 2020

Conversation

Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Aug 12, 2020

Currently, the def span of a function encompasses the entire function
signature and body. However, this is usually unnecessarily verbose - when we are
pointing at an entire function in a diagnostic, we almost always want to
point at the signature. The actual contents of the body tends to be
irrelevant to the diagnostic we are emitting, and just takes up
additional screen space.

This commit changes the def_span of all function items (freestanding
functions, impl-block methods, and trait-block methods) to be the
span of the signature. For example, the function

pub fn foo<T>(val: T) -> T { val }

now has a def_span corresponding to pub fn foo<T>(val: T) -> T
(everything before the opening curly brace).

Trait methods without a body have a def_span which includes the
trailing semicolon. For example:

trait Foo {
    fn bar();
}

the function definition Foo::bar has a def_span of fn bar();

This makes our diagnostic output much shorter, and emphasizes
information that is relevant to whatever diagnostic we are reporting.

We continue to use the full span (including the body) in a few of
places:

  • MIR building uses the full span when building source scopes.
  • 'Outlives suggestions' use the full span to sort the diagnostics being
    emitted.
  • The #[rustc_on_unimplemented(enclosing_scope="in this scope")]
    attribute points the entire scope body.

All of these cases work only with local items, so we don't need to
add anything extra to crate metadata.

@rust-highfive
Copy link
Collaborator

r? @lcnr

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 12, 2020
LL | | }
| |_^ expected `(T, T)`, got `(U, T)`
LL | fn three<T: Copy + Debug, U: Debug>(t: T, u: U) -> Two<T, U> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `(T, T)`, got `(U, T)`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems slightly worse than the previous message. However, I think this should really be pointing to a value being returned (either an explicit return or the trailing expression). This is a nightly feature, so I think it's fine to improve this later (e.g. before stabilization) as needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably should do so in general, but that isn't bad enough for me to block this pr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard agree on all points.

Copy link
Contributor

@lcnr lcnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like these changes 👍

cc @estebank I would expect that you are also interested in this.

LL | |
... |
LL | | (a, b)
LL | | }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For impl Trait it would be helpful to see the return expression here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems slightly worse than the previous message. However, I think this should really be pointing to a value being returned (either an explicit return or the trailing expression). This is a nightly feature, so I think it's fine to improve this later (e.g. before stabilization) as needed.

pretty much this

@@ -10,11 +10,7 @@ note: `rec` defined here
LL | / fn rec<T>(mut it: T)
LL | | where
LL | | T: Iterator,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally might even prefer to exclude where bounds from the function signature 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we only point at fn rec<T>(mut it: T) here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to display the where clause, so switching from fn<T: Copy>() {} to fn<T>() where T: Copy {} doesn't change the diagnostic output.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully get what you mean here, wouldn't that change change the diagnostic output from fn<T: Copy>() to fn<T>() where T: Copy with this PR?

I think that diagnostics where where bounds are relevant mention the where bound itself, so I think that not mentioning them in the signature is worth it as it can remove noise.

Copy link
Member Author

@Aaron1011 Aaron1011 Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - I meant that if a user switches from an 'inline' <T: Copy> to a 'where-clause' where T: Copy, they should still see the same amount of information in diagnostics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I am not sure how often these bounds help and are not explicitly references and put a fairly high value into single line error spans.

But that's just a personal preference and I am fine with landing this as is, so r=me. We may want to wait a day or two in case @estebank want's to look over this but otherwise I am satisfied

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the safe approach will be to include them.

LL | | }
| |_^ expected `(T, T)`, got `(U, T)`
LL | fn three<T: Copy + Debug, U: Debug>(t: T, u: U) -> Two<T, U> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `(T, T)`, got `(U, T)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably should do so in general, but that isn't bad enough for me to block this pr

@Aaron1011
Copy link
Member Author

@estebank: Do you have any objections to this change?

Copy link
Contributor

@estebank estebank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+100 from me. Left a couple of nitpicks. We need to be carefully going forward to not use this new span when we actually want to point at the body, but it should be fine.

src/librustc_middle/hir/map/mod.rs Outdated Show resolved Hide resolved
let body = self.parse_fn_body(attrs)?; // `;` or `{ ... }`.
Ok((ident, FnSig { header, decl }, generics, body))

let mut sig_hi = self.prev_token.span;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut sig_hi = self.prev_token.span;
let sig_hi = if self.check(&token::Semi) {
// Include the trailing semicolon in the span of the signature
self.token.span
} else {
self.prev_token.span
};

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will introduce a duplicate check for token::Semi - I wanted to make sure that the determination of sig_hi always stays in sync with the actual parsing logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, fair enough it's not critical

Comment on lines +1518 to +1519
// Include the trailing semicolon in the span of the signature
*sig_hi = self.token.span;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Include the trailing semicolon in the span of the signature
*sig_hi = self.token.span;

fn parse_fn_body(
&mut self,
attrs: &mut Vec<Attribute>,
sig_hi: &mut Span,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sig_hi: &mut Span,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above

@@ -10,11 +10,7 @@ note: `rec` defined here
LL | / fn rec<T>(mut it: T)
LL | | where
LL | | T: Iterator,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the safe approach will be to include them.

Comment on lines +17 to +18
LL | fn get_ctxt(&self) -> &'a Ctxt {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is slightly unfortunate given the message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should only highlight the part from { up to and including } and not the function signature itself?

LL | | }
| |_^ expected `(T, T)`, got `(U, T)`
LL | fn three<T: Copy + Debug, U: Debug>(t: T, u: U) -> Two<T, U> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `(T, T)`, got `(U, T)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard agree on all points.

@estebank
Copy link
Contributor

(Sorry about the delay in review, I've been having trouble staying on top of my notification volume lately.)

@Aaron1011
Copy link
Member Author

@estebank: I've addressed your comments. My concern with removing sig_hi: &mut Span is that computation of the def span could eventually get out of sync with the tokens that we actually consume.

@estebank
Copy link
Contributor

r? @estebank @bors r+

@bors
Copy link
Contributor

bors commented Aug 22, 2020

📌 Commit 8b886a32350739abe63a0f57cac0a7be0092b737 has been approved by estebank

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 22, 2020
@bors
Copy link
Contributor

bors commented Aug 22, 2020

⌛ Testing commit 8b886a32350739abe63a0f57cac0a7be0092b737 with merge f1e1f6d98a3a41673551f10c462a0c72be3db53e...

@bors
Copy link
Contributor

bors commented Aug 22, 2020

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Aug 22, 2020
Currently, the def span of a funtion encompasses the entire function
signature and body. However, this is usually unnecessarily verbose - when we are
pointing at an entire function in a diagnostic, we almost always want to
point at the signature. The actual contents of the body tends to be
irrelevant to the diagnostic we are emitting, and just takes up
additional screen space.

This commit changes the `def_span` of all function items (freestanding
functions, `impl`-block methods, and `trait`-block methods) to be the
span of the signature. For example, the function

```rust
pub fn foo<T>(val: T) -> T { val }
```

now has a `def_span` corresponding to `pub fn foo<T>(val: T) -> T`
(everything before the opening curly brace).

Trait methods without a body have a `def_span` which includes the
trailing semicolon. For example:

```rust
trait Foo {
    fn bar();
}```

the function definition `Foo::bar` has a `def_span` of `fn bar();`

This makes our diagnostic output much shorter, and emphasizes
information that is relevant to whatever diagnostic we are reporting.

We continue to use the full span (including the body) in a few of
places:

* MIR building uses the full span when building source scopes.
* 'Outlives suggestions' use the full span to sort the diagnostics being
  emitted.
* The `#[rustc_on_unimplemented(enclosing_scope="in this scope")]`
attribute points the entire scope body.
* The 'unconditional recursion' lint uses the full span to show
  additional context for the recursive call.

All of these cases work only with local items, so we don't need to
add anything extra to crate metadata.
@Aaron1011
Copy link
Member Author

I needed to bless the NLL tests.

@estebank: I've squashed the commits, and updated the UNCONDITIONAL_RECURSION lint to use the full body span. This preserves the current lint output, which shows useful additional context around the recursive call.

@estebank
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Aug 22, 2020

📌 Commit e3cd43e has been approved by estebank

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 22, 2020
@bors
Copy link
Contributor

bors commented Aug 23, 2020

⌛ Testing commit e3cd43e with merge d5ba3ef...

@bors
Copy link
Contributor

bors commented Aug 23, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: estebank
Pushing d5ba3ef to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 23, 2020
@bors bors merged commit d5ba3ef into rust-lang:master Aug 23, 2020
Aaron1011 added a commit to Aaron1011/rust that referenced this pull request Sep 19, 2020
This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Sep 19, 2020
Record `tcx.def_span` instead of `item.span` in crate metadata

This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Sep 19, 2020
Record `tcx.def_span` instead of `item.span` in crate metadata

This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Sep 19, 2020
Record `tcx.def_span` instead of `item.span` in crate metadata

This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Aaron1011 added a commit to Aaron1011/rust that referenced this pull request Sep 21, 2020
This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Sep 22, 2020
Record `tcx.def_span` instead of `item.span` in crate metadata

This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
ecstatic-morse added a commit to ecstatic-morse/rust that referenced this pull request Sep 22, 2020
Record `tcx.def_span` instead of `item.span` in crate metadata

This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Sep 23, 2020
Record `tcx.def_span` instead of `item.span` in crate metadata

This was missed in PR rust-lang#75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
Aaron1011 added a commit to Aaron1011/rust that referenced this pull request Sep 26, 2020
Currently, we serialize the same crate metadata for proc-macro crates as
we do for normal crates. This is quite wasteful - almost none of this
metadata is ever used, and much of it can't even be deserialized (if it
contains a foreign `CrateNum`).

This PR changes metadata encoding to skip encoding the majority of crate
metadata for proc-macro crates. Most of the `Lazy<[T]>` fields are left
completetly empty, while the non-lazy fields are left as-is.

Additionally, proc-macros now have a def span that does not include
their body. This was done for normal functions in rust-lang#75465, but was missed
for proc-macros.

As a result of this PR, we should only ever encode local `CrateNum`s
when encoding proc-macro crates. I've added a specialized serialization
impl for `CrateNum` to assert this.
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 26, 2020
…data, r=petrochenkov

Encode less metadata for proc-macro crates

Currently, we serialize the same crate metadata for proc-macro crates as
we do for normal crates. This is quite wasteful - almost none of this
metadata is ever used, and much of it can't even be deserialized (if it
contains a foreign `CrateNum`).

This PR changes metadata encoding to skip encoding the majority of crate
metadata for proc-macro crates. Most of the `Lazy<[T]>` fields are left
completetly empty, while the non-lazy fields are left as-is.

Additionally, proc-macros now have a def span that does not include
their body. This was done for normal functions in rust-lang#75465, but was missed
for proc-macros.

As a result of this PR, we should only ever encode local `CrateNum`s
when encoding proc-macro crates. I've added a specialized serialization
impl for `CrateNum` to assert this.
@cuviper cuviper added this to the 1.47.0 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants