-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(Big performance change) Do not run lints that cannot emit #125116
base: master
Are you sure you want to change the base?
Conversation
rustbot has assigned @michaelwoerister. Use |
These commits modify the If this was unintentional then you should revert the changes before this PR is merged. Some changes occurred in src/tools/clippy cc @rust-lang/clippy |
cc @nnethercote @Kobzol, the perf wizards. Could you please give this PR a look and tell me if there are any obvious performance issues on the filtering? |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
…r=<try> (Big performance change) Do not run lints that cannot emit Before this lint, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had `#![allow]`ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!) So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either: - Manually `#![allow]`ed in the whole crate, - Allowed in the command line, or - Not manually enabled with `#[warn]` or similar, and its default level is `Allow` I open this PR to receive some feedback, mainly related to performance. We have lots of `Lock`s, `with_lock` and similar functions (also lots of cloning), so the filtering performance is not the best. In an older iteration, instead of doing this in the parsing phase, we developed a visitor with the same function but without so many locks, would reverting to that change help? I'm not sure tbh.
This comment has been minimized.
This comment has been minimized.
@lqd haven't you tried something like this before? 🤔 |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
We've tried a few different things yes, and so has blyxyas -- it maybe wasn't exactly like this, but I encountered annoyances like: some slow const eval loadbearing lint that shouldn't be ignored, lints that would be allowed unexpectedly because cargo allows lints unconditionally on dependencies (arguably the most common usage, and where perf gains would show up AFAICT) but some may trigger FCWs or are required to lint on dependencies despite being allowed, et cetera. Refactoring and fixing all these were too costly compared to the gains at the time, as rustc's lints were fast enough on dependencies, also a "rarer" use-case. That being said, we've added and uplifted more lints since then, including possibly costly ones like the non local impls one, and the situation may also be different for clippy itself (but we won't see that in the perf.rlo results, only locally with the clippy dedicated commands IIUC) |
Finished benchmarking commit (cc1d40f): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary 2.3%, secondary -0.5%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary 2.4%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 676.788s -> 676.098s (-0.10%) |
The benchmark doesn't check clippy, right? As lqd hinted at as well? And without splitting allow-by-default rustc lints it does nothing without clippy, so I think this just shows how much time it takes to filter them (can someone else confirm this :3c) Thus, basically nothing it seems :3 (So @blyxyas maybe the cloning is ok?) |
Yeah, the benchmarks currently doesn't check Clippy, that's why I'm currently benchmarking on a different server via SSH (A server that we got explicitly to benchmark Clippy). I'll post the results here when they arrive :) Also, it currently doesn't check builtin lints because I'm having some issues checking that. That's also part of why I decided to open the PR, maybe someone has some idea (I'll see if I can read the previous attempts by lqd, maybe I can learn something from them) EDIT: Seems like lqd hasn't pushed their attempts, I'll have to keep trying new approaches by myself. |
Okis, here are the results (Wall time, Clippy) Wall time❌ Max RSS❌ Instructions❌ Cycles[ +0.42%, +25.48%] |
Those wall times are proof that this optimization has a lot of potential, the main drawback is that the filtering / parsing code is not fast enough, so in some scenarios that I'm not really able to determine exactly what do they have in common, the optimization goes backwards. But a ~70% in Wall time, that's great and we should look more into it. |
I wouldn't draw too many conclusions from these results, they seem to be quite unstable (there is also a 190% walltime regression). Note that even for PRs that don't have large perf. impacts, we can see ~30% walltime swings even on the stable benchmarking server (https://perf.rust-lang.org/compare.html?start=9e7aff794539aa040362f4424eb29207449ffce0&end=44fa5fd39a1d2af41bd7f43bc246a5e4f6d94696&stat=wall-time&nonRelevant=true). |
I've changed the system, we're back to using visitors (I've benchmarked this new commit, it should have 0 regressions and about -0.66% improvement) @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (50e9dfe): comparison URL. Overall result: ❌ regressions - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (secondary 3.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary 3.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 750.76s -> 750.368s (-0.05%) |
I've done some benchmarks on the new commit on Clippy on server Summary
|
Benchmark | Profile | Scenario | % Change | Significance Factor |
---|---|---|---|---|
webrender-2022 | clippy | incr-full | -4.66% | 23.28x |
webrender-2022 | clippy | full | -4.19% | 20.97x |
ripgrep-13.0.0 | clippy | incr-unchanged | -3.64% | 18.21x |
ripgrep-13.0.0 | clippy | incr-patched: println | -3.48% | 17.38x |
regex-1.5.5 | clippy | incr-patched: Job | -2.92% | 14.59x |
unicode-normalization-0.1.19 | clippy | incr-full | -2.89% | 14.46x |
syn-1.0.89 | clippy | full | -2.59% | 12.94x |
clap-3.1.6 | clippy | full | -2.54% | 12.68x |
syn-1.0.89 | clippy | incr-full | -2.52% | 12.62x |
unicode-normalization-0.1.19 | clippy | incr-patched: println | -2.44% | 12.18x |
clap-3.1.6 | clippy | incr-full | -2.40% | 11.98x |
cargo-0.60.0 | clippy | incr-full | -2.33% | 11.65x |
unicode-normalization-0.1.19 | clippy | incr-unchanged | -2.24% | 11.18x |
cargo-0.60.0 | clippy | full | -2.23% | 11.14x |
unicode-normalization-0.1.19 | clippy | full | -2.03% | 10.13x |
ripgrep-13.0.0 | clippy | full | -1.97% | 9.83x |
serde_derive-1.0.136 | clippy | full | -1.94% | 9.69x |
ripgrep-13.0.0 | clippy | incr-full | -1.93% | 9.63x |
regex-1.5.5 | clippy | incr-full | -1.91% | 9.54x |
serde_derive-1.0.136 | clippy | incr-full | -1.91% | 9.53x |
diesel-1.4.8 | clippy | incr-patched: println | -1.81% | 9.07x |
cranelift-codegen-0.82.1 | clippy | incr-full | -1.74% | 8.68x |
cranelift-codegen-0.82.1 | clippy | full | -1.55% | 7.73x |
diesel-1.4.8 | clippy | incr-unchanged | -1.50% | 7.49x |
exa-0.10.1 | clippy | full | -1.40% | 6.99x |
exa-0.10.1 | clippy | incr-full | -1.36% | 6.80x |
hyper-0.14.18 | clippy | full | -1.25% | 6.24x |
regex-1.5.5 | clippy | full | -1.23% | 6.14x |
diesel-1.4.8 | clippy | full | -1.19% | 5.93x |
diesel-1.4.8 | clippy | incr-full | -1.18% | 5.89x |
image-0.24.1 | clippy | incr-full | -1.16% | 5.79x |
html5ever-0.26.0 | clippy | full | -1.13% | 5.65x |
hyper-0.14.18 | clippy | incr-unchanged | 1.09% | 5.43x |
bitmaps-3.1.0 | clippy | incr-full | -1.05% | 5.27x |
cargo-0.60.0 | clippy | incr-patched: println | -1.01% | 5.03x |
image-0.24.1 | clippy | full | -0.98% | 4.90x |
cranelift-codegen-0.82.1 | clippy | incr-patched: println | -0.98% | 4.90x |
regex-1.5.5 | clippy | incr-patched: is valid cap letter | -0.92% | 4.58x |
bitmaps-3.1.0 | clippy | full | -0.90% | 4.49x |
exa-0.10.1 | clippy | incr-patched: printlns | -0.88% | 4.39x |
html5ever-0.26.0 | clippy | incr-full | -0.78% | 3.89x |
image-0.24.1 | clippy | incr-patched: println | -0.75% | 3.73x |
stm32f4-0.14.0 | clippy | incr-patched: negate | -0.64% | 3.20x |
webrender-2022 | clippy | incr-patched: println | -0.62% | 3.08x |
clap-3.1.6 | clippy | incr-patched: println | -0.61% | 3.07x |
regex-1.5.5 | clippy | incr-patched: byte frequencies | 0.59% | 2.93x |
helloworld | clippy | incr-unchanged | 0.55% | 2.74x |
bitmaps-3.1.0 | clippy | incr-patched: println | 0.53% | 2.64x |
cargo-0.60.0 | clippy | incr-unchanged | -0.50% | 2.51x |
helloworld | clippy | incr-patched: println | 0.47% | 2.34x |
typenum-1.17.0 | clippy | incr-patched: add fn | -0.44% | 2.18x |
helloworld | clippy | incr-full | 0.41% | 2.03x |
serde-1.0.136 | clippy | incr-patched: println | -0.39% | 1.96x |
serde-1.0.136 | clippy | incr-unchanged | -0.39% | 1.95x |
libc-0.2.124 | clippy | incr-full | -0.36% | 1.78x |
syn-1.0.89 | clippy | incr-patched: println | -0.35% | 1.76x |
regex-1.5.5 | clippy | incr-patched: compile one | -0.35% | 1.74x |
libc-0.2.124 | clippy | incr-patched: clone | -0.34% | 1.68x |
serde-1.0.136 | clippy | full | -0.32% | 1.61x |
exa-0.10.1 | clippy | incr-unchanged | -0.30% | 1.52x |
hyper-0.14.18 | clippy | incr-patched: println | 0.30% | 1.51x |
Secondary benchmarks
Benchmark | Profile | Scenario | % Change | Significance Factor |
---|---|---|---|---|
deep-vector | clippy | full | 6.38% | 31.92x |
regression-31157 | clippy | full | -4.89% | 24.47x |
regression-31157 | clippy | incr-full | -4.75% | 23.74x |
unused-warnings | clippy | incr-full | -1.94% | 9.72x |
wg-grammar | clippy | full | -1.76% | 8.79x |
ripgrep-13.0.0-tiny | clippy | full | -1.74% | 8.70x |
tuple-stress | clippy | incr-unchanged | -1.60% | 7.99x |
ucd | clippy | incr-unchanged | -1.51% | 7.54x |
wg-grammar | clippy | incr-full | -1.47% | 7.33x |
issue-46449 | clippy | incr-unchanged | 1.29% | 6.43x |
coercions | clippy | incr-full | -1.28% | 6.38x |
unused-warnings | clippy | full | -1.23% | 6.13x |
deeply-nested-multi | clippy | incr-full | -1.17% | 5.86x |
deeply-nested-multi | clippy | full | -1.17% | 5.84x |
issue-46449 | clippy | incr-full | 0.97% | 4.87x |
regression-31157 | clippy | incr-patched: println | -0.94% | 4.72x |
tt-muncher | clippy | incr-unchanged | 0.89% | 4.45x |
coercions | clippy | incr-unchanged | -0.86% | 4.30x |
ctfe-stress-5 | clippy | incr-full | -0.81% | 4.05x |
coercions | clippy | full | -0.78% | 3.88x |
unused-warnings | clippy | incr-patched: dummy fn | -0.74% | 3.68x |
regression-31157 | clippy | incr-unchanged | -0.73% | 3.65x |
unused-warnings | clippy | incr-unchanged | 0.71% | 3.54x |
tt-muncher | clippy | full | 0.67% | 3.36x |
externs | clippy | incr-full | -0.67% | 3.35x |
wf-projection-stress-65510 | clippy | incr-unchanged | 0.63% | 3.17x |
match-stress | clippy | incr-unchanged | -0.63% | 3.15x |
issue-88862 | clippy | full | -0.60% | 2.99x |
unify-linearly | clippy | incr-patched: dummy fn | -0.59% | 2.94x |
many-assoc-items | clippy | incr-full | -0.58% | 2.88x |
ctfe-stress-5 | clippy | full | -0.56% | 2.78x |
tuple-stress | clippy | incr-full | -0.55% | 2.76x |
issue-46449 | clippy | incr-patched: empty 3072 | 0.55% | 2.75x |
ucd | clippy | incr-full | -0.55% | 2.73x |
deeply-nested-multi | clippy | incr-unchanged | -0.54% | 2.71x |
tuple-stress | clippy | incr-patched: new row | -0.54% | 2.69x |
issue-88862 | clippy | incr-full | -0.53% | 2.64x |
many-assoc-items | clippy | incr-unchanged | -0.51% | 2.56x |
helloworld-tiny | clippy | full | 0.48% | 2.42x |
coercions | clippy | incr-patched: add static arr item | -0.47% | 2.37x |
issue-46449 | clippy | incr-patched: io error 6144 | 0.46% | 2.29x |
unify-linearly | clippy | incr-unchanged | 0.44% | 2.20x |
issue-46449 | clippy | incr-patched: u32 3072 | 0.41% | 2.06x |
many-assoc-items | clippy | full | -0.40% | 1.99x |
issue-46449 | clippy | incr-patched: u8 3072 | 0.39% | 1.95x |
ucd | clippy | full | -0.38% | 1.90x |
externs | clippy | incr-unchanged | 0.37% | 1.85x |
tt-muncher | clippy | incr-full | 0.35% | 1.77x |
issue-46449 | clippy | incr-patched: static str 6144 | 0.34% | 1.72x |
derive | clippy | incr-full | -0.32% | 1.61x |
tuple-stress | clippy | full | -0.32% | 1.59x |
Polish marramiau miau miau 🎉 meowmeowmeow Removing unused files Meow format Make lint level minimums parallel Fix author lint + stop using symbols Move from symbols to strings + fix things meow sync Remove crate-checking code Prepare for WIP PR opening m Turn back to visitor Add loadbearing lints + Move filtering code to lint_crate Move lint filter to check_mod Also handle lint groups + Should work
8c1bf71
to
b5ac9b5
Compare
I've done some more benchmarking, we have some very good numbers. I think this is ready to start looking into merging |
The code still contains sections of commented out code and has some unaddressed review comments. Would you mind doing a clean up pass over everything? |
66b4653
to
bdf21a4
Compare
@rustbot ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why all the clippy test changes?
Could you squash the commits, or splitting the history in 2 or 3 self-contained commits?
@@ -20,6 +20,7 @@ | |||
//! If you define a new `LateLintPass`, you will also need to add it to the | |||
//! `late_lint_methods!` invocation in `lib.rs`. | |||
|
|||
use std::default::Default; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this in prelude?
// use std::fmt::Write; | ||
|
||
// hardwired lints from rustc_lint_defs | ||
// pub use rustc_session::lint::builtin::*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented-out code.
@@ -439,7 +439,8 @@ declare_tool_lint! { | |||
pub rustc::UNTRANSLATABLE_DIAGNOSTIC, | |||
Deny, | |||
"prevent creation of diagnostics which cannot be translated", | |||
report_in_external_macro: true | |||
report_in_external_macro: true, | |||
[eval_always: true] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the syntax be closer to other modifiers?
let mut passes: Vec<_> = late_module_passes.iter().map(|mk_pass| (mk_pass)(tcx)).collect(); | ||
passes.push(Box::new(builtin_lints)); | ||
let pass = RuntimeCombinedLateLintPass { passes: &mut passes[..] }; | ||
let chained_box = Box::new(builtin_lints) as Box<dyn LateLintPass<'tcx>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let chained_box = Box::new(builtin_lints) as Box<dyn LateLintPass<'tcx>>; | |
let builtin_lints = Box::new(builtin_lints) as Box<dyn LateLintPass<'tcx>>; |
?
// .any(|lint| | ||
// !lints_that_dont_need_to_run.contains(&LintId::of(lint))) | ||
// }).collect(); | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented-out code.
@@ -218,7 +221,7 @@ macro_rules! declare_combined_early_lint_pass { | |||
|
|||
$v fn get_lints() -> $crate::LintVec { | |||
let mut lints = Vec::new(); | |||
$(lints.extend_from_slice(&$pass::get_lints());)* | |||
$(lints.extend_from_slice(&$pass::default().get_lints());)* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$(lints.extend_from_slice(&$pass::default().get_lints());)* | |
$(lints.extend_from_slice(&$pass::get_lints());)* |
} | ||
impl $ty { | ||
pub fn get_lints() -> $crate::LintVec { vec![$($lint),*] } | ||
fn get_lints(&self) -> $crate::LintVec { vec![$($lint),*] } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still relevant.
fn get_lints(&self) -> LintVec { | ||
(**self).get_lints() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still relevant.
You'll see it when you get the warning.", | ||
version: Some("1.35.0"), | ||
location: "clippy_lints/src/cognitive_complexity.rs#L47", | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
use rustc_span::Span; | ||
|
||
/// Ensures that Constant-time Function Evaluation is being done (specifically, MIR lint passes). | ||
/// See rust-lang/rust#125116 for more info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are >100 comments in the PR. Could you summarize the question here?
Before this lint, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had
#![allow]
ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!)So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either:
#![allow]
ed in the whole crate,#[warn]
or similar, and its default level isAllow
As some lints need to run, this PR also adds loadbearing lints. On a lint declaration, you can use the
[loadbearing: true]
marker to label it as loadbearing. A loadbearing lint will never be filtered.Phase 1/2 Not all lints are being filtered, I'm still working on it, but this branch still gives us about a 2% improvement, so why not merge it already.
Fixes #106983