Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow self-profiler to only record potentially costly arguments when argument recording is turned on #95689

Merged
merged 6 commits into from
Apr 16, 2022

Conversation

lqd
Copy link
Member

@lqd lqd commented Apr 5, 2022

As discussed on zulip with @wesleywiser, I'd like to record proc-macro expansions in the self-profiler, with some detailed data (per-expansion spans for example, to follow #95473).

At the same time, I'd also like to avoid doing expensive things when tracking a generic activity's arguments, if they were not specifically opted into the event filter mask, to allow the self-profiler to be used in hotter contexts.

This PR tries to offer:

  • a way to ensure a closure to record arguments will only be called in that situation, so that potentially costly arguments can still be recorded when needed. With the additional requirement that, if possible, it would offer a way to record non-owned data without adding many generic_activity_with_arg_{...}-style methods. This lead to the generic_activity_with_arg_recorder single entry-point, and the closure parameter would offer the new methods, able to be executed in a context where costly argument could be created without disturbing the profiled piece of code.
  • some facilities/patterns allowing to record more rustc specific data in this situation, without making rustc_data_structures where the self-profiler is defined, depend on other rustc crates (causing circular dependencies): in particular, spans. They are quite tricky to turn into strings (if the default Debug impl output does not match the context one needs them for), and since I'd also like to avoid the allocation there when arg recording is turned off today, that has turned into another flexibility requirement for the API in this PR (separating the span-specific recording into an extension trait). edit: I've removed this from the PR so that it's easier to review, and opened self-profiler: record spans for proc-macro expansions #95739.
  • allow for extensibility in the future: other ways to record arguments, or additional data attached to them could be added in the future (e.g. recording the argument's name as well as its data).

Some areas where I'd love feedback:

  • the API and names: the EventArgRecorder and its method for example. As well as the verbosity that comes from the increased flexibility.
  • if I should convert the existing generic_activity_with_arg{s} to just forward to generic_activity_with_arg_recorder + recorder.record_arg (or remove them altogether ? Probably not): I've used the new API in the simple case I could find of allocating for an arg that may not be recorded, and the rest don't seem costly.
  • whether this API should panic if no arguments were recorded by the user-provided closure (like this PR currently does: it seems like an error to use an API dedicated to record arguments but not call the methods to then do so) or if this should just record a generic activity without arguments ?
  • whether the record_arg function should be #[inline(always)], like the generic_activity_* functions ?

As mentioned, r? @wesleywiser following our recent discussion.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 5, 2022
@rust-highfive
Copy link
Collaborator

Some changes occured to rustc_codegen_gcc

cc @antoyo

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 5, 2022
@lqd
Copy link
Member Author

lqd commented Apr 6, 2022

I've extracted the SpannedEventArgRecorder from this PR for easier review, and opened another one to add it back and record the spans at the same time.

lqd added 6 commits April 7, 2022 15:47
This allows profiling costly arguments to be recorded only when `-Zself-profile-events=args` is on: using a closure that takes an `EventArgRecorder` and call its `record_arg` or `record_args` methods.
and so that it doesn't allocate unless event argument recording is turned on
Copy link
Member

@wesleywiser wesleywiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks so much @lqd!

@wesleywiser
Copy link
Member

Since this changes the self-profiler, I'm marking this as rollup=never to avoid influencing perf results on other PRs.

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Apr 14, 2022

📌 Commit 1906b7e has been approved by wesleywiser

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 14, 2022
@bors
Copy link
Contributor

bors commented Apr 14, 2022

⌛ Testing commit 1906b7e with merge d0d7abca51138e91c1ab1cf1a2d6475d05d5b57a...

@Dylan-DPC
Copy link
Member

@bors retry yield

@bors
Copy link
Contributor

bors commented Apr 14, 2022

⌛ Testing commit 1906b7e with merge f943e04a586eb976ccfbd720e88ad8179ca814ab...

@bors
Copy link
Contributor

bors commented Apr 14, 2022

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 14, 2022
@Dylan-DPC
Copy link
Member

@bors retry

@bors bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 14, 2022
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Apr 14, 2022
@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@rust-log-analyzer
Copy link
Collaborator

The job i686-msvc-1 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
Working directory is 'D:\a\rust\rust'
[command]"C:\Program Files\Git\bin\git.exe" version
git version 2.35.1.windows.2
##[endgroup]
Copying 'C:\Users\runneradmin\.gitconfig' to 'D:\a\_temp\a0b82286-110c-4bad-acd7-413a5be393bd\.gitconfig'
Temporarily overriding HOME='D:\a\_temp\a0b82286-110c-4bad-acd7-413a5be393bd' before making global git config changes
[command]"C:\Program Files\Git\bin\git.exe" config --global --add safe.directory D:\a\rust\rust
Deleting the contents of 'D:\a\rust\rust'
##[group]Initializing the repository
[command]"C:\Program Files\Git\bin\git.exe" init D:\a\rust\rust
---
Some tests failed in compiletest suite=ui-fulldeps mode=ui host=i686-pc-windows-msvc target=i686-pc-windows-msvc

error: test run failed!
status: exit code: 101
command: PATH="D:\a\rust\rust\build\i686-pc-windows-msvc\stage2\lib\rustlib\i686-pc-windows-msvc\lib;C:\Program Files (x86)\Windows Kits\10\bin\x64;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22000.0\x64;C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64;C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\bin\HostX64\x86;D:\a\rust\rust\build\i686-pc-windows-msvc\stage0-bootstrap-tools\i686-pc-windows-msvc\release\deps;D:\a\rust\rust\build\i686-pc-windows-msvc\stage0\bin;D:\a\rust\rust\ninja;D:\a\rust\rust\msys2\mingw32\bin;C:\hostedtoolcache\windows\Python\3.10.4\x64\Scripts;C:\hostedtoolcache\windows\Python\3.10.4\x64;C:\msys64\usr\bin;D:\a\rust\rust\sccache;C:\Program Files\MongoDB\Server\5.0\bin;C:\aliyun-cli;C:\vcpkg;C:\cf-cli;C:\Program Files (x86)\NSIS;C:\tools\zstd;C:\Program Files\Mercurial;C:\hostedtoolcache\windows\stack\2.7.5\x64;C:\cabal\bin;C:\ghcup\bin;C:\tools\ghc-9.2.2\bin;C:\Program Files\dotnet;C:\mysql\bin;C:\Program Files\R\R-4.1.3\bin\x64;C:\SeleniumWebDrivers\GeckoDriver;C:\Program Files (x86)\sbt\bin;C:\Program Files (x86)\GitHub CLI;C:\Program Files\Git\bin;C:\Program Files (x86)\pipx_bin;C:\hostedtoolcache\windows\go\1.17.8\x64\bin;C:\hostedtoolcache\windows\Python\3.7.9\x64\Scripts;C:\hostedtoolcache\windows\Python\3.7.9\x64;C:\hostedtoolcache\windows\Ruby\2.5.9\x64\bin;C:\tools\kotlinc\bin;C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\8.0.322-6\x64\bin;C:\npm\prefix;C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\wbin;C:\ProgramData\kind;C:\Program Files\Eclipse Foundation\jdk-8.0.302.8-hotspot\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\ProgramData\Chocolatey\bin;C:\Program Files\Docker;C:\Program Files\PowerShell\7;C:\Program Files\Microsoft\Web Platform Installer;C:\Program Files\dotnet;C:\Program Files\Microsoft SQL Server\130\Tools\Binn;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit;C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\130\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn;C:\Program Files (x86)\Microsoft SQL Server\150\DTS\Binn;C:\Program Files\nodejs;C:\Program Files\OpenSSL\bin;C:\Strawberry\c\bin;C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\ProgramData\chocolatey\lib\pulumi\tools\Pulumi\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files\CMake\bin;C:\ProgramData\chocolatey\lib\maven\apache-maven-3.8.5\bin;C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code;C:\Program Files\Microsoft SDKs\Service Fabric\Tools\ServiceFabricLocalClusterManager;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;C:\Program Files\GitHub CLI;C:\tools\php;C:\Program Files (x86)\sbt\bin;C:\SeleniumWebDrivers\ChromeDriver;C:\SeleniumWebDrivers\EdgeDriver;C:\Program Files\Amazon\AWSCLIV2;C:\Program Files\Amazon\SessionManagerPlugin\bin;C:\Program Files\Amazon\AWSSAMCLI\bin;C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\bin;C:\Program Files (x86)\Microsoft BizTalk Server;C:\Program Files\LLVM\bin;C:\Users\runneradmin\.dotnet\tools;C:\Users\runneradmin\.cargo\bin;C:\Users\runneradmin\AppData\Local\Microsoft\WindowsApps" "D:\\a\\rust\\rust\\build\\i686-pc-windows-msvc\\test\\ui-fulldeps\\stdio-from\\a.exe"
stdout: none
--- stderr -------------------------------
I/O error: operation failed to complete synchronously
thread 'main' panicked at 'assertion failed: child2.wait()?.success()', D:\a\rust\rust\src/test\ui-fulldeps\stdio-from.rs:50:5
------------------------------------------




failures:
    [ui] src/test\ui-fulldeps\stdio-from.rs

test result: FAILED. 65 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 26.21s

Build completed unsuccessfully in 0:36:31
make: *** [Makefile:72: ci-subset-1] Error 1
---
##[error]Process completed with exit code 2.
Post job cleanup.
[command]"C:\Program Files\Git\bin\git.exe" version
git version 2.35.1.windows.2
Copying 'C:\Users\runneradmin\.gitconfig' to 'D:\a\_temp\d573e6e1-108c-408a-9639-93a87f8b7aa0\.gitconfig'
Temporarily overriding HOME='D:\a\_temp\d573e6e1-108c-408a-9639-93a87f8b7aa0' before making global git config changes
[command]"C:\Program Files\Git\bin\git.exe" config --global --add safe.directory D:\a\rust\rust
[command]"C:\Program Files\Git\bin\git.exe" config --local --name-only --get-regexp core\.sshCommand
[command]"C:\Program Files\Git\bin\git.exe" submodule foreach --recursive "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"
Entering 'library/backtrace'

@lqd
Copy link
Member Author

lqd commented Apr 15, 2022

@bors retry

@Dylan-DPC
Copy link
Member

(was already retried, just that rust log analyzer got delayed while posting it)

@lqd
Copy link
Member Author

lqd commented Apr 15, 2022

Yes I noticed this afterwards, a 4-hour delay to post results is … a bit slow.

@bors
Copy link
Contributor

bors commented Apr 15, 2022

⌛ Testing commit 1906b7e with merge afde28b1c3c27b86f8ef6b89c23c37489fa605c1...

@Dylan-DPC
Copy link
Member

@bors retry yield to rollup

@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@bors
Copy link
Contributor

bors commented Apr 16, 2022

⌛ Testing commit 1906b7e with merge febce1f...

@bors
Copy link
Contributor

bors commented Apr 16, 2022

☀️ Test successful - checks-actions
Approved by: wesleywiser
Pushing febce1f to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 16, 2022
@bors bors merged commit febce1f into rust-lang:master Apr 16, 2022
@rustbot rustbot added this to the 1.62.0 milestone Apr 16, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (febce1f): comparison url.

Summary:

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 🎉 relevant improvements found
Regressions 😿
(primary)
Regressions 😿
(secondary)
Improvements 🎉
(primary)
Improvements 🎉
(secondary)
All 😿 🎉
(primary)
count1 0 0 0 3 0
mean2 N/A N/A N/A -1.1% N/A
max N/A N/A N/A -1.1% N/A

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. number of relevant changes

  2. the arithmetic mean of the percent change

@lqd lqd deleted the self-profiler branch April 16, 2022 16:47
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 22, 2022
self-profiler: record spans for proc-macro expansions

This PR is a follow-up to rust-lang#95473, using the arg recorder feature from rust-lang#95689:
- it adds support code to easily record spans in the event's arguments, when using `generic_activity_with_arg_recorder`.
- uses that to record the spans where proc-macro expansions happen in addition to their name.

As for the other 2 PRs, the goal here is to provide visibility into proc-macro expansion performance, so that users can diagnose which uses of proc-macros in their code could be causing compile time issues.

Some areas where I'd love feedback:
- [x] the API and names: the `SpannedEventArgRecorder` trait and its method, much like rust-lang#95689 had the same question about the `EventArgRecorder` naming
- [x] we don't currently have a way to record the names of the event arguments, so should `record_arg_spanned` record the span as "location: {}" or similar ?
antoyo pushed a commit to antoyo/rust that referenced this pull request Jun 7, 2022
Allow self-profiler to only record potentially costly arguments when argument recording is turned on

As discussed [on zulip](https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/Identifying.20proc-macro.20slowdowns/near/277304909) with `@wesleywiser,` I'd like to record proc-macro expansions in the self-profiler, with some detailed data (per-expansion spans for example, to follow rust-lang#95473).

At the same time, I'd also like to avoid doing expensive things when tracking a generic activity's arguments, if they were not specifically opted into the event filter mask, to allow the self-profiler to be used in hotter contexts.

This PR tries to offer:
- a way to ensure a closure to record arguments will only be called in that situation, so that potentially costly arguments can still be recorded when needed. With the additional requirement that, if possible, it would offer a way to record non-owned data without adding many `generic_activity_with_arg_{...}`-style methods. This lead to the `generic_activity_with_arg_recorder` single entry-point, and the closure parameter would offer the new methods, able to be executed in a context where costly argument could be created without disturbing the profiled piece of code.
- some facilities/patterns allowing to record more rustc specific data in this situation, without making `rustc_data_structures`  where the self-profiler is defined, depend on other rustc crates (causing circular dependencies): in particular, spans. They are quite tricky to turn into strings (if the default `Debug` impl output does not match the context one needs them for), and since I'd also like to avoid the allocation there when arg recording is turned off today, that has turned into another flexibility requirement for the API in this PR (separating the span-specific recording into an extension trait). **edit**: I've removed this from the PR so that it's easier to review, and opened rust-lang#95739.
- allow for extensibility in the future: other ways to record arguments, or additional data attached to them could be added in the future (e.g. recording the argument's name as well as its data).

Some areas where I'd love feedback:
- the API and names: the `EventArgRecorder` and its method for example. As well as the verbosity that comes from the increased flexibility.
- if I should convert the existing `generic_activity_with_arg{s}` to just forward to `generic_activity_with_arg_recorder` + `recorder.record_arg` (or remove them altogether ? Probably not): I've used the new API in the simple case I could find of allocating for an arg that may not be recorded, and the rest don't seem costly.
- [x] whether this API should panic if no arguments were recorded by the user-provided closure (like this PR currently does: it seems like an error to use an API dedicated to record arguments but not call the methods to then do so) or if this should just record a generic activity without arguments ?
- whether the `record_arg` function should be `#[inline(always)]`, like the `generic_activity_*` functions ?

As mentioned, r? `@wesleywiser` following our recent discussion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants