Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

un-revert "Set RUSTC and RUSTDOC env for child processes run through the proxy" #3035

Open
rbtcollins opened this issue Jul 12, 2022 · 16 comments
Milestone

Comments

@rbtcollins
Copy link
Contributor

rbtcollins commented Jul 12, 2022

Problem you are trying to solve

We had some nice performance enhancements but hadn't caught an edge case well enough (which has led to a reversion). Lets support that and then re-enable the code

Solution you'd like

Robust automatic detection of different toolchain selection in recursive rustup proxy use. Nothing special needed by callers, except passing through variables

Notes

@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

There are two broad approaches to take.

  1. This issue is not actually specific to rustup. The code using cargo +toolchain in nested invocations would be broken by anyone who set $RUSTC to a binary that's not a rustup proxy. Require people to change their code (with a blog post, documentation, better error, etc).
  2. Even though this is not really rustup's fault, in practice it causes too much breakage for us to land. Change rustup to be smarter about nested invocations somehow, in a way that allows you to set RUSTC manually but doesn't break people's existing code.

  1. is simple to understand and easy to implement and causes lots of breakage.
  2. is hard to understand and hard to implement (there are multiple ways to implement, but none will be simple) and avoids breakage. However, it runs the risk of having a "long tail" of issues where we find that rustup isn't quite smart enough, and we break nested invocations in some rare circumstance we haven't thought of.

@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

there's also a third option which is just not to reland this and eat the performance hit. I would rather not take that approach.

@rbtcollins
Copy link
Contributor Author

My take on this, there are two layers: usability for folk building things with rustup in play, and robustness of implementation.

For folk building things with rustup, we can expect that the default position for everyone will be that they assume rustup is stateless, and are surprised when it isn't.

This is already an issue on e.g. Windows, where we literally copy a copy of cargo around in some custom toolchain cases (

// MAJOR HACKS: Copy cargo.exe to its own directory on windows before
)

So to my mind, we want the default behaviour to be safe: it should work, it shouldn't need changes in their code, and it should not result in bug reports or people creating gists to document how to make it work.

Then, we have a choice: is higher performance opt-in (do something special and wheee it gets faster), or can we make it higher performance by default.

Now the robustness of implementation thoughts:

We setup RUSTC etc based on toolchain; toolchain is based on CWD (rust-toolchain{.toml}), CLI parameters (+toolchain), environment variables (RUSTUP_TOOLCHAIN). If we run all those inputs through an HMAC, including the calculated value for any variables we set, and export the result as an environment variable, then we have the following cases (when a new toolchain is being selected in an inner proxy):

  1. process passes all variables through without modification, and sets toolchain itself somehow (e.g. +nightly, or perhaps RUSTUP_TOOLCHAIN).
  2. process has a whitelist of variables it passes through - RUSTC, PATH
  3. process strips all variables except for PATH/LD_LIBRARY_PATH

In case 1, we will be able to detect the change of toolchain and setup everything to work correctly
In case 2, we won't see our HMAC, but we will see RUSTC set. We could perhaps introspect that to see if its a known toolchain, and use that to trigger a warning or error or something. Either way, this is clearly a 'we thought about it and decided to do X', so in that case having to follow our protocol is reasonable. (add RUSTUP_VARIABLE_HASH or whatever to the whitelist)
In case 3, RUSTC won't be passed, the optimisation won't work, but it will consistently run the rustc for the requested toolchain.

@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

There is a fourth case: process sets RUSTC itself. If this is different than the one rustup decided on, we'll detect the change and everything is ok (we'll keep the one the process chose). If it's the same as the one rustup decided on, we won't know and we'll overwrite it.

In practice I think it will be super rare to run RUSTC=$(rustup which rustc --toolchain $RUSTUP_TOOLCHAIN) cargo build or whatever, but it is worth noting.

@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

I think if we're going to spend time on this we should also spend some time making an MCVE so we can test the behavior of our changes in different scenarios.

@rbtcollins
Copy link
Contributor Author

yes, agreed - thats what I meant by 'RUSTC might be pointing into one of our toolchains'.

MCVE?

@jyn514
Copy link
Member

jyn514 commented Jul 12, 2022

"minimum complete viable example" - just a way to reproduce this without spending 10 minutes compiling the projects in the original issue

@bstrie
Copy link

bstrie commented Jul 14, 2022

For folk building things with rustup, we can expect that the default position for everyone will be that they assume rustup is stateless, and are surprised when it isn't.

I think most people using rustup will be at least mildly aware that rustup has state, due to the prominence of the rustup default command. Perhaps this can be resolved by adding an opt-in setting that gets persisted to rustup's settings.toml? During new installs, the interactive rustup installer could even prompt to ask which behavior you want. Someday in the future, it might even be feasible to switch the default and make this opt-out rather than opt-in.

@jyn514
Copy link
Member

jyn514 commented Jul 14, 2022

I think adding a persistent setting for this seems reasonable, yeah.

I think most people using rustup will be at least mildly aware that rustup has state, due to the prominence of the rustup default command.

I expect people to know that rustup has state around installed and default toolchains, yes. But I think it's much more unlikely for people to know rustup has state within a single invocation of a rustup proxy.

@Manishearth
Copy link
Member

@jyn514 I suspect something with cargo-make that invokes a bash script that calls cargo +version build would be an MCVE, ideally one using some unstable stuff

@rbtcollins
Copy link
Contributor Author

re statefulness - I meant within a single invocation, not that there isn't state present at all. But even there - many users have not ever encountered 'default' - the installer takes care of that, and we've had to explain more than once that misconfigurations can be self corrected using that feature

@CAD97
Copy link

CAD97 commented Jul 25, 2022

(deleted; #3036)

@CAD97
Copy link

CAD97 commented Jul 25, 2022

One thing that might be desirable is to tell people to prefer rustup run toolchain cargo from automation tooling instead of cargo +toolchain, and then rustup run can handle fixing up mixed toolchain environment variables. This is essentially part of option 1 (require people to change their code), but combining handling the common case better with informing authors who want to be maximally resilient of a better option can perhaps get some of the benefit of both options.

@davidlattimore
Copy link

Does anyone here have any thoughts on fixing this by changing Cargo to invoke tools in the same directory as the cargo binary in preference to using PATH? rust-lang/cargo#10986

@CAD97
Copy link

CAD97 commented Aug 17, 2022

A promising approach: rust-lang/cargo#10986 (comment)

@CAD97
Copy link

CAD97 commented Aug 17, 2022

Because I just thought about it, there may be a fairly simple rustup-only solution: instead of setting $RUSTC, set e.g. $RUSTUP_RESOLVED_TOOLCHAIN1, with the rustup proxies reading that instead of doing the filesystem access and parsing. It's still not a zero overhead (as it would be to skip the proxies altogether), but IIUC it avoids a meaningful portion of the rustup cost if it can avoid parsing configs again.

Footnotes

  1. E.g. set directly to the toolchain bins folder.

ChrisDenton added a commit to ChrisDenton/rust that referenced this issue Aug 20, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc `@davidlattimore` `@bjorn3` `@rust-lang/cargo` `@rust-lang/compiler` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc `@davidlattimore` `@bjorn3` `@rust-lang/cargo` `@rust-lang/compiler` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc ``@davidlattimore`` ``@bjorn3`` ``@rust-lang/cargo`` ``@rust-lang/compiler`` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc ```@davidlattimore``` ```@bjorn3``` ```@rust-lang/cargo``` ```@rust-lang/compiler``` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc ````@davidlattimore```` ````@bjorn3```` ````@rust-lang/cargo```` ````@rust-lang/compiler```` (sorry for the big ping list 😅)
@rami3l rami3l added this to the On Deck milestone Jun 10, 2024
@rami3l rami3l changed the title un-revert #2958 un-revert "Set RUSTC and RUSTDOC env for child processes run through the proxy" Jul 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants