Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add local-only preprocessor cache AKA "direct mode" #1882

Merged
merged 15 commits into from
Oct 30, 2023

Conversation

Alphare
Copy link
Contributor

@Alphare Alphare commented Sep 11, 2023

Relates to
#219 #160 #497

@sylvestre
Copy link
Collaborator

@Alphare could you please write a trivial PR (typo fix, spaces, etc?) so that I don't have to approve the CI runs in the future? :)
thanks

@Alphare
Copy link
Contributor Author

Alphare commented Oct 11, 2023

@Alphare could you please write a trivial PR (typo fix, spaces, etc?) so that I don't have to approve the CI runs in the future? :) thanks

Ah sorry, not used to working with Github too much, Mozilla's workflow even less. Will do quickly, thanks for the heads up. :)

@codecov-commenter
Copy link

codecov-commenter commented Oct 11, 2023

Codecov Report

Attention: 821 lines in your changes are missing coverage. Please review.

Comparison is base (e911999) 29.70% compared to head (738ad92) 29.81%.

❗ Current head 738ad92 differs from pull request most recent head 194b7dd. Consider uploading reports for the commit 194b7dd to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1882      +/-   ##
==========================================
+ Coverage   29.70%   29.81%   +0.10%     
==========================================
  Files          50       51       +1     
  Lines       18022    19128    +1106     
  Branches     8703     9237     +534     
==========================================
+ Hits         5354     5703     +349     
- Misses       7491     7960     +469     
- Partials     5177     5465     +288     
Files Coverage Δ
src/compiler/diab.rs 49.86% <0.00%> (-0.14%) ⬇️
src/compiler/msvc.rs 46.91% <0.00%> (-0.16%) ⬇️
src/compiler/nvcc.rs 41.02% <0.00%> (-0.16%) ⬇️
src/compiler/nvhpc.rs 33.83% <0.00%> (-0.26%) ⬇️
src/compiler/tasking_vx.rs 43.52% <0.00%> (-0.12%) ⬇️
src/compiler/clang.rs 53.74% <25.00%> (-0.22%) ⬇️
src/test/tests.rs 34.86% <50.00%> (+1.53%) ⬆️
src/cmdline.rs 15.54% <0.00%> (-0.44%) ⬇️
src/compiler/rust.rs 32.07% <33.33%> (-0.09%) ⬇️
src/config.rs 28.28% <16.66%> (-1.70%) ⬇️
... and 8 more

... and 7 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Alphare Alphare force-pushed the direct-mode branch 5 times, most recently from f015f48 to 8b0671b Compare October 11, 2023 16:28
@Alphare
Copy link
Contributor Author

Alphare commented Oct 11, 2023

Aside from a couple of grunt-work TODOs, this should be good enough to review. I'd love to get some feedback while I address those + add the user-level docs so that we don't get too much round-trip lag. :)

@Alphare Alphare changed the title WIP add local-only direct mode Add local-only direct mode Oct 11, 2023
@Alphare Alphare marked this pull request as ready for review October 11, 2023 16:46
src/compiler/c.rs Outdated Show resolved Hide resolved
src/config.rs Outdated
};

if env::var("SCCACHE_DIRECT").is_ok() && env::var("SCCACHE_NODIRECT").is_ok() {
warn!("Both `SCCACHE_DIRECT` && `SCCACHE_NODIRECT` are defined");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
warn!("Both `SCCACHE_DIRECT` && `SCCACHE_NODIRECT` are defined");
warn!("Conflicting options: Both `SCCACHE_DIRECT` && `SCCACHE_NODIRECT` are defined");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho we should prefer a single env var name and use its value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me. I don't think accepted boolean values are unified across sccache. Usually (1, on, true, 0, off, false), but maybe we want to be stricter (and complain when we don't get a valid value)?

Copy link
Collaborator

@drahnr drahnr Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was a clap extension / or impl. to handle the different variants which should be used for all of these features long term, but that'd be a breaking change. Adding new features / envs being interpreted should assume that uniform version https://docs.rs/clap/latest/clap/builder/struct.BoolishValueParser.html , at least that would simplify behavior.

@@ -0,0 +1,743 @@
# 0 "tests/test.c"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to have a clang preproc file too ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this particular test, preprocessor output was identical. I'm not exactly sure how complex the test would need to grow to get a different preprocessor output. Does anyone have an example? I'll try to look into it if not.

@Xuanwo
Copy link
Collaborator

Xuanwo commented Oct 13, 2023

Is it possible to make direct mode work of storage backend like s3 and gcs?

docs/Configuration.md Outdated Show resolved Hide resolved
docs/Configuration.md Outdated Show resolved Hide resolved
docs/Configuration.md Outdated Show resolved Hide resolved
docs/Local.md Outdated Show resolved Hide resolved
src/compiler/manifest.rs Outdated Show resolved Hide resolved
src/compiler/manifest.rs Outdated Show resolved Hide resolved
src/util.rs Show resolved Hide resolved
Copy link
Collaborator

@drahnr drahnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code quality and documentation is very good! There is a range of minor nits that should be addressed before merging.
Thank you for your PR!

src/util.rs Outdated
/// plus MAX_HAYSTACK_LEN bytes of the previous chunk, to account for
/// the possibility of partial reads splitting a time macro
/// across two calls.
previous_small_read: Option<Vec<u8>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes should go in the previous commit rather than this one. I wonder if it wouldn't be simpler to use a single buffer instead of overlap_buffer + previous_small_read. (I'll also note an Option<Vec<T>> is kind of pointless if you don't expect there to be a Some(vec![]).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes should go in the previous commit rather than this one.

Good catch.

I wonder if it wouldn't be simpler to use a single buffer instead of overlap_buffer + previous_small_read.

IMO it would make the logic harder to read.

(I'll also note an Option<Vec<T>> is kind of pointless if you don't expect there to be a Some(vec![]).

True. While it doesn't matter in terms of performance, I'll switch to just a Vec<u8>, should make the code simpler.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it would make the logic harder to read.

I'm not sure it would. I'm having a hard time convincing myself that the current one is correct with all the dancing around between both buffers.

@@ -173,7 +173,7 @@ impl Storage for DiskCache {
) -> Result<()> {
let key = normalize_key(key);
let mut buf = vec![];
preprocessor_cache_entry.write(&mut buf)?;
preprocessor_cache_entry.serialize_to(&mut buf)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird that you do a mix of amends and additional commits to fixup things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry, I usually always to amends, but I was starting to take too much time to do history editing. I should probably stick to new commits in this case, unless otherwise asked (like with the time macros thing above).

@@ -435,6 +437,11 @@ where
},
};

too_hard_for_direct_mode = match arg.flag_str() {
Some(s) => s == "-Xpreprocessor",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably add -Wp in ARGS. Probably something like take_arg!("-Wp", OsString, Concatenated(','), PreprocessorArgument)

I guess you could put this check for -Xpreprocessor/-Wp where ProcessorArgument are handled, rather than for all args. Then there's the question about the second loop for Xclangs, and, how @-files are handled with direct mode.

@@ -560,7 +560,7 @@ mod test {
let mut buf = vec![0; HASH_BUFFER_SIZE * 2];
// Make the pattern overlap two buffer chunks to make sure we account for this
let start = HASH_BUFFER_SIZE - MAX_TIME_MACRO_HAYSTACK_LEN / 2;
buf[start..start + b"__TIMESTAMP__".len()].copy_from_slice(b"__TIMESTAMP__");
buf[start..][..b"__TIMESTAMP__".len()].copy_from_slice(b"__TIMESTAMP__");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could me amended into the commit where you add these tests.

@@ -884,7 +884,7 @@ fn remember_include_file(

let original_path = path;
// Canonicalize path for comparison; Clang uses ./header.h.
if path.starts_with(b"./") {
if path.starts_with(b"./") || path.starts_with(br".\") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory a ".\foo" file could literally exist on Unix, and that would break in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll special case this for Windows, thanks.

@sylvestre sylvestre mentioned this pull request Oct 24, 2023
@Alphare
Copy link
Contributor Author

Alphare commented Oct 24, 2023

(For some reason Github doesn't want me replying to this comment, so I'll do it here)

We should probably add -Wp in ARGS. Probably something like take_arg!("-Wp", OsString, Concatenated(','), PreprocessorArgument)

I guess you could put this check for -Xpreprocessor/-Wp where ProcessorArgument are handled, rather than for all args.

Sounds good

Then there's the question about the second loop for Xclangs,

Looking at ccache's code, Xclangs are only handled for PCH stuff that we don't cover in sccache. Nothing that implies direct mode anyway.

and, how @-files are handled with direct mode.

IIUC, sccache expands @-file arguments at the start, so it shouldn't make any difference?

@Alphare Alphare force-pushed the direct-mode branch 2 times, most recently from da230ae to 6dbf004 Compare October 24, 2023 12:14
@Alphare
Copy link
Contributor Author

Alphare commented Oct 24, 2023

This last push should fix the clippy issue. I think I've addressed your comments.

// Canonicalize path for comparison; Clang uses ./header.h.
#[cfg(windows)]
{
if path.starts_with(br".\") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fun thing about windows is that it can also use forward slash as a path separator.

| Some(PreprocessorArgumentPath(_)) => &mut preprocessor_args,
| Some(PreprocessorArgumentPath(_)) => {
too_hard_for_preprocessor_cache_mode = match arg.flag_str() {
Some(s) => s == "-Xpreprocessor" || s == "-Wp",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it would make a huge difference, but both of these are PreprocessorArgument, you could put the check behind PreprocessorArgument rather than all the PreprocessorArgument*.

@@ -2054,5 +2056,12 @@ mod test {
o => panic!("Got unexpected parse result: {:?}", o),
};
assert!(parsed_args.too_hard_for_preprocessor_cache_mode);

let args = stringvec!["-c", "foo.c", "-o", "foo.o", r#"-Wp,-DFOO="'something'""#];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the single quotes.

src/util.rs Outdated
/// plus MAX_HAYSTACK_LEN bytes of the previous chunk, to account for
/// the possibility of partial reads splitting a time macro
/// across two calls.
previous_small_read: Option<Vec<u8>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it would make the logic harder to read.

I'm not sure it would. I'm having a hard time convincing myself that the current one is correct with all the dancing around between both buffers.

src/util.rs Outdated
} else {
self.previous_small_read = visit.to_owned();
}
self.find_macros(&self.previous_small_read.to_owned());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless to_owned?

src/util.rs Outdated
.copy_from_slice(visit);

// Check both the concatenation with the previous small read
self.find_macros(&self.previous_small_read.to_owned());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless to_owned

src/util.rs Outdated
// Check both the concatenation with the previous small read
self.find_macros(&self.previous_small_read.to_owned());
// ...and the overlap buffer
self.find_macros(&self.overlap_buffer.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless clone

src/util.rs Outdated
let right_half = visit.len() - MAX_HAYSTACK_LEN;
self.overlap_buffer[..MAX_HAYSTACK_LEN].copy_from_slice(&visit[right_half..]);
}
self.find_macros(&self.overlap_buffer.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless clone

src/util.rs Outdated
// Copy the left side of the visit to the right of the buffer
let left_half = MAX_HAYSTACK_LEN;
self.overlap_buffer[left_half..].copy_from_slice(&visit[..left_half]);
self.find_macros(&self.overlap_buffer.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless clone

@sylvestre
Copy link
Collaborator

Build failed here:

   Compiling sccache v0.6.0 (/home/runner/work/sccache/sccache)
error[E0433]: failed to resolve: use of undeclared crate or module `io`
   --> src/compiler/c.rs:842:51
    |
842 |     fn metadata(&self, path: impl AsRef<Path>) -> io::Result<PreprocessorFileMetadata> {
    |                                                   ^^ use of undeclared crate or module `io`
    |
help: a builtin type with a similar name exists
    |
842 |     fn metadata(&self, path: impl AsRef<Path>) -> i8::Result<PreprocessorFileMetadata> {
    |                                                   ~~
help: consider importing one of these items
    |
15  + use futures::io;
    |
15  + use std::io;
    |
15  + use tokio::io;
    |

This will be useful in a later commit to persist file paths using
the same binary format on all platforms.

This also introduces a decoding operation, which adapts the code from
the original library ever so slightly to be less likely to provoke UB
from a future change. Instead of:

```
// Convert to UTF-16
let mut wstr: Vec<u16> = Vec::with_capacity(len as usize);
wstr.set_len(len as usize);
let len = winapi::um::stringapiset::MultiByteToWideChar(
    codepage,
    flags,
    multi_byte_str.as_ptr() as winapi::um::winnt::LPSTR,
    multi_byte_str.len() as i32,
    wstr.as_mut_ptr(),
    len,
);
```

Do:
```
// Convert to UTF-16
let mut wstr: Vec<u16> = Vec::with_capacity(len as usize);
let len = winapi::um::stringapiset::MultiByteToWideChar(
    codepage,
    flags,
    multi_byte_str.as_ptr() as winapi::um::winnt::LPSTR,
    multi_byte_str.len() as i32,
    wstr.as_mut_ptr(),
    len,
);
wstr.set_len(len as usize);

Which moves the `set_len` to *after* the Vec has been initialized.
```
This will be useful in a future commit to compare timestamps with
as much precision as possible while also handling obviously
fake/corrupted times.
This will be useful to handle time macros properly during caching
preprocessor output in a future commit, while also not reading the
contents twice.
`ccache` has a direct mode¹ by default, which caches the preprocessor
step whenever possible, which speeds up local computation considerably.

This only applies to C/C++ compilers and local storage.

[1] https://ccache.dev/manual/4.3.html#_the_direct_mode
This makes testing much easier and the structure of the overall
function clearer, at the cost of some added complexity.
This stops using a dummy `TimeMacroFinder` and just separates the logic
between the two modes.
This is not technically required to work, but will make our cache
more reproducible.
It's more idiomatic in the Rust world, and makes it more obvious.
We define the `-Wp,*` argument for GCC/Clang, disable direct mode
if it's present, since it's too hard to handle.

`ccache` allows `-Wp,-MD,path`, `-Wp,-MMD,path` and `-Wp,-D_define_`.
We don't handle these for now, since they need more careful attention.
@sylvestre sylvestre merged commit 08aae37 into mozilla:main Oct 30, 2023
37 of 38 checks passed
@sylvestre
Copy link
Collaborator

well done @Alphare!
We can fix the rest in new pr

@sylvestre
Copy link
Collaborator

@robertmaynard if you have a chance to test it before the release, don't hesitate

@@ -84,6 +99,7 @@ configuration variables

* `SCCACHE_DIR` local on disk artifact cache directory
* `SCCACHE_CACHE_SIZE` maximum size of the local on disk cache i.e. `2G` - default is 10G
* `SCCACHE_PREPROCESSOR_MODE` enable/disable preprocessor caching (see [the local doc](Local.md))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be "SCCACHE_DIRECT" no ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants