Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Killing cargo test midway can leave target in unstable stage #7767

Closed
akhi3030 opened this issue Jan 6, 2020 · 6 comments · Fixed by #8087
Closed

Killing cargo test midway can leave target in unstable stage #7767

akhi3030 opened this issue Jan 6, 2020 · 6 comments · Fixed by #8087
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting C-bug Category: bug

Comments

@akhi3030
Copy link

akhi3030 commented Jan 6, 2020

Problem

It appears that if you kill (with a SIGINT signal) a cargo test midway, it can leave the target directory in an unusable state. There are a bunch of empty files lying about which confuse subsequent cargo test` runs.

Subsequent cargo test runs fail with errors like these:

$ cargo test
...
error: could not execute process `.../target/debug/deps/client-0e2e62dd0088de51` (never executed)

Caused by:
  Permission denied (os error 13)

The error seems to be pointing to empty files in the directory. Manually deleting them makes the problem go away.

Possible Solution(s)

Maybe cargo test could try to catch the SIGINT signal and tidy up the directory before exiting.

Notes

Output of cargo version: 1.39.0

@akhi3030 akhi3030 added the C-bug Category: bug label Jan 6, 2020
@alexcrichton
Copy link
Member

Thanks for the report! Cargo works pretty hard to always be usable even after it's been killed. This definitely looks like a bug but pinning down exactly where this happens and what tool is responsible might be quite difficult. For example this could be the linker not handling sigint or something like that. The best way to fix this would basically be to scrutinize how Cargo handles files from rustc and look for possible errors.

@matklad
Copy link
Member

matklad commented Feb 5, 2020

I am seeing this pretty regularly.

I am not sure, but it seems like it always happens with a test binary. That is, I always' get "can't execute this binary", rather than something else failing horribly due to, say, an .rlib being an empty file.

And I think, because the file is in the ./deps folder, cargo per se has little to do with this: this file is written by rustc (or rather by the linker that is invoked by rustc).

@ehuss
Copy link
Contributor

ehuss commented Apr 8, 2020

I'm having a hard time trying to think of a way this could happen. Cargo compares the mtime of the sources to the mtime of the dep-info file. The dep-info file isn't written until after a successful build. Even if the linker is interrupted and leaves an incomplete output, the dep-info file should still be out-of-date.

@matklad or @akhi3030 can you provide some details such as:

  • OS
  • Rust toolchain (like x86_64-unknown-linux-gnu)
  • Type of filesystem (ext3, apfs, nfs, etc.)
  • Anything else unusual you can think of.

@ehuss ehuss added the A-rebuild-detection Area: rebuild detection and fingerprinting label Apr 8, 2020
@ehuss
Copy link
Contributor

ehuss commented Apr 8, 2020

Ah, I think I figured it out! I didn't consider how dependency mtime checking works with integration tests. Now to figure out a solution…

Here's a test:

#[cargo_test]
fn linking_interrupted() {
    // Interrupt during the linking phase shouldn't leave test executable as "fresh".

    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();

    // Create a linker that we can interrupt.
    let linker = project()
        .at("linker")
        .file("Cargo.toml", &basic_manifest("linker", "1.0.0"))
        .file(
            "src/main.rs",
            &r#"
            use std::io::Read;

            fn main() {
                // Figure out the output filename.
                let output = match std::env::args().find(|a| a.starts_with("/OUT:")) {
                    Some(s) => s[5..].to_string(),
                    None => {
                        let mut args = std::env::args();
                        loop {
                            if args.next().unwrap() == "-o" {
                                break;
                            }
                        }
                        args.next().unwrap()
                    }
                };
                std::fs::remove_file(&output).unwrap();
                std::fs::write(&output, "").unwrap();
                // Tell the test that we are ready to be interrupted.
                let mut socket = std::net::TcpStream::connect("__ADDR__").unwrap();
                // Wait for the test to tell us to exit.
                let _ = socket.read(&mut [0; 1]);
            }
            "#
            .replace("__ADDR__", &addr.to_string()),
        )
        .build();
    linker.cargo("build").run();

    // Build it once so that the fingerprint gets saved to disk.
    let p = project()
        .file("src/lib.rs", "")
        .file("tests/t1.rs", "")
        .build();
    p.cargo("test --test t1 --no-run").run();
    // Make a change, start a build, then interrupt it.
    p.change_file("src/lib.rs", "// modified");
    let linker_env = format!(
        "CARGO_TARGET_{}_LINKER",
        rustc_host().to_uppercase().replace('-', "_")
    );
    let mut cmd = p
        .cargo("test --test t1 --no-run")
        .env(&linker_env, linker.bin("linker"))
        .build_command();
    let mut child = cmd
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()
        .unwrap();
    // Wait for linking to start.
    let mut conn = listener.accept().unwrap().0;

    // Interrupt the child.
    child.kill().unwrap();
    // Note: rustc and the linker are still running, let them exit here.
    conn.write(b"X").unwrap();

    // Build again, shouldn't be fresh.
    p.cargo("test --test t1")
        .with_stderr(
            "\
[COMPILING] foo [..]
[FINISHED] [..]
[RUNNING] `target/debug/deps/t1[..]`
",
        )
        .run();
}

EDIT: I think my original solution still works.

@pfrenssen
Copy link

I can confirm that the steps described in #8078 (comment) are exactly the ones that cause the issue for me. Interrupting the build before the linking phase is fine, but interrupting during the linking process leaves a 0-byte binary behind and the next cargo run throws the error.

@akhi3030
Copy link
Author

akhi3030 commented Apr 8, 2020

Thanks for the work investigating this folks! @ehuss, it does not look like you need any more information from me but let me know if you still need something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting C-bug Category: bug
Projects
None yet
5 participants