Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious error in time::tests::instant_math on Windows #56034

Closed
kennytm opened this issue Nov 18, 2018 · 0 comments
Closed

Spurious error in time::tests::instant_math on Windows #56034

kennytm opened this issue Nov 18, 2018 · 0 comments
Labels
A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

Comments

@kennytm
Copy link
Member

kennytm commented Nov 18, 2018

Symptom: On Windows the time::tests::instant_math test panicked with

thread '<unnamed>' panicked at 'assertion failed: a - Duration::new(0, 100) <= b', libstd\time.rs:507:9

First happened #55672 (comment). Currently being debugged via #56017.

@kennytm kennytm added A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Nov 18, 2018
alexcrichton added a commit to alexcrichton/rust that referenced this issue Nov 19, 2018
Previously this threshold when testing was 100ns, but the Windows
documentation states:

> which is a high resolution (<1us) time stamp

which presumably means that we could have up to 1us resolution, which
means that 100ns doesn't capture "equivalent" time intervals due to
various bits of rounding here and there.

It's hoped that this..

Closes rust-lang#56034
kennytm added a commit to pietroalbini/rust that referenced this issue Nov 19, 2018
Increase `Duration` approximate equal threshold to 1us

Previously this threshold when testing was 100ns, but the Windows
documentation states:

> which is a high resolution (<1us) time stamp

which presumably means that we could have up to 1us resolution, which
means that 100ns doesn't capture "equivalent" time intervals due to
various bits of rounding here and there.

It's hoped that this..

Closes rust-lang#56034
pietroalbini pushed a commit to pietroalbini/rust that referenced this issue Nov 19, 2018
Previously this threshold when testing was 100ns, but the Windows
documentation states:

> which is a high resolution (<1us) time stamp

which presumably means that we could have up to 1us resolution, which
means that 100ns doesn't capture "equivalent" time intervals due to
various bits of rounding here and there.

It's hoped that this..

Closes rust-lang#56034
ischeinkman pushed a commit to ischeinkman/libnx-rs-std that referenced this issue Dec 20, 2018
Previously this threshold when testing was 100ns, but the Windows
documentation states:

> which is a high resolution (<1us) time stamp

which presumably means that we could have up to 1us resolution, which
means that 100ns doesn't capture "equivalent" time intervals due to
various bits of rounding here and there.

It's hoped that this..

Closes rust-lang#56034
Centril added a commit to Centril/rust that referenced this issue Jan 24, 2019
Fix Instant/Duration math precision & associativity on Windows

**tl;dr** Addition and subtraction on Duration/Instant are not associative on windows because we use the perfcounter frequency in every calculation instead of just when we measure time.

This is my first contrib (PR or Issue) to Rust, so please lmk if I've done this wrong. I followed CONTRIBUTING to the extent I could given my system doesn't seem to be able to build the compiler with changes in the source tree. I also asked about this issue in #rust-beginners a week or so ago, before digging through libstd -- I'm unsure if there's a good way to follow up on that, but I'd be happy to update the docs on the timing structs if this fixes the problem.

## Issue

The `Duration` type keeps seconds in the upper-64 and nanoseconds in the lower-32 bits. In theory doing math on these ought to be basically the same as doing math on any other 64 or 32 bit integral number.

On windows (and I think macos too), however, our math gets messy because the Instant type stores the current point in time in units of HPET Performance Counter counts, not nanoseconds, and does unit conversions on every math operation, rather than just when we measure the time from the system clock.

I tried this code:

```
use std::time::{Duration, Instant};

fn main() {
    let now = Instant::now();
    let offset = Duration::from_millis(5);
    assert_eq!((now + offset) - now, (now - now) + offset);
}
```

On UNIX machines (linux and macos) it behaves as you'd expect -- [no crash](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cf2206c0b7e07d8ecc7767a512364094).

On Windows hosts, however, it blows up because of a precision problem in the Instant +/- Duration math:

```
C:\Users\aberg\work\timetest (master -> origin)
λ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.02s
     Running `target\debug\timetest.exe`
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `4.999914ms`,
 right: `5ms`', src\main.rs:6:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
error: process didn't exit successfully: `target\debug\timetest.exe` (exit code: 101)

C:\Users\aberg\work\timetest (master -> origin)
λ cat src\main.rs
use std::time::{Duration, Instant};

fn main() {
    let now = Instant::now();
    let offset = Duration::from_millis(5);
    assert_eq!((now + offset) - now, (now - now) + offset);
}
```

On windows I think this is a consequence of doing the HPET-PerfCounter-Unit conversion on each math operation. I suspect the reason it works on macs is that (from what I could find online) most apple machines report timing in nanoseconds anyway. For anyone interested, the equivalent functions on macos, with a little work to fish out the numerator/denominator from a timebase struct:

* `QueryPerformanceCounter()` -> `mach_absolute_time()`
* `QueryPerformanceFrequency()` -> `mach_timebase_info()`

If this PR ends up working as I expect it to when CI runs the tests, I can make the same changes to the macos implementation.

## Potential Fix

We ought to be able to sort this out by storing nanoseconds, rather than PerfCounter units, that way intermediate math is done in the most precise units we support and we're only doing unit conversions when we actually measure the system clock (and it might even translate to a small perf gain for people doing tons of Instant/Duration math).

I believe this will address the underlying cause of rust-lang#56034, and make the guessed epsilon constant from rust-lang#56059 unnecessary. If it's of interest, I can write up how these timing types work on the tier 1 platforms to address rust-lang#32626 as well, since I'm already in here figuring it out.

## This Patch

To that end, I've got this patch, which I think should fix it on windows, but I'm having trouble testing it -- any time I change anything in libstd I start getting this error, which no amount of clean building seems to resolve:

```
C:\Users\aberg\work\rust (master -> origin)
λ python x.py test --stage 0 --no-doc src/libstd
Updating only changed submodules
Submodules updated in 0.27 seconds
    Finished dev [unoptimized] target(s) in 2.41s
Building stage0 std artifacts (x86_64-pc-windows-msvc -> x86_64-pc-windows-msvc)
    Finished release [optimized] target(s) in 6.78s
Copying stage0 std from stage0 (x86_64-pc-windows-msvc -> x86_64-pc-windows-msvc / x86_64-pc-windows-msvc)
Building stage0 test artifacts (x86_64-pc-windows-msvc -> x86_64-pc-windows-msvc)
   Compiling test v0.0.0 (C:\Users\aberg\work\rust\src\libtest)
error[E0460]: found possibly newer version of crate `std` which `getopts` depends on
  --> src\libtest\lib.rs:36:1
   |
36 | extern crate getopts;
   | ^^^^^^^^^^^^^^^^^^^^^
   |
   = note: perhaps that crate needs to be recompiled?
   = note: the following crate versions were found:
           crate `std`: \\?\C:\Users\aberg\work\rust\build\x86_64-pc-windows-msvc\stage0-sysroot\lib\rustlib\x86_64-pc-windows-msvc\lib\libstd-d7a80ca2ae113c97.rlib
           crate `std`: \\?\C:\Users\aberg\work\rust\build\x86_64-pc-windows-msvc\stage0-sysroot\lib\rustlib\x86_64-pc-windows-msvc\lib\std-d7a80ca2ae113c97.dll
           crate `getopts`: \\?\C:\Users\aberg\work\rust\build\x86_64-pc-windows-msvc\stage0-test\x86_64-pc-windows-msvc\release\deps\libgetopts-ae40a96de5f5d144.rlib

error: aborting due to previous error

For more information about this error, try `rustc --explain E0460`.
error: Could not compile `test`.

To learn more, run the command again with --verbose.
command did not execute successfully: "C:\\Users\\aberg\\work\\rust\\build\\x86_64-pc-windows-msvc\\stage0\\bin\\cargo.exe" "build" "--target" "x86_64-pc-windows-msvc" "-j" "12" "--release" "--manifest-path" "C:\\Users\\aberg\\work\\rust\\src/libtest/Cargo.toml" "--message-format" "json"
expected success, got: exit code: 101
failed to run: C:\Users\aberg\work\rust\build\bootstrap\debug\bootstrap test --stage 0 --no-doc src/libstd
Build completed unsuccessfully in 0:00:20
```

---

Since you wrote the linked PRs and might remember looking at related problems:

r? @alexcrichton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-spurious Area: Spurious failures in builds (spuriously == for no apparent reason) C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

1 participant