Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify: int->float and f32->f64 round to nearest, overflow to infinity #62231

Closed
hanna-kruppe opened this issue Jun 29, 2019 · 23 comments
Closed
Assignees
Labels
A-ffi Area: Foreign Function Interface (FFI) disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Jun 29, 2019

As suggested by @Centril, this issue is for getting @rust-lang/lang approval for rust-lang/reference#607 (Note: will update that patch to explicitly talk about overflow behavior, once approved here)

cc #62175

Detailed proposal

The following casts:

  • (expr: iM) as fN
  • (expr: uM) as fN
  • (expr: f64) as f32

all can see values that cannot be represented losslessly in the destination type, e.g.:

  • i32::MAX is not exactly representable as f32 (though it is less than f32::MAX, so it can be rounded to a finite float)
  • u128::MAX is larger than f32::MAX, so it cannot even be rounded to a nearby finite float (it "overflows")
  • f64 obviously has more precision than f32, so this cast has to round (infinities and NaNs can be carried over to f32, so this is only about finite numbers)

These cases are not unsound any more (see #15536), but currently the rounding mode is explicitly "unspecified" in the reference. What happens on overflow (e.g. u128::MAX as f32) is also not specified explicitly anywhere AFAICT. I proposed to define these casts as follows:

  • rounding mode: to nearest, ties to even (see IEEE 754-2008 §4.3.1)
  • on overflow: return infinity (with the sign of the source)

Rationale

Leaving the behavior of this core part of the language unspecified only hinders Rust programmers in using them and causes uncertainty and (de jure) non-portability. They should be defined as something, and I argue (round to nearest, overflow to infinity) is the most obvious, most consistent, and most useful choice:

  • IEEE 754-2008 prescribes round-to-nearest as the default rounding mode and returning (+/-) infinity as the default behavior on overflow (see §4.3.3, §7.4 in IEEE 754-2008).
  • This rounding mode and overflow behavior gives the closest possible result (which is of course why IEEE 754 makes it the default).
  • Other operations in Rust (e.g., string parsing) and conversions in other languages (e.g., Java) also default to these behaviors.
  • LLVM and hardware support this rounding mode and overflow handling natively, so it is also efficient.
@hanna-kruppe hanna-kruppe added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Jun 29, 2019
@gnzlbg
Copy link
Contributor

gnzlbg commented Jun 29, 2019

IEEE 7.2 mentions:

The overflow exception shall be signaled if and only if the destination format’s largest finite number is exceeded in magnitude by what would have been the rounded floating-point result (see 4) were the exponent range unbounded.

In addition, under default exception handling for overflow, the overflow flag shall be raised and the inexact exception shall be signaled.

So I wonder whether "panicking on overflow" (e.g. when debug-assertions=on) was considered as a potential alternative, and if so, what were the tradeoffs.

@hanna-kruppe
Copy link
Contributor Author

hanna-kruppe commented Jun 29, 2019

Here's some reasons for not panicking:

  • Panics are entirely unprecedented for as casts (see also discussion in floating point to integer casts can cause undefined behaviour #10184).
  • We'd need to add a new API for the conversion that allows people to choose infinity-on-overflow when they want it.
  • Arguably, if this conversion errors on overflow, it should do so by returning Result, not panicking. Of course, that's not possible for as.
  • Generally as casts are for unchecked, potentially quite lossy casts (consider casting to from larger to smaller integer types, which doesn't ever check for overflow), so it would be weird to start checking in this one particular case.
  • Panicking is arguably incompatible with the current wording in the reference, which implies that rounding happens, and overflow -> inf is arguably rounding.
  • It would slow down an operation that is probably common in some performance sensitive code paths.

I should also note that IEEE 754 cannot possibly serve as justification for such behavior, despite the suggestive quote.

In the context of that standard, "exception" and "signaling an exception" does not imply trapping, aborting, or otherwise diverting execution flow, but simply refers to an exceptional circumstance that by default is handled by supplying a sensible result (rounded result in the case of inexact, returning an infinity in the case of overflow).

Other handling of these exceptions is possible, including diverting execution flow, but as far as IEEE 754 is concerned, there is no distinction between exceptions raised from e.g. casts versus arithmetic. So in this framework, if you want to panic on overflow, it should happen on every result that overflows, including e.g. exp(HUGE_VAL), f64::MAX * f64::MAX, and "1e1000".parse::<f64>().unwrap(). Needless to say, such a sweeping change to the default behavior is neither desirable as nor backwards compatible.

The overflow flag that is to be raised refers to floating point exception flags, which we do not currently expose in any way in Rust (and this too would be a much broader discussion than about just casts).

@gnzlbg
Copy link
Contributor

gnzlbg commented Jun 29, 2019

I can't find the second part of your quote anywhere in the standard (and certainly not in §7.3), but here's some reasons for not panicking:

I am reading it from here, "Section 7.4 Overflow" (pages 37-38). The second part is the last sentence of Section 7.4 which comes after the page break.

In the context of that standard, "exception" and "signaling an exception" does not imply trapping, aborting, or otherwise diverting execution flow, but simply refers to an exceptional circumstance that by default is handled by supplying a sensible result (rounded result in the case of inexact, returning an infinity in the case of overflow).

That clarifies some of the doubts I had. Section 7.2 states that invalid operations are signaled using a NaN, so I (incorrectly) supposed that, when Section 7.4 said "signaling", a NaN would be returned, but then that same section states a default result is returned (+-INF), so I supposed that overflow would be signaled in some other way, e.g., by setting an overflow flag in some register that one can check. Otherwise one can't know whether overflow happen - e.g. if f64 as f32 returns +INF, did overflow happened because, e.g., f64 == f64::MAX, or was f64 == +INF in which case no overflow happened?

@hanna-kruppe
Copy link
Contributor Author

Oops,I just noticed I've pulled up the 1985 version of the standard today, sorry for the confusion. I see the wording you quoted now (but my explanation for why it doesn't support panicking still holds). This also resulted in the references in the top comment being wrong, corrected that now.

Section 7.2 states that invalid operations are signaled using a NaN

NB: the correct standardese is: by default, when an inexact operation is signaled, the operation returns a quiet NaN.

so I supposed that when Section 7.4 said "signaling" a NaN would be returned

It doesn't. It signals an inexact exception in the default handling, but that has nothing to do with returning NaNs. A "signaling NaN" is just a kind of NaN that causes some operations to signal an invalid operation exception when they see it, in contrast to quiet NaNs that usually don't cause that (see §6.2).

I supposed overflow would be signaled in some other way, e.g. by setting an overflow flag in some register that one can check. Otherwise one can't know whether overflow happen - e.g. if f64 as f32 returns +INF, did overflow happened because, e.g., f64 == f64::MAX, or was f64 == +INF?

The overflow flag is raised (set) as part of the default exception handling. But as I mentioned (and, I think, discussed with you in the past), Rust does not currently support users reading the flags, so they effectively don't exist in Rust. I'd like to change this eventually but again, far out of scope (and currently infeasible to implement due to LLVM limitations).

@nikomatsakis nikomatsakis added A-ffi Area: Foreign Function Interface (FFI) and removed I-nominated labels Jul 11, 2019
@joshtriplett
Copy link
Member

Staring at the proposal for a while, and thinking about it further, I think if we document the assumption we're making (as well as noting all the cases we're covering like saturating to infinity) then I don't have a problem signing off on this.

In addition to the documentation changes already requested in the other issues, this needs to explicitly document the assumption that it'll be possible to set hardware floating-point modes to match Rust's expectations on rounding, and that we understand this could potentially limit our ability to use hardware floating point on some future platform that can't support that assumption.

@Havvy
Copy link
Contributor

Havvy commented Jul 18, 2019

Where would you document the assumption that it's possible to set the hardware FP modes? That seems to be more of a rule about developing Rust compilers than it does about the language itself. Are other similar restrictions documented anywhere?

@hanna-kruppe
Copy link
Contributor Author

When I was discussing with @joshtriplett on Discord they said to put these notes in the reference. I don't have strong opinions on where it goes, I just want to know what I have to add where to get this issue settled. Can you two (& anyone else who has strong opinions on the matter) please find some consensus about that so I know what patch(es) to write?

@Centril
Copy link
Contributor

Centril commented Jul 20, 2019

It's a bit odd to note in the reference what expectations on hardware that exist for Rust since that isn't necessary for the purposes of a definitional interpreter / abstract machine. That said, I suppose we can add a lightweight note about what it means for rustc? perhaps the rustc guide is a better place for this note?

@Havvy
Copy link
Contributor

Havvy commented Jul 21, 2019

Wherever we note this assumption by Rust, we should also make note that we assume that pointer width is at least 16 bits (via impl Into<isize> for i16) as well.

@Havvy
Copy link
Contributor

Havvy commented Jul 21, 2019

Actually, can we open a new issue for where these assumptions should be documented; it feels unrelated to actually making this choice and there are others we should do it for even if we choose not to specify the behavior in this issue.

@joshtriplett
Copy link
Member

@Havvy There are many assumptions we might wish to document, but I don't think we should block documenting one thing that we know about on trying to document others.

I'm just proposing that we have a quick footnote somewhere saying that by specifying this behavior we assume that we can efficiently implement such behavior on all hardware we might want to run on.

@hanna-kruppe
Copy link
Contributor Author

I'm just proposing that we have a quick footnote somewhere saying that by specifying this behavior we assume that we can efficiently implement such behavior on all hardware we might want to run on.

When you put it like that it seems kind of tautological. I stand by not really caring one way or another but may I suggest reconsidering what the note is intended to achieve specifically?

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 13, 2019

In addition to the documentation changes already requested in the other issues, this needs to explicitly document the assumption that it'll be possible to set hardware floating-point modes to match Rust's expectations on rounding, and that we understand this could potentially limit our ability to use hardware floating point on some future platform that can't support that assumption.

If documenting this is strictly necessary to make progress, @rkruppe I personally would think that a tautological note: "Note: if your hardware doesn't support these rounding semantics floats will be slow." maye in parentheses next or after the guarantees would be enough.

I'm not sure of the value of this note, but not documenting the semantics of x_f32 as i32 has a cost. A user just asked precisely that on Discord, and I pointed them to the reference, and had then to point them to the PR to the reference, and then here.

So I'll be fine with just merging @rkruppe PR to the reference, and opening an issue on the reference about the Note. We could add them right in the middle of the as-expression section, but there are so many things like this in which if your hardware doesn't support something your programs might run much slower (e.g. using i64 on hardware that doesn't support 64-bit integers), that if someone cares enough about documenting all of these it might be better to add an Appendix to the reference whose job is to document that kind of thing, and that we can hyperlink.

@joshtriplett
Copy link
Member

joshtriplett commented Aug 13, 2019 via email

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2019

Supporting 64-bit integers via 32-bit math isn't especially slow;
software floating point can be incredibly slow, compared to people's
expectations of floating-point performance.

My point is that there are many different levels of slowness for many of the features of the language depending on the target and that we don't document these anywhere - we don't even document that you should have an FPU to use f32 and f64 appropriately in the first place.

I really don't see what value does adding this note here add. It requires people looking for information about whether their target can run Rust programs efficiently to go through the whole Rust reference skimming for performance-related footnotes. They won't even find this in the Numeric types section introducing f32 or f64, and have to go to the section about as expressions instead.

If you think these issues are important and should be documented, I think it would be better to open an issue in the reference, where these issues are collected, and an appropriate approach to document them can be discussed.

@hanna-kruppe
Copy link
Contributor Author

I pushed an updated patch including performance note in rust-lang/reference#607, please check it out.

@joshtriplett
Copy link
Member

joshtriplett commented Aug 16, 2019

The updated version looks great! Thanks for your work on this.

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Aug 16, 2019

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Aug 16, 2019
@hanna-kruppe
Copy link
Contributor Author

ping @nikomatsakis @pnkfelix @withoutboats for checkboxes

@Centril
Copy link
Contributor

Centril commented Aug 29, 2019

Make the bot listen... @rfcbot reviewed

@rfcbot
Copy link

rfcbot commented Aug 29, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Aug 29, 2019
@rfcbot
Copy link

rfcbot commented Sep 8, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Sep 8, 2019
@Centril
Copy link
Contributor

Centril commented Sep 8, 2019

The reference PR rust-lang/reference#607 has been merged.
Closing this issue as "merged".

@Centril Centril closed this as completed Sep 8, 2019
Havvy pushed a commit to Havvy/reference that referenced this issue Aug 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ffi Area: Foreign Function Interface (FFI) disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants