Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: implement Add and AddAssign for ascii::Char #120219

Closed
wants to merge 1 commit into from

Conversation

mina86
Copy link
Contributor

@mina86 mina86 commented Jan 22, 2024

Implement Add and AddAssign for ascii::Char which panics in debug
builds if sum isn’t an ASCII character and wraps the result in release
builds. The operators can be used to convert an integer into a digit
or a letter of an alphabet:

use core::ascii::Char;

assert_eq!(Char::Digit8, Char::Digit0 + 8);
assert_eq!(Char::SmallI, Char::SmallA + 8);

Issue: #110998

@rustbot
Copy link
Collaborator

rustbot commented Jan 22, 2024

r? @cuviper

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 22, 2024
@mina86
Copy link
Contributor Author

mina86 commented Jan 22, 2024

@scottmcm, this is the Add implementation I’ve suggested a while ago. It obsoletes digit and digit_unchecked so I think those methods can be removed.

Alas, the issue is that I cannot figure out tests. I was expected debug_assertions to be defined and thus the two overflowing tests to panic alas that doesn’t happen.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@scottmcm scottmcm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. labels Jan 22, 2024
@scottmcm
Copy link
Member

Tagging this for some libs-api eyes, since I think this is the first Add on something with a niche, and thus it's not obvious to me what behaviour is desired for it, or even if it should actually be the trait.

For example, https://doc.rust-lang.org/nightly/std/num/struct.NonZeroU32.html doesn't implement Add, but does have wrapping_add and checked_add and unchecked_add. So perhaps this also shouldn't have Add, just the methods?

(I'm also personally a bit skeptical how often + is really the right phrasing here, vs something specific to digits or letters. We don't have Add implemented on char, for example, and instead say to do math on u32 then convert back to char. So it might also be fine to say that people wanting to do math on ascii::Char should do that math on u8, perhaps with a ascii::Char::from_u8_wrapping for convenience if they don't care about overflow.)

@tmvkrpxl0
Copy link

tmvkrpxl0 commented Jan 22, 2024

NonZero types have special add functions because it must not have 0.
I think there should be similar restriction for ascii char as values >= 128 is invalid.
However this has more usecases than just digits, such as adding 5 to 'a' get 'f'.

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting.

We're still vigorously debating the merits of the Add implementation, as well as the merits of adding a method.

However, we all agreed that we want the From impls being proposed here, and we also all agreed that we want to keep the digit methods (whether they remain as-is or change to match char::from_digit, which we can decide before stabilizing them).

So, could you please:

  • Drop the removal of digit/digit_unchecked from this PR
  • Move the From impls from this PR to a separate PR, which we can go ahead and r+ as soon as you post it.

@mina86
Copy link
Contributor Author

mina86 commented Jan 24, 2024

There is a difference between char, NonZero and AsciiChar in regards to Add. AsciiChar can be thought of as u7 where addition has an obvious wrapping behaviour of keeping only the seven least significant bits of the result. There is no such clear rule for char or NonZero. char is not quite 21-bit and both have forbidden representations less than their max value.

Move the From impls from this PR to a separate PR, which we can go ahead and r+ as soon as you post it.

#120311. I also took the liberty of dropping to_u8 and to_char in that PR as well as adding conversion to all other unsigned integers. I’m not fundamentally opposed to those methods but char doesn’t have them so such change matches API that char has.

@rust-log-analyzer

This comment has been minimized.

@mina86 mina86 changed the title core: implement Add for ascii::Char; drop ascii::Char::digit core: implement Add and AddAssign for ascii::Char Jan 24, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@mina86
Copy link
Contributor Author

mina86 commented Jan 24, 2024

Remaining issue is with some snapshot tests which I’m not clear how to update. Since this is draft and under discussion whether it’s going to be accepted I’m leaving it for now.

@cuviper
Copy link
Member

cuviper commented Jan 25, 2024

I'm clearing this from my queue while it's in draft, but you can use ready when you are.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 25, 2024
@rust-log-analyzer

This comment has been minimized.

Implement Add and AddAsign for core::ascii::Char treating it as if it
was a 7-bit unsigned type, i.e. in debug builds overflow panics and in
release seven least significant bits of the result are taken.

The operators can be used to convert an integer into a digit or
a letter of an alphabet:

    use core::ascii::Char;

    assert_eq!(Char::Digit8, Char::Digit0 + 8);
    assert_eq!(Char::SmallI, Char::SmallA + 8);

Issue: rust-lang#110998
@mina86 mina86 marked this pull request as ready for review January 27, 2024 15:33
@mina86
Copy link
Contributor Author

mina86 commented Jan 27, 2024

@rustbot ready

I guess it’s up to you guys now to decide whether you want it or not. IMO it does make sense due to reasons outlined in earlier comment.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 27, 2024
@cuviper
Copy link
Member

cuviper commented Jan 27, 2024

We're still vigorously debating the merits of the Add implementation,

r? libs-api

@rustbot rustbot assigned Amanieu and unassigned cuviper Jan 27, 2024
@kupiakos
Copy link
Contributor

kupiakos commented Feb 7, 2024

Note for reviewers, there's some community debate for this starting in #110998 (comment)

@Amanieu
Copy link
Member

Amanieu commented Feb 20, 2024

We discussed this in the libs-api meeting today. The team was somewhat split on this issue, but ultimately we decided not to accept this change. The primary reason is that addition is not a concept that really make sense on characters: we would prefer that code use to_u8 and from_u8 instead for the rare cases where arithmetic on ascii values is needed. Note that this also applies to other things like using bitwise operations to toggle the 6th bit which changes letters between uppercase and lowercase.

@Amanieu Amanieu closed this Feb 20, 2024
@m-ou-se
Copy link
Member

m-ou-se commented Feb 21, 2024

To clarify, not everyone had objections to this change, but additions like these require team consensus, which cannot be reached when several people have objections.

For me personally, my objections are:

  • Wrapping around at 7 bits is surprising, since no native integer does that.
  • char doesn't implement Add either. (Note that it does implement Step, which just skips the unused ranges, like the proposed Add impl does for ascii::Char.)
  • ascii::Char is similar to NonZeroU8: a u8 with a smaller range (and NonZeroU8 is to CStr what ascii::Char is to ascii strings, basically), but NonZeroU8 also doesn't have (and shouldn't have) Add. (Imagine an Add for NonZeroU8 that would silently wrap from 255 to 1 in release mode. I don't think we should have that.)
  • There are only a few useful basic patterns with addition for ascii chars, such as b'0' + digit and b'a' + digit - 10 and lowercase - b'a' + b'A' which would be more readable as dedicated methods (like to_ascii_upper(), etc.). More advanced ascii bit twiddling hacks will probably use something from_u8 or from_u8_unchecked, to avoid the unnecessary masking that'd be done by the Add implementation.

@dtolnay dtolnay removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Mar 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.