Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leading plus for string to integer parsing #27580

Closed
Cigaes opened this issue Aug 7, 2015 · 8 comments
Closed

Leading plus for string to integer parsing #27580

Cigaes opened this issue Aug 7, 2015 · 8 comments
Labels
I-needs-decision Issue: In need of a decision. P-low Low priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Cigaes
Copy link

Cigaes commented Aug 7, 2015

The parse::<i32>() function (and all its cousins for other integer types) fail for strings starting with a + with ParseIntError { kind: InvalidDigit }. Failing snippet:

fn main () {
  assert_eq!("+42".parse::<i32>().unwrap(), 42);
}

The same error occurs with the symmetric function call i32::from_str("+42").

Test snippet on Rust Playground
Rust Forum discussion

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2015

Need to decide if this is intentional or incidental.

CC @arthurprs who wrote the current implementation.

@Gankra Gankra added A-libs I-needs-decision Issue: In need of a decision. P-low Low priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Aug 7, 2015
@arthurprs
Copy link
Contributor

Interesting. I'm 99% sure this was the behavior before my change though. https://github.com/rust-lang/rust/pull/27110/files#diff-01076f91a26400b2db49663d787c2576L1457

@Gankra
Copy link
Contributor

Gankra commented Aug 7, 2015

@arthurprs I agree, I just figured you might have an opinion.

@arthurprs
Copy link
Contributor

@gankro since all major languages accept the leading + we should probably do the same.

@hanna-kruppe
Copy link
Contributor

Bump: We should absolutely support leading +. It is super annoying to work around this when you're parsing something that permits leading +. I already encountered this twice in the last months, and I don't write a lot of Rust code to begin with.

@mstewartgallus
Copy link
Contributor

If you add support for a bunch of additional features then people end up needing to write wrapper functions around the implementation to stop those features. For example, I often end up with code that is similar to the following because I don't want to accept nonconforming input:

static linted_error parse_int(char const *str, int *resultp)
{
    linted_error err = 0;

    char start = str[0U];

    if (isspace(start))
        return EINVAL;
    if ('+' == start)
        return EINVAL;
    if ('-' == start)
        return ERANGE;

    errno = 0;
    long yy;
    char *endptr;
    {
        char *xx;
        yy = strtol(str, &xx, 10);
        endptr = xx;
    }
    err = errno;
    if (err != 0)
        return err;

    if (*endptr != '\0')
        return EINVAL;
    if (yy > INT_MAX)
        return ERANGE;

    *resultp = yy;
    return 0;
}

This is not to mention the fact that strtol is locale specific in C anyway but Rust doesn't have that problem.

I bet in the future that a LOT of people will be sloppy and not take care of corner cases and then their parsers will end up accepting nonconforming input for whatever program they write and so gratuitous incompatibilities and extra features between programs will proliferate. I feel that is a net harm.

@Gankra
Copy link
Contributor

Gankra commented Oct 7, 2015

@sstewartgallus This is an interesting perspective, thank you!

However it has been noted elsewhere (#28826) that the float parsing algorithm already does this, which suggests we should probably bite the bullet and do this just to be consistent.

@hanna-kruppe
Copy link
Contributor

Just because current nightly and beta accept leading plus on floats doesn't mean a decision has been made. Like I said, this was probably unconscious, it slipped through the cracks so to speak. If desired, I can easily alter the code to reject leading plus. It would make me sad, because I think it ought to be accepted, but it would be be a one-minute fix and could be backported with little danger.

@sstewartgallus In broad strokes, you are right and this is a real concern. On the other hand, there are also formats where leading plus is legal (for example, the exponents in floating point: 1.5e+3, 1.0Fp+10). Working around that if str::parse does not accept leading plus is about as much work as the part of your example that rejects leading plus. Of course, this way around the incompatibility is a bit more likely to be noticed early, but it's still an edge case.

Additionally, it's not like all occurrences of integers in fixed data formats are floating around in a vacuum. Often some code has to identify the substrings containing them, then pass those substrings to str::parse — and this code is also complicit in accepting or rejecting leading plus.

In total, I don't think in this instance there is a significant hazard one way or another, it's simply a matter of consistency with other languages and doing the thing that most users find convenient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-needs-decision Issue: In need of a decision. P-low Low priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants