Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent bound requirements on arrays with generic sizes #79778

Closed
calebzulawski opened this issue Dec 7, 2020 · 12 comments
Closed

Inconsistent bound requirements on arrays with generic sizes #79778

calebzulawski opened this issue Dec 7, 2020 · 12 comments
Labels
A-const-generics Area: const generics (parameters and arguments) C-bug Category: This is a bug.

Comments

@calebzulawski
Copy link
Member

I'm working on integrating const generics into stdsimd. We want to support two types of bit masks. The first has an entire mask per lane, e.g.:

struct Mask32<const LANES: usize>([u32; LANES]);

The other type of mask only uses a single bit per lane, e.g.:

struct BitMask<const LANES: usize>([u8; (LANES + 7) / 8]);

This doesn't work, however, and errors with error: unconstrained generic constant. After finding #68436, I was able to get it working with:

struct BitMask<const LANES: usize>([u8; (LANES + 7) / 8]) where [u8; (LANES + 7) / 8]: Sized;

This seems inconsistent since the two types are negligibly different. Also, since this type will be public, the bound bleeds into the interface which is undesirable.

@calebzulawski calebzulawski added the C-bug Category: This is a bug. label Dec 7, 2020
@calebzulawski calebzulawski changed the title Inconsistent bound requirements on generic arrays Inconsistent bound requirements arrays with generic sizes Dec 7, 2020
@calebzulawski calebzulawski changed the title Inconsistent bound requirements arrays with generic sizes Inconsistent bound requirements on arrays with generic sizes Dec 7, 2020
@camelid camelid added the A-const-generics Area: const generics (parameters and arguments) label Dec 7, 2020
@newpavlov
Copy link
Contributor

Currently it's an intended behavior (with which I also disagree). See this comment by @oli-obk.

@jonas-schievink
Copy link
Contributor

Yes, this is working as intended

@calebzulawski
Copy link
Member Author

Is this something likely to ever change, or is the behavior expected to stabilize like this? I'll have to discuss with the rest of the portable SIMD project group but this may preclude a const generic implementation.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 7, 2020

I'm still very much in the court of requiring bounds just like we require trait bounds on types before being able to use them beyond moving or otherwise directly forwarding them. This is definitely open for discussion, just like anything, but I'd really prefer it if we stabilized the current version first and thought about convenience extensions second. Ideally any convenience features would also apply to types.

Note that you can create a const fn that does your math, thus requiring you to only bubble up some_fn(LANES) instead of the actual math. This is very similar to bubbling up T: Trait bounds.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 7, 2020

Oh, also note that callers who supply concrete integers will not have to do anything, the bound will be evaluated once at the call site and either error or just work if the expression evaluates successfully.

@calebzulawski
Copy link
Member Author

That's a good point about the const fn, though that would still end up with an extra implementation detail in std. That still exposes [u8; bytes(LANES)]: Sized or similar which is still very much an implementation detail. I suppose my question then, why is the bound not required for the first example? Something like:

struct Foo<N>([u8; N]) where [u8; N]: Sized;

I do understand that it's a bound, but it's unfortunate because I do think this is a common pattern (I'm basically just trying to replicate C++'s std::bitset). I think as it stands right now it would be impossible to implement a type like this:

pub(crate) struct Wrapper(u8);
pub struct Array<const N: usize>([Wrapper; N / 8]) where [Wrapper; N / 8]: Sized;

@oli-obk
Copy link
Contributor

oli-obk commented Dec 7, 2020

you do not actually need the Wrapper in the array here. We currently just have no good way of specifying bounds for constants, this will likely be remedied before stabilization. So right now you can do where [(); N / 8]:;, not even a Sized bound necessary iirc.

That still exposes [u8; bytes(LANES)]: Sized or similar which is still very much an implementation detail.

I think you would probably name your const fn something like valid_lanes, which usize::max_value() would not satisfy, because it would overflow.

@calebzulawski
Copy link
Member Author

Okay, I think I misunderstood what that bound was representing. Perhaps it could be be possible to indicate that a bound is always valid? In my example above, N / 8 is always valid.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 10, 2020

I guess some kind of opt out of the system could be reasonable. I'm trying to come up with an example in the type system and trait bounds that have a similar situation. I guess it's when you do impl<T> SomeTrait for T, you always have access to SomeTrait, even if you did not specify that bound. We should figure out all things like this in the trait system, look at how this could be used in typenum to get the effects we want, and then come up with a scheme to get it for const generics. As long as we have a trait system mirror of something, I think it is fairly uncontroversial and would probably be an easy RFC to get through

@newpavlov
Copy link
Contributor

newpavlov commented Dec 10, 2020

@oli-obk
I think the trait analogy should be applied a bit differently. Let's take a look at the following example:

fn foo<const N: usize>(val: Val) -> [u8; bar(N)] where [u8; bar(N)]: Sized { .. }

A trait analog of this function is:

fn foo<N: Unsigned>(val: Val) -> Bar<N> { .. }

We don't need the sized bound here because, contrary to the bar constant function, we are sure that Bar<N> will always evaluate to a valid type (and we do not care about preventing "too big" errors using bounds). A general const fn not only may not terminate, but also can panic, which includes potential overflow and underflow errors. For example , N.wrapping_add(N) is a valid body for the bar function. Such body will return result for any input N, so the bound is redundant.

But if I am not mistaken, strictly speaking it's not guaranteed that Bar<N> will terminate either. Rust type system is Turing-complete, so we can not guarantee termination (IIRC there exists some kind of type recursion limit error to terminate compilation in pathological cases), but in practice we extremely rarely encounter such errors to bother adding bounds to prevent them.

So the only difference between traits and const expressions used in type construction is potential panics. In other words, I think that the mandatory sized bound effectively ensure that const computations used in type construction will not panic for every possible input. And as I've argued in the other discussion, I don't think it's worth to introduce mandatory manual bounds just for that.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 10, 2020

and we do not care about preventing "too big" errors using bounds

we do not care about these for constants either.

A trait analog of this function is: [...]
we are sure that Bar will always evaluate to a valid type
A general const fn not only may not terminate, but also can panic, which includes potential overflow and underflow errors.
But if I am not mistaken, strictly speaking it's not guaranteed that Bar will terminate.

Ah, I do see how my post was actively misleading. I am not trying to have termination guarantees or similar here, but had something like bound checks failing in mind. Example:

struct Bar<T: Cake>(T);

fn foo<N: Unsigned>(val: Val) -> Bar<N> { .. }

Here we will fail to compile foo, because N does not satisfy Cake, so we have to bubble up Cake, because that's a requirement for Bar to be well formed.

So if we go back to const generics, I think the equivalent example is

const fn bar(n: usize) -> usize {
    assert!(cake(n));
    n
}
fn foo<const N: usize>(val: Val) -> [u8; bar(N)] where [u8; bar(N)]: Sized { .. }

now, we don't have a way to specify that bar(N) will work for any N that satisfies cake(N), so we need to bubble up the bar(N). So, yes, you are right, const generics are not en par with regular generics, since there's a bunch of things we cannot specify. The really obvious example here being

fn foo<const N: usize>() {
    let x: [u8; N / 8] = ...;
}

because that has no possible failure condition, but we still expect you to prove that N / 8 does not fail to compile by specifying an appropriate where bound.

In the type system the equivalent is

fn foo<T: Mep>(t: T) {
    let x: &Trait = &t;
}

which is obviously ok if

trait Trait {}
impl<T: Mep> Trait for T {}

So... T: Trait is correct for all T: Mep, and we can rely on this even if no T: Trait bounds are available.

I think everyone is on board with N / 8 being correct for all N: usize. The question now is how we can teach the const generics type system about things like this, whithout causing more and more edge cases we need to handle and more and more complex bounds to be specified. I do recognize that there is desire to have a more convenient system, but if we go with the duck typing system from the start, we can't make it more strict later. Thus I believe the best way forward is to stabilize the most strictest system first, look at the pain points in practice and then figure out various strategies to reduce the bounds required for certain operations. We've run into serious problems before with stabilizing a convenient system (promotion, I am looking at you), and if we have an obivous sound strategy that will work and allow everything, then we should start out with this. If we go for the full moon shot from the start, we may not get anything stabilized here, just remember how long it took for const generics to become a thing (and they are not stablized yet, but it looks like soonish).

Perhaps it could be be possible to indicate that a bound is always valid? In my example above, N / 8 is always valid.

N / 8 is always valid, but N + 7 is not, even if in a purely mathematical universe (N + 7) / 8 will always fit into whatever maximum value N had. This is one of the things I would like to figure out. Maybe we can handle (N as usize + 7) / 8 for N: u8, or maybe we can use N / 8 + ((N % 8 > 0 as usize) with some sort of value range analysis. I don't know. The design space for possible convenience improvements on top of const generics is super large and we should explore it in all directions.

@newpavlov
Copy link
Contributor

newpavlov commented Dec 10, 2020

I do recognize that there is desire to have a more convenient system, but if we go with the duck typing system from the start, we can't make it more strict later.

I understand and fully support the desire to promote approaches which result in a more reliable code. But we should carefully weight ergonomic costs and improvements which will be achieved in practice. I do not oppose the cautious approach of introducing a more conservative system first and then gradually relax it. I just argue about a general direction. I simply do not see a principal difference between const expressions used in constants and type:

trait Foo<const N: usize> {
    const FOO: usize = bar(N);   
    fn foo() -> [u8; bar(N)] where [u8; bar(N)]: Sized;
}

The associated constant does not require the bound, even though it will cause the same compilation error in the case of a bad N as in the method. So for consistency we either should require non-panicking bounds for constants as well (unfortunately, IIUC it is not possible due to the backward compatibility guarantees), or we should allow lack of bounds for both. I understand the need of bounds for diagnostics, but I think for this purpose they can be tracked implicitly by compiler.

The question now is how we can teach the const generics type system about things like this, whithout causing more and more edge cases we need to handle and more and more complex bounds to be specified.

Since we apparently agree that the main problem lies in possibility of panics in const expressions, I think a really great solution would have been a nopanic effect system, i.e. an ability to specify that function will never panic. Such effect system would be useful outside of const generics as well, e.g. in cryptographic code or in high-reliability systems. Another potentially useful effect would be one for ensuring termination in limited time (i.e. functions would have to be written in a total subset of Rust), it would be useful for real-time systems, async code, and some targets like eBPF, but I digress...

For such nopanic system to be practical, ideally it should be introduced together with a system for adding restrictions on input arguments, something like:

// `foo` will not panic for any `n`, for which `cake(n)` is true.
// In runtime code compiler will insert asserts before calling this function.
const nopanic fn foo(n: usize) where cake(n) { .. }

But even without such system, it would allow to cover most simple cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-generics Area: const generics (parameters and arguments) C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

5 participants