RFC: Unsafe Lifetime #3199

Open
wants to merge 15 commits into
base: master

@maboesanman maboesanman commented Nov 25, 2021

Introduce a new special lifetime 'unsafe which is outlived by all other lifetimes. Using a value through an 'unsafe reference, or through a type instantiated with an 'unsafe lifetime parameter, is rarely possible without unsafe.

RENDERED

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 25, 2021
@maboesanman maboesanman marked this pull request as ready for review November 25, 2021 03:04
@clarfonthey
Contributor

Bikeshedding: it might require a new edition to make this work fully, but why not just call it 'unsafe if this is supposed to be an unsafe lifetime? Before a new edition we could just say that an explicitly defined 'unsafe lifetime would shadow this.

@maboesanman
Author

Bikeshedding: it might require a new edition to make this work fully, but why not just call it 'unsafe if this is supposed to be an unsafe lifetime? Before a new edition we could just say that an explicitly defined 'unsafe lifetime would shadow this.

I considered 'unsafe, but it isn't actually unsafe to have one of these references and store it. Using it is what breaks down to unsafe.

I picked '? because it wouldn't require an edition boundary (I think), especially since the next one is 2024. Are there any reserved lifetime names that this feature could claim?

@clarfonthey
Contributor

clarfonthey commented Nov 25, 2021

As I said, I think we can get away with it working without an edition bump if we just require that it not be defined in the lifetime parameters; i.e. pub struct Struct<'unsafe> would mean that 'unsafe refers to the user-defined lifetime, and pub struct Struct<> would mean that 'unsafe refers to the RFC-defined lifetime.

The main benefit of an edition bump is that it would become a compile error on the future edition to shadow the lifetime, just like how you can't define your own lifetime named 'static right now. Prior to an edition bump we could probably have a deny-by-default lint for it.

Personally, I think that using a keyword would be a bit more clear, as the '? syntax seems weird to me and too similar to '_. I personally think that calling it unsafe means that it's unsafe to dereference, not unsafe to have, but I guess that I'll defer to what everyone else thinks.


If you try to call a method whose arguments or return value include `'?`, that call will need to be wrapped in unsafe, because you are asserting that you know those references are valid despite the borrow checker not knowing.

The addition of the `'?` lifetime also means the addition of two new reference types, `&'? T` and `&'? mut T`. These are in a sense halfway in between references and pointers. Dereferencing them is unsafe. Static references can be coerced into normal references, which can be coerced into unchecked-lifetime references, which can be coerced into raw pointers. The crucial difference between `&'? T` and `*const T` is that it is considered unsound for `&'? T` to be unaligned at any time, instead of only at the time of dereference for raw pointers.
Contributor

I think it's worth also mentioning that it would also be non-null, even though NonNull<T> exists to provide that for pointers.
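To make the pointer taxonomy in the excerpt above concrete, here is a small sketch in today's Rust. The proposed `'?`/`'unsafe` references do not exist, so this only shows the two ends of the coercion chain that already compile, plus the non-null and alignment guarantees that a reference carries. The names `VALUE`, `shorten`, and `read_via_raw` are made up for illustration.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

static VALUE: u32 = 7;

// Shortening a lifetime is a safe, implicit coercion: &'static T to &'a T.
fn shorten<'a>(r: &'static u32) -> &'a u32 {
    r
}

// A reference coerces to a raw pointer; reading it back needs `unsafe`.
fn read_via_raw(r: &u32) -> u32 {
    let raw: *const u32 = r;
    // SAFETY: `raw` was just created from a live reference.
    unsafe { *raw }
}

fn main() {
    let short: &u32 = shorten(&VALUE);
    let raw: *const u32 = short;

    // Guarantees every reference already gives: non-null and aligned.
    assert!(!raw.is_null());
    assert_eq!(raw as usize % align_of::<u32>(), 0);

    // `NonNull<T>` keeps the non-null guarantee but carries no lifetime.
    let nn: NonNull<u32> = NonNull::from(short);
    assert_eq!(nn.as_ptr() as *const u32, raw);

    assert_eq!(read_via_raw(short), 7);
}
```

As described in the RFC text, the proposed reference type would sit between `&'a T` and `*const T` in this chain, keeping the alignment (and, per the comment above, non-null) guarantees while dropping the borrow-checked lifetime.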

@maboesanman
Author

The problem is you could have a function with an unchecked lifetime in the signature, that would collide with an existing function that names its lifetime 'unchecked.

@kennytm
Member

kennytm commented Nov 25, 2021

see also #1918 (postponed) for the previous attempt at 'unsafe

@fbstj
Contributor

fbstj commented Nov 25, 2021

reading through this I don't see why it isn't the same as '_ in all the explanations

@maboesanman
Author

reading through this I don't see why it isn't the same as '_ in all the explanations

You can't use '_ in a field of a struct, which is where most of the value of this comes from. In fact '_ outlives '? because '_ is always resolved to a lifetime.
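For readers less familiar with `'_`, the following sketch shows where it is accepted in current Rust and where a named lifetime is required. `Holder` and `first_word` are hypothetical names used only for this illustration.

```rust
// A struct field needs a named lifetime parameter; writing `&'_ str` here
// is rejected by the compiler.
struct Holder<'a> {
    text: &'a str,
}

// `'_` is accepted in impl headers and function signatures, where the
// compiler resolves it to a concrete (anonymous) lifetime.
impl Holder<'_> {
    fn len(&self) -> usize {
        self.text.len()
    }
}

fn first_word(s: &'_ str) -> &'_ str {
    s.split(' ').next().unwrap_or("")
}

fn main() {
    let owned = String::from("unsafe lifetime");
    let holder = Holder { text: first_word(&owned) };
    assert_eq!(holder.len(), 6);
}
```

In every accepted position `'_` still stands for a lifetime the borrow checker tracks, which is the distinction being drawn from the proposed lifetime.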

@maboesanman
Author

Bikeshedding: it might require a new edition to make this work fully, but why not just call it 'unsafe if this is supposed to be an unsafe lifetime? Before a new edition we could just say that an explicitly defined 'unsafe lifetime would shadow this.

Turns out keyword names aren't allowed in lifetimes anyway, so 'unsafe is just usable. I think I will edit the rfc to use 'unsafe instead, as it's reserved and you've convinced me it's clearer.

@maboesanman
Author

see also #1918 (postponed) for the previous attempt at 'unsafe

I hadn't seen this RFC. I will go into some detail on it in prior art, because it shares a lot with what I'm trying to do, but I think this RFC is a little more precise in its approach.

@burdges

burdges commented Nov 25, 2021

I'd assume Niko's comment on #1918 still applies today, so anything like this should be many years away.

It breaks abstraction boundaries if people instantiate foreign types' lifetime bounds with 'unsafe, which becomes problematic.

Instead, we'd need a default bound 'a : 'safe or 'a : !'unsafe on every lifetime 'a, like the T: Sized default now, but then specific uses could opt out by writing explicitly larger bounds like where 'a : ?'safe or where 'a : 'unsafe or where 'a : ?!'unsafe, ala T: ?Sized now. And lifetime elision always forbids 'unsafe in particular.

We'd need the libs team to figure out which core/std components should be changed from the default bound of 'a : 'safe to the weaker 'a : ?'safe.


We could plausibly treat unsafe more like an adjective, ala 'unsafe a, yielding some local lifetimes that still obey some rules and never escape. I'm not sure if this solves the underlying problem, but maybe it helps while avoiding the default bounds. I suppose #1918 helps answer this question.

@maboesanman
Author

It breaks abstraction boundaries if people instantiate foreign types' lifetime bounds with 'unsafe, which becomes problematic.

This is the crucial difference between these two RFCs. 'unsafe cannot be used in place of another lifetime in function signatures because it is shorter than any lifetime, so no function that is expecting a normal lifetime can be called with 'unsafe instead. If you want to call that function you must transmute into a real lifetime, which is unsafe.
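The "transmute into a real lifetime" step mentioned above already has an analogue in current Rust, where a reference's lifetime can be changed with `std::mem::transmute` inside `unsafe`. The sketch below is only an illustration with ordinary lifetimes; `assume_lifetime` is a hypothetical helper, not something the RFC defines.

```rust
use std::mem::transmute;

// Hypothetical helper: reinterpret a reference as having a caller-chosen
// lifetime. The caller must guarantee the referent really lives for 'long.
unsafe fn assume_lifetime<'short, 'long>(r: &'short str) -> &'long str {
    // SAFETY: upheld by the caller, as documented above.
    unsafe { transmute::<&'short str, &'long str>(r) }
}

fn main() {
    let owned = String::from("hello");
    let extended: &str = {
        let short: &str = owned.as_str();
        // SAFETY: `owned` outlives `extended`, so the data stays valid.
        unsafe { assume_lifetime(short) }
    };
    assert_eq!(extended, "hello");
}
```

The borrow checker accepts whatever lifetime the caller picks here, so correctness rests entirely on the safety comment, which is the same obligation the comment above describes.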

@burdges

burdges commented Nov 25, 2021

Interesting, I'd missed this aspect, thanks. If I understand, &'unsafe T could equally be called *aligned const T, no? What is the *mut T analog of &'unsafe mut T?

maboesanman and others added 3 commits November 25, 2021 09:25
Co-authored-by: Jacob Lifshay <programmerjake@gmail.com>
@maboesanman maboesanman changed the title Unchecked Lifetime Unsafe Lifetime Nov 25, 2021
@maboesanman maboesanman changed the title Unsafe Lifetime RFC: Unsafe Lifetime Nov 29, 2021
Co-authored-by: Noah Lev <camelidcamel@gmail.com>
@earthengine

because 'unsafe is shorter than any lifetime, a function which is generic over some lifetime parameter expects something which could be assigned a lifetime parameter.

'static >= 'a > 'unsafe

for all values of 'a. if instead we had 'a >= 'unsafe, you would be correct.

So, we have to update the documentation to say that a for<'a> generic actually means "for all SAFE lifetimes", not "for ALL lifetimes". This is likely to confuse people learning the language.

Co-authored-by: teor <teor@riseup.net>
@Aloso

Aloso commented Dec 4, 2021

I'm not convinced that this is a good idea, for the following reasons:

  1. The documentation for &T states that “a reference is just a pointer that is assumed to be aligned, not null, and pointing to memory containing a valid value of T”. This will no longer be true if an 'unsafe lifetime is added.

  2. It causes mental overhead, because it adds lots of edge cases:

    • References can always be safely dereferenced except for &'unsafe T
    • for<'a> works with any lifetime except 'unsafe
    • A lifetime parameter of a function/impl/trait can be instantiated with any lifetime, except 'unsafe
      (but instantiating a struct's lifetime parameter with 'unsafe is fine apparently)
  3. It makes Rust harder to learn and to teach.

  4. I'm not convinced that it's the best solution for the problem.

To elaborate on my last point: The only use case mentioned in the RFC is self-referential structs. If these are the main focus, then a 'self lifetime could be considered as an alternative. Another alternative that the RFC should talk about is to "do nothing".
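For context on the "do nothing" alternative, this is roughly what a self-referential struct looks like in current Rust when written with a raw pointer. It is a minimal sketch under stated assumptions, not the RFC's motivating example: `Sentence` and its methods are hypothetical, and soundness depends on the fields staying private and `text` never being mutated after construction.

```rust
// Hypothetical self-referential type: `first_word` points into `text`.
struct Sentence {
    text: String,
    first_word: *const str, // raw pointer, so there is no lifetime to name
}

impl Sentence {
    fn new(text: String) -> Self {
        let end = text.find(' ').unwrap_or(text.len());
        let first_word: *const str = &text[..end];
        // Moving the `String` moves only its (pointer, length, capacity)
        // header; the heap buffer that `first_word` points at stays put.
        Sentence { text, first_word }
    }

    fn first_word(&self) -> &str {
        // SAFETY: `text` is never mutated or replaced after construction,
        // so the buffer is alive and unchanged for as long as `self` is.
        unsafe { &*self.first_word }
    }

    fn text(&self) -> &str {
        &self.text
    }
}

fn main() {
    let sentence = Sentence::new(String::from("self referential"));
    let moved = sentence; // moving the struct does not invalidate the pointer
    assert_eq!(moved.first_word(), "self");
    assert_eq!(moved.text(), "self referential");
}
```

The raw pointer loses the "aligned, non-null, valid `str`" information at the type level, which is the gap the RFC and the later `*aligned` suggestions are trying to address in different ways.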

@oskgo

oskgo commented Dec 5, 2021

'unsafe is shorter than any lifetime

The RFC says that 'unsafe is shorter than any other lifetime. Is 'unsafe shorter than 'unsafe or not?

This shouldn't have an effect on behavior because of rule 2, but I think it matters in how we justify the behavior of unsafe lifetimes.

'unsafe cannot be used in place of another lifetime in function signatures because it is shorter than any lifetime, so no function that is expecting a normal lifetime can be called with 'unsafe instead. If you want to call that function you must transmute into a real lifetime, which is unsafe.

This looks to me like it's making the transmute function a special case or you wouldn't be able to call it on something with an unsafe lifetime. If that is the case, then the RFC should mention this. If not, then I wonder what you mean by "function that is expecting a normal lifetime".

@Aloso

Aloso commented Dec 5, 2021

I believe to make this work and be useful, it must be possible to opt into 'unsafe when declaring lifetime parameters:

// any lifetime except 'unsafe:
fn foo<'a>(x: &'a i32) {}

// any lifetime, including 'unsafe:
fn bar<'a: 'unsafe>(x: &'a i32) {
    // unsafe is needed here to dereference x!
}

foo::<'unsafe>(&42); // forbidden
bar::<'unsafe>(&42); // allowed

'a: 'unsafe, as in "'a outlives 'unsafe", is trivially true if 'unsafe: 'unsafe, so this bound would have to have a special meaning.

@earthengine

I believe to make this work and be useful, it must be possible to opt into 'unsafe when declaring lifetime parameters:

This is pretty similar to Sized. A normal generic type parameter is considered Sized unless you explicitly say T: ?Sized.
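The `Sized` analogy can be seen directly in current Rust. The sketch below (function names are illustrative) shows the implicit default bound and the explicit `?Sized` opt-out that the proposed `'a: 'unsafe` style bound is being compared to.

```rust
use std::mem::size_of_val;

// Implicit `T: Sized` bound: only sized types are accepted.
fn sized_only<T>(value: &T) -> usize {
    size_of_val(value)
}

// `?Sized` opts out of the default, so `str` and slices are accepted too.
fn maybe_unsized<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    assert_eq!(sized_only(&42u32), 4);
    assert_eq!(maybe_unsized("hello"), 5);
    assert_eq!(maybe_unsized(&[1u8, 2, 3][..]), 3);
    // sized_only("hello") would not compile: `str` is not `Sized`.
}
```

In the same way, a function body written under the opt-out can only use operations that are valid for the wider set of arguments.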

@ghost

ghost commented Dec 7, 2021

Going back to @burdges comment here:

Interesting, I'd missed this aspect, thanks. If I understand, &'unsafe T could equally be called *aligned const T, no? What is the *mut T analog of &'unsafe mut T?

In terms of the expressive power that this brings to the type system, what is missing from the taxonomy of pointers is a guaranteed-aligned pointer type without a lifetime. As the last sentence of this RFC mentions, that would be quite useful for any data structure implementation or FFI code to declare at the type level that a pointer is "just" aligned (and maybe also non-null).

I can see benefit from the restricted form 'self for references, but perhaps we should have *aligned const T/*aligned mut T or core::ptr::Aligned<T> for the general case. Is there a possibility for safe use of &'self T if it is more restricted than the current 'unsafe proposal?
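A library-level approximation of such a pointer type can be sketched in current Rust. The `Aligned<T>` below is purely hypothetical: it is not the API discussed in #3204 and makes no claim about how `core::ptr::Aligned<T>` would be designed. It only shows the invariant (non-null and aligned, with no lifetime) being checked once at construction.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

/// Hypothetical wrapper: a non-null, aligned pointer with no lifetime.
/// It says nothing about whether the pointee is still alive.
pub struct Aligned<T>(NonNull<T>);

impl<T> Aligned<T> {
    /// Returns `None` for null or misaligned pointers.
    pub fn new(ptr: *mut T) -> Option<Self> {
        let non_null = NonNull::new(ptr)?;
        if (ptr as usize) % align_of::<T>() == 0 {
            Some(Aligned(non_null))
        } else {
            None
        }
    }

    pub fn as_ptr(&self) -> *mut T {
        self.0.as_ptr()
    }
}

// A type with a known alignment of 8, used only for the demonstration.
#[repr(align(8))]
struct Wide(u64);

fn main() {
    let mut wide = Wide(1);
    let good: *mut Wide = &mut wide;
    // One byte past an 8-aligned address is never 8-aligned.
    let bad = (good as *mut u8).wrapping_add(1) as *mut Wide;

    assert!(Aligned::new(good).is_some());
    assert!(Aligned::new(bad).is_none());
    assert!(Aligned::<Wide>::new(std::ptr::null_mut()).is_none());
}
```

Dereferencing the result of `as_ptr` would still require `unsafe`, since the wrapper tracks alignment and nullness but not validity.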

@programmerjake
Member

for core::ptr::Aligned<T>, see also #3204

@JakobDegen
Contributor

I had originally posted these concerns on the Zulip, but that conversation has died down so I'll repost here.

In my opinion, there's an aspect of this that is way under-specified and probably a massive issue, and that is the implications of this for type checking. Thinking in terms of the type system for a second, it's clear that 'unsafe can't actually mean "the shortest lifetime" because that would be incredibly unsound for contravariant lifetimes. Instead, it has to be some kind of non-lifetime that can be used in place of a lifetime, but isn't a lifetime at all. What does this mean for type checking though? Consider, for example:

trait A {
    type Assoc;
    
    fn get(&self) -> Self::Assoc;
}

impl<'a> A for &'a i32 {
    type Assoc = i32;

    fn get(&self) -> i32 {
        **self
    }
} 

struct S<'a> {
    v: <&'a i32 as A>::Assoc,
}

fn f(s: S<'unsafe>) {
     // what happens here?
}

How is the behavior of the type checker meant to change in the body of f? In the past, it would have been allowed to use S<'unsafe> being well-formed to conclude that <&'unsafe i32 as A>::Assoc is well-formed, and hence &'unsafe i32: A. But that's not the case! In other words, getting this kind of change through requires fundamentally changing the rules for type checking, at least around this 'unsafe lifetime, and exactly how that is to work needs to be 1) a part of the RFC, and 2) designed with extreme care to ensure safety guarantees are upheld.
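For comparison, here is the same trait and struct with an ordinary lifetime in place of `'unsafe`, which compiles in current Rust. It is only meant to show the deduction being discussed: `S<'a>` is accepted because `&'a i32: A` holds for every `'a`, and the projection normalizes to `i32`. The function name `read` is made up for this sketch.

```rust
trait A {
    type Assoc;

    fn get(&self) -> Self::Assoc;
}

impl<'a> A for &'a i32 {
    type Assoc = i32;

    fn get(&self) -> i32 {
        **self
    }
}

// Well-formedness of `S<'a>` relies on `&'a i32: A` holding for that `'a`.
struct S<'a> {
    v: <&'a i32 as A>::Assoc,
}

// With an ordinary lifetime the projection normalizes to `i32`.
fn read<'a>(s: S<'a>) -> i32 {
    s.v
}

fn main() {
    let x: i32 = 5;
    let s = S { v: (&x).get() };
    assert_eq!(read(s), 5);
}
```

The open question in the comment is which of these steps should stop being valid once the lifetime argument is `'unsafe`.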

To be clear: Enforcing that lifetime generics in scope for functions cannot be 'unsafe is enough, as far as I can tell, to ensure the continued soundness of any existing code. What is not clear at all is how this should work in a way that doesn't lead the trait solver to make incorrect deductions.

Maybe the particular example above can be fixed by deciding that either the "S<'unsafe> well-formed implies <&'unsafe i32 as A>::Assoc is well formed" or the "<&'unsafe i32 as A>::Assoc is well formed implies &'unsafe i32: A" deductions are incorrect, but which one, and why? Furthermore, can you prove that this is enough in general? What are the side-effects?

I do think there's genuinely a good idea here, and that this kind of type would be useful even outside of unsafe code, but the right process would probably be to think more about the motivation and use cases, and then file a lang MCP so that the work to design the resulting type system correctly can be put in.
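The contravariance point at the start of this comment can be illustrated with current Rust, where function pointers are contravariant in the lifetimes of their arguments. This is one reading of the concern, sketched with ordinary lifetimes; `len_of` and `widen` are hypothetical names.

```rust
fn len_of(s: &str) -> usize {
    s.len()
}

// `fn(&'a str)` is contravariant in `'a`: a function that accepts a
// shorter-lived reference can stand in for one that needs a longer-lived one.
fn widen<'short>(f: fn(&'short str) -> usize) -> fn(&'static str) -> usize {
    f
}

// The reverse direction is rejected by the compiler:
// fn narrow<'short>(f: fn(&'static str) -> usize) -> fn(&'short str) -> usize { f }

fn main() {
    let f = widen(len_of);
    assert_eq!(f("abc"), 3);
}
```

If `'unsafe` were literally the shortest lifetime, the same subtyping rule would let any type that is contravariant in its lifetime parameter move from `'unsafe` to an arbitrary lifetime in safe code, which appears to be why the comment argues it cannot be an ordinary lifetime.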

@maboesanman
Author

I believe to make this work and be useful, it must be possible to opt into 'unsafe when declaring lifetime parameters:

This is pretty similar to Sized. A normal generic type parameter is considered Sized unless you explicitly say T: ?Sized.

I think this can be used to address @JakobDegen 's concerns, as well as clarify the discrepancy between lifetime parameters' bounds in impls/fns vs structs/enums. This also gets around the backwards compatibility requirement that the lifetime uses a keyword name.

add one new lifetime and two new lifetime bounds:

'unsafe

'a: '!unsafe and 'a: '?unsafe

'a: '!unsafe would be implicit on any lifetime parameter introduced by an impl block or by a function. A notable implication of this is that T<'unsafe> doesn't impl the traits that T<'a> does, unless:

'a: '?unsafe would be usable on lifetime parameters introduced on functions or impl blocks, opting out of the implicit bound above and allowing traits to be implemented for types instantiated with the unsafe lifetime.

The reverse is the case for types:

'a: '?unsafe would be implicit for all lifetime parameters defined in a struct or enum. If you want your struct to opt out of this behavior, you can use 'a: '!unsafe (it's not clear to me why this would be required, so possibly the '!unsafe bound could be avoided completely).

To avoid naming collisions, 'unsafe (or whatever it is actually called) could be shadowed by lifetime parameters, with a warning.

A notable name suggestion from Zulip is 'erased ('a: '?erased, 'a: '!erased)

@JakobDegen
Contributor

@maboesanman what you are describing there is, to me at least, not new, and this is how I had been interpreting things already. (We may decide later that we don't actually want to allow people to specify non-default constraints, but that's a separate issue). The example I posted still has issues, since it shows one (but not all) ways to turn a lifetime generic on a type into a lifetime generic on an impl.

@maboesanman
Author

maboesanman commented Dec 9, 2021

@JakobDegen the type of v is invalid because the lifetime 'a is '?unsafe but it is required to be '!unsafe in order for the <&'a i32 as A> coercion to work.

But your example proves that both struct and impl lifetime parameters need to be '!unsafe, which means only types which explicitly allow instantiating with the unsafe lifetime can be used, which is still useful.

Maybe the struct/enum explicit bound can be removed on an edition bump?

@burdges

burdges commented Dec 9, 2021

At this point, I'd suspect 'unsafe becomes too messy to explain, so better off using *aligned const T/*aligned mut T or core::ptr::Aligned<T>.

Arguably *T/*mut T could've been *'nasty T/&'nasty mut T or whatever, using the rules discussed here, but Rust never took that approach, so probably unwise now too.

@maboesanman
Author

@burdges If declared lifetime parameters are always '!unsafe then the mental overhead is similar to that of raw pointers. "If you allow a type parameter to be '?unsafe then it is up to you to ensure uses are valid"

I think *aligned const is a reasonable feature but that doesn't do anything to enable storage of values which are generic over lifetimes that can't be understood by the borrow checker.

@JakobDegen
Contributor

But your example proves that both struct and impl lifetime parameters need to be '!unsafe, which means only types which explicitly allow instantiating with the unsafe lifetime can be used, which is still useful.

But doesn't this explicitly contradict the example given in the motivation, since now this no longer works as a solution unless the problem types explicitly opt in (in which case they might as well just write a pointer based replacement?)

Maybe the struct/enum explicit bound can be removed on an edition bump?

I believe this is technically compatible, but it is a massively breaking change, and really not in the spirit of edition changes, because the cargo fix would need to add this bound to every generic parameter on every type (since even if it's not needed now, not including it would introduce a backwards incompatible change to the affected API).

@maboesanman
Author

But doesn't this explicitly contradict the example given in the motivation

It does. I hadn't considered that example and you're right. However, I think there's still value in adding this for cases where you want to use a type you've made in both a self-referential and a normal context.

because the cargo fix would need to add this bound to every generic parameter on every type

I think it would only need to add the bound in the case where you rely on 'a: '!unsafe, which is only when associated types are used.

@ghost

ghost commented Dec 9, 2021

I think *aligned const is a reasonable feature but that doesn't do anything to enable storage of values which are generic over lifetimes that can't be understood by the borrow checker.

If the main motivating example here is self-referential types, are there cases where a hypothetical 'self lifetime could allow those to be safely constructed and used? It seems more in line with what most people expect to keep references/lifetimes constrained to only what safe code can use/what the borrow checker can check, and drop into lifetime-erased pointer types for anything more complex.

@JakobDegen
Contributor

I think it would only need to add the bound in the case where you rely on 'a: '!unsafe, which is only when associated types are used.

@maboesanman possibly, but now you've secretly changed things so that adding an associated type as a field is a breaking change. That seems no good.

@bowlerman

Your main issue is that the generic T can have lifetimes, so why not deal with those instead?

Wouldn't a way to extract the lifetime from the generic solve your problem?

Is this what you mean in your 'self alternative?

@maboesanman
Author

adding an associated type as a field is a breaking change. That seems no good

I view this as similar to breaking a contract by adding a !Unpin type as a field.
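The `!Unpin` comparison refers to an existing situation in Rust: adding a field can silently remove an auto trait from a type, which breaks downstream code that relied on it. A minimal sketch follows, with hypothetical type names.

```rust
use std::marker::PhantomPinned;

struct Plain {
    value: u32,
}

// Adding a `PhantomPinned` field makes the whole struct `!Unpin`.
struct Anchored {
    value: u32,
    _pin: PhantomPinned,
}

// Compiles only for `Unpin` types.
fn needs_unpin<T: Unpin>(value: &T) -> &T {
    value
}

fn main() {
    let plain = Plain { value: 1 };
    assert_eq!(needs_unpin(&plain).value, 1);

    let anchored = Anchored { value: 2, _pin: PhantomPinned };
    // needs_unpin(&anchored) would not compile: `Anchored` is `!Unpin`.
    assert_eq!(anchored.value, 2);
}
```

The analogy being made is that a struct gaining an associated-type field would lose the ability to be instantiated with the unsafe lifetime in the same quiet way.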

@JakobDegen
Contributor

I view this as similar to breaking a contract by adding a !Unpin type as a field.

Sorry, should have been more clear. This being a thing at all is ok (I suppose), but changing this from not being a thing to being a thing, for existing code, is a pretty big surprise. I'm not sure that's such a good idea.

@fee1-dead
Member

Any updates on this?

@SoniEx2

SoniEx2 commented Oct 20, 2022

may we reuse parts of this for our own RFC?

@maboesanman
Author

Of course! My takeaway from this rfc was that I had a somewhat incomplete understanding of some important concepts, but I still think there is some value in the ideas here. If you can extract that value that would be great!

@SoniEx2

SoniEx2 commented Oct 22, 2022

uh we mean... yeah, we do think your proposal is valuable, tho we haven't really been able to do much with it other than point out how it's unsound... >.<

we did take some syntactic inspiration from it tho, if that counts for anything. but we're going a completely different direction. hmm, tho now that we think of it, we haven't considered interactions with (generic) associated types for our proposal...
