This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

What's the definition of mutable aliasing for ZSTs? #44

Closed
scottmcm opened this issue Jan 3, 2018 · 14 comments

Comments

@scottmcm
Member

scottmcm commented Jan 3, 2018

For sizeof(T)>0, I understand the rules: no two &muts can reference overlapping memory.

What exactly are the rules for ZSTs, though?

For x: ((),()) to be a ZST (which we want), &mut x.0 and &mut x.1 are allowed, but are NOP transformations of &mut x, with the same value. But with unsafe code, I can take &mut x as *mut _, perform two NOP transformations to it, cast it as *mut (), and dereference it, and I don't know how to determine whether I'm reading the "first" or "second" fields, and thus whether I'm violating aliasing. Similar arguments apply to things like split_at_mut on a &mut [()], which is also creating multiple pointers of the same value.

But if all ZST reads are legal, that means all ZSTs are effectively Copy, which means a private-constructor ZST cannot safely be used as an access token, as it can be copied by ptr::read'ing it twice (legal because all ZST reads are legal, by premise).
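The "same value" observation can be checked directly. The following is a minimal sketch, not part of the original post; the helper names `tuple_fields_share_address` and `split_halves_share_address` are made up for illustration. It compares the raw addresses produced by borrowing both fields of a `((), ())` and by calling `split_at_mut` on a slice of `()`.

```rust
/// Hypothetical helper: do the two fields of a `((), ())` share one address?
fn tuple_fields_share_address() -> bool {
    let mut x: ((), ()) = ((), ());
    // Each borrow is turned into a raw pointer immediately, so the two
    // mutable borrows never overlap in time.
    let first = &mut x.0 as *mut ();
    let second = &mut x.1 as *mut ();
    first == second
}

/// Hypothetical helper: do both halves of a split `[()]` start at one address?
fn split_halves_share_address() -> bool {
    let mut units = [(); 4];
    // Safe API, yet both halves begin at the same address because every
    // element has size zero.
    let (left, right) = units.split_at_mut(2);
    left.as_mut_ptr() == right.as_mut_ptr()
}

fn main() {
    assert!(tuple_fields_share_address());
    assert!(split_halves_share_address());
    println!("zero-sized places share an address in both cases");
}
```

Both checks only compare addresses; nothing is read through the pointers, so the sketch does not depend on how the aliasing question is answered.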

@nagisa
Member

nagisa commented Jan 3, 2018

This is pretty easy to answer. Aliasing only occurs when memory regions, within which the objects are stored, overlap. Since the memory regions occupied by ZSTs are, obviously, zero-sized, there can never be any overlap, even for pointers with the same address.

@RalfJung
Member

RalfJung commented Jan 3, 2018

I agree with @nagisa. Moreover, this has no bearing on whether a private-constructor ZST can be used as an access token: If unsafe code forges such a private-constructor ZST, while no immediate UB is raised, that's still clearly misbehaving unsafe code. Compare this to a repr(C) struct with two private fields that has an invariant: While it is no immediate UB to use unsafe code to modify the fields of this struct, that's still misbehaving unsafe code and safe code may rely on unsafe code not doing this.

"What code is UB" and "What safe code can rely on unsafe code to (not) do" don't always have the same answer, though of course safe code can at least rely on unsafe code not triggering UB.
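To make the distinction concrete, here is a hedged sketch of a private-constructor ZST used as an access token, together with unsafe code that forges a second one. The module `token`, the type `Token`, the `acquire` constructor, and the `forge` function are all hypothetical names introduced for illustration. Following the reasoning above, `forge` is assumed not to raise immediate UB, yet it breaks the invariant that safe code relies on.

```rust
mod token {
    use std::sync::atomic::{AtomicBool, Ordering};

    static TAKEN: AtomicBool = AtomicBool::new(false);

    /// Zero-sized access token. The private field means code outside this
    /// module cannot construct one directly.
    pub struct Token(());

    impl Token {
        /// Hands out the token at most once per process (illustrative invariant).
        pub fn acquire() -> Option<Token> {
            if TAKEN.swap(true, Ordering::SeqCst) {
                None
            } else {
                Some(Token(()))
            }
        }
    }
}

/// Misbehaving unsafe code: duplicates a token that is not `Copy`.
/// No bytes are read because the type is zero-sized, but the "at most one
/// token" invariant of the `token` module no longer holds afterwards.
fn forge(existing: &token::Token) -> token::Token {
    unsafe { std::ptr::read(existing) }
}

fn main() {
    let real = token::Token::acquire().expect("first acquire succeeds");
    // The module itself refuses to hand out a second token.
    assert!(token::Token::acquire().is_none());
    // The forged copy compiles and runs, which is exactly the problem.
    let forged = forge(&real);
    let _two_tokens: (token::Token, token::Token) = (real, forged);
}
```

The point of the sketch is that the bug lives in `forge`, not in the `token` module: safe code built on `Token` may assume that unsafe code elsewhere does not do this.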

@ExpHP

ExpHP commented Jan 3, 2018

In response specifically to the conclusion that all ZSTs can be safely copied: if this were true, it would spell very bad news for all existing Drop impls for ZSTs.

@bluss
Member

bluss commented Jan 4, 2018

@ExpHP I'm not convinced a ZST can have a sensible Drop impl

(Just to be clear: I'm very much on the side that we can't copy things that are not Copy; that breaks type safety, and we use type safety to get memory safety.)

@Amanieu
Member

Amanieu commented Jan 4, 2018

@bluss A ZST can be used as a proxy for global state. For example, a SignalGuard type which blocks all signals while it exists and re-enables signals when it is dropped (with a reference count to handle multiple SignalGuards).
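A rough sketch of such a guard is shown below. It is an illustration only: the counter `ACTIVE_GUARDS` and the helper `signals_blocked` are invented names, and the actual signal masking is left as comments because it would need platform-specific calls that are not part of this discussion.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of live guards (hypothetical global state behind the ZST).
static ACTIVE_GUARDS: AtomicUsize = AtomicUsize::new(0);

/// Zero-sized guard: holds no data, only represents "signals are blocked".
pub struct SignalGuard(());

impl SignalGuard {
    pub fn new() -> SignalGuard {
        if ACTIVE_GUARDS.fetch_add(1, Ordering::SeqCst) == 0 {
            // First guard: a real implementation would block signals here.
        }
        SignalGuard(())
    }
}

impl Drop for SignalGuard {
    fn drop(&mut self) {
        if ACTIVE_GUARDS.fetch_sub(1, Ordering::SeqCst) == 1 {
            // Last guard gone: a real implementation would unblock signals here.
        }
    }
}

/// Reports whether at least one guard is currently alive.
fn signals_blocked() -> bool {
    ACTIVE_GUARDS.load(Ordering::SeqCst) > 0
}

fn main() {
    assert_eq!(std::mem::size_of::<SignalGuard>(), 0);
    {
        let _outer = SignalGuard::new();
        let _inner = SignalGuard::new();
        assert!(signals_blocked());
    }
    // Both guards were dropped, so the count is back to zero.
    assert!(!signals_blocked());
}
```

In this sketch, duplicating a guard with `ptr::read` would run `drop` once more than `new`, so the counter would be decremented too often. That is one way a Drop impl on a ZST breaks if ZSTs are copied freely.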

@RalfJung
Member

RalfJung commented Jan 4, 2018

> In response specifically to the conclusion that all ZSTs can be safely copied

I am not sure where you are getting this conclusion from. It is not correct.

I specifically argue above that ZSTs may not be copied by unsafe code! Copying them does not raise immediate UB, but that doesn't make it okay.

Or are you just confirming that it should not be okay? In that case I agree :D

@ExpHP

ExpHP commented Jan 4, 2018

> Or are you just confirming that it should not be okay? In that case I agree :D

Yes. 🙂

@scottmcm
Member Author

scottmcm commented Jan 7, 2018

> Since the memory regions occupied by ZSTs are, obviously, zero-sized, there can never be any overlap, even for pointers with the same address.

Does that mean, then, that the following function is well-defined?

fn zst_index2<T>(x: &mut Vec<T>, (i, j): (usize, usize)) -> (&mut T, &mut T) {
    assert_eq!(std::mem::size_of::<T>(), 0);
    let ps = (
        &mut x[i] as *mut _,
        &mut x[j] as *mut _,
    );
    unsafe {
        (
            &mut* ps.0,
            &mut* ps.1,
        )
    }
}

#[derive(Debug)]
struct MyZst;

fn main() {
    let mut v = vec![MyZst];
    let p = zst_index2(&mut v, (0, 0));
    println!("{:?}", p);
}

https://play.rust-lang.org/?gist=0ef8d9bbbbc8fd477f2e2fc1be0ccb1d&version=stable

(Apologies for the distraction about cloning; I'll move that discussion elsewhere.)

@Amanieu
Member

Amanieu commented Jan 7, 2018

Yes it is well defined.

If it helps your mental model, think of it like this: each memory address contains an infinite number of distinct ZSTs. Every time you dereference a ZST pointer, you are accessing a new, separate instance of a ZST. Therefore it is impossible to have aliasing with ZSTs: all ZST pointers point to a different ZST instance, even if they have the same address.

@Amanieu
Member

Amanieu commented Jan 7, 2018

> Yes it is well defined.

Actually, let me clarify that. The code that you have given will not trigger any UB due to aliasing violations. However, it may violate the semantics of the MyZst type, since you are effectively creating a MyZst out of thin air (for example, MyZst may have a private field and thus require you to call a particular function to create one). Therefore, although this function on its own can't cause UB, it should be an unsafe fn.
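One possible shape for that change is sketched below. The names `zst_index2_unchecked` and `zst_index2_distinct` are hypothetical, and the safe wrapper rests on an assumption not stated in the thread: that a `Vec<T>` of length `n` owns `n` separate `T` values, so handing out references to two distinct in-bounds indices does not conjure a new value.

```rust
/// Unsafe variant of the function from the question.
///
/// # Safety
/// The caller must ensure that returning two live `&mut T` does not violate
/// the ownership rules of `T`; in particular, `i == j` duplicates a value.
unsafe fn zst_index2_unchecked<T>(x: &mut Vec<T>, (i, j): (usize, usize)) -> (&mut T, &mut T) {
    assert_eq!(std::mem::size_of::<T>(), 0);
    // Indexing still performs the usual bounds checks.
    let a = &mut x[i] as *mut T;
    let b = &mut x[j] as *mut T;
    unsafe { (&mut *a, &mut *b) }
}

/// Safe wrapper (assumption: distinct indices are enough to preserve ownership).
/// Returns `None` for equal indices, out-of-bounds indices, or non-ZST `T`.
fn zst_index2_distinct<T>(x: &mut Vec<T>, (i, j): (usize, usize)) -> Option<(&mut T, &mut T)> {
    if std::mem::size_of::<T>() != 0 || i == j || i >= x.len() || j >= x.len() {
        return None;
    }
    // Distinct, in-bounds indices of a ZST vector: checked just above.
    Some(unsafe { zst_index2_unchecked(x, (i, j)) })
}

fn main() {
    let mut v = vec![(), (), ()];
    assert!(zst_index2_distinct(&mut v, (0, 2)).is_some());
    // The call from the question, with both indices equal, is rejected.
    assert!(zst_index2_distinct(&mut v, (0, 0)).is_none());
}
```

The wrapper moves the proof obligation to one place: callers of the safe function cannot request the same element twice, while callers of the unsafe function take on that responsibility themselves.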

@bluss
Member

bluss commented Jan 7, 2018

@scottmcm I agree with @Amanieu that your function does not preserve the ownership semantics of the T values that the Vec<T> owns. That does not seem to be related to aliasing, though: normally, aliasing is about competing writes and reads, and since we have no writes or reads in this topic, that kind of aliasing cannot occur. No writes, no aliasing problems.

@scottmcm
Member Author

scottmcm commented Jan 7, 2018

Ok, so checking for aliasing is not enough to know that two &muts can co-exist. Shame.

@scottmcm scottmcm closed this as completed Jan 7, 2018
@RalfJung
Member

I agree with @Amanieu and @bluss. And, in fact, I think my formal model in RustBelt also agrees: Owning an &mut T for some lifetime that is alive is effectively owning two things: The memory covered by the pointer, and whatever "owning T" means. The first part is trivial if T is zero-sized, but the second part is (almost...) entirely independent of that. Owning two &mut T that are both alive means, in particular, that we own the "owning T" part twice -- once for each mutable reference.

For private types or types with private fields, libraries get to pick what "owning T" means. So, if "owning MyZst" means something that can only ever be owned once, then zst_index2 cannot be proven to inhabit its type: To satisfy its return type, it has to cough up two "owning MyZst", but only one can ever exist.

@RalfJung
Member

RalfJung commented Jan 25, 2018

@scottmcm

> Ok, so checking for aliasing is not enough to know that two &muts can co-exist. Shame.

Notice that this would be the case even without considering ZSTs. For example, there may never be two &mut MutexGuard for the same mutex, even if the guards themselves are disjoint in terms of the memory they occupy.
