Skip to content
This repository has been archived by the owner on Nov 9, 2019. It is now read-only.

Make memory fns safe wherever possible #12

Merged
merged 1 commit into from
May 13, 2019
Merged

Make memory fns safe wherever possible #12

merged 1 commit into from
May 13, 2019

Conversation

kubkon
Copy link
Member

@kubkon kubkon commented May 12, 2019

As suggested in #10, I've reworked parts of memory.rs so that the vast majority of the enc/dec fns are now safe. This took away quite a few unsafe calls in the hostcall impl code. @sunfishcode lemme know if this is what you've had in mind.

While at it, I've also done some minor refactoring to some hostcall impl functions.

@sunfishcode
Copy link
Member

Nice!

Can we also make dec_ptr safe? Instead of memory.offset(ptr), we could use memory.get(ptr, ptr + len), and if that succeeds, .as_mut_ptr() on the resulting slice.

I'd find it more consistent for dec_ptr to return EFAULT rather than EOVERFLOW in the case where len is greater than wasm32::UINTPTR_MAX -- EOVERFLOW is mostly used in cases where values being returned to the user don't fit in the user-provided data types, and in this case, we're not returning data back to the user. Using EFAULT here also means we could eliminate the explicit bounds checks -- the get call will check everything and you can just return EFAULT if it fails.

As one more detail while here, the ptr + len should be a checked_add to handle overflow there as well (and that should also fail with EFAULT for similar reasons.)

@kubkon kubkon force-pushed the safe-mem branch 2 times, most recently from 17bdf68 to 2ee1635 Compare May 13, 2019 06:52
@kubkon
Copy link
Member Author

kubkon commented May 13, 2019

Done. I have to admit that I wasn't sure about the signature of dec_ptr, i.e., whether it should be safe or unsafe as it returns a raw pointer. But I've convinced myself that it's valid and consistent with Rust's stdlib (e.g., std::slice::as_mut_ptr is marked safe even though it returns a raw pointer as well).

I've also changed EOVERFLOW to EFAULT, and essentially ported over checked_add logic from wasmtime-wasi. Question here though: should we worry about len == 0 in dec_ptr?

@sunfishcode
Copy link
Member

Done. I have to admit that I wasn't sure about the signature of dec_ptr, i.e., whether it should be safe or unsafe as it returns a raw pointer. But I've convinced myself that it's valid and consistent with Rust's stdlib (e.g., std::slice::as_mut_ptr is marked safe even though it returns a raw pointer as well).

Yeah, from a language design point of view, Rust's choice here is debatable, but given that it works this way, it makes sense for us to do the same thing.

I've also changed EOVERFLOW to EFAULT, and essentially ported over checked_add logic from wasmtime-wasi. Question here though: should we worry about len == 0 in dec_ptr?

That's a great question :-). I think the main interesting place where this would come up in wasm is in the bulk memory operations. Looking into it, it appears that the current proposal isn't clear on what happens with an empty slice at the end of memory, so I filed WebAssembly/bulk-memory-operations#86. On the Rust side, it isn't clear from the documentation what get does on an empty slice at the end of a slice either, so I filed rust-lang/rust#60783.

It appears the intent of both WebAssembly linear memory and Rust have the behavior that a zero length is ok, including at the end of memory/slice, but not past the end of memory/slice, so we can just rely on get's bounds checking.

@kubkon
Copy link
Member Author

kubkon commented May 13, 2019

Done. I have to admit that I wasn't sure about the signature of dec_ptr, i.e., whether it should be safe or unsafe as it returns a raw pointer. But I've convinced myself that it's valid and consistent with Rust's stdlib (e.g., std::slice::as_mut_ptr is marked safe even though it returns a raw pointer as well).

Yeah, from a language design point of view, Rust's choice here is debatable, but given that it works this way, it makes sense for us to do the same thing.

Agreed, although in Rust's defence, you still need an unsafe block to dereference the raw pointer anyway, so it might not be as bad as it looks ;-)

I've also changed EOVERFLOW to EFAULT, and essentially ported over checked_add logic from wasmtime-wasi. Question here though: should we worry about len == 0 in dec_ptr?

That's a great question :-). I think the main interesting place where this would come up in wasm is in the bulk memory operations. Looking into it, it appears that the current proposal isn't clear on what happens with an empty slice at the end of memory, so I filed WebAssembly/bulk-memory-operations#86. On the Rust side, it isn't clear from the documentation what get does on an empty slice at the end of a slice either, so I filed rust-lang/rust#60783.

It appears the intent of both WebAssembly linear memory and Rust have the behavior that a zero length is ok, including at the end of memory/slice, but not past the end of memory/slice, so we can just rely on get's bounds checking.

Oh cool! Thanks for clarifying it and filing the PR and the issue!

Unless you have any other observations, I reckon the PR looks pretty good now, and I'm happy we managed to ged rid of a fair chunk of unsafe-{ty, ness} from the code. :-)

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience here! My sense here is that having a simple and safe dec_ptr will be worth the effort :-).

src/memory.rs Outdated
memory: &mut [u8],
ptr: wasm32::uintptr_t,
len: usize,
) -> Result<*mut u8, host::__wasi_errno_t> {
// check that `len` fits in the wasm32 address space
if len > wasm32::UINTPTR_MAX as usize {
bail_errno!(__WASI_EOVERFLOW);
bail_errno!(__WASI_EFAULT);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check turns out to be redundant, because a wasm32 program will never have a memory larger than wasm32::UINTPTR_MAX, so ptr + len will simply be out of bounds if len is too great.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point!

src/memory.rs Outdated
if len > 0 {
(ptr as usize)
.checked_add(len - 1)
.ok_or(host::__WASI_EFAULT)?;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks for overflow with ptr + (len - 1), but below we actually compute ptr + len, so currently this overflow check doesn't perfectly match the code below. I don't think it's an observable difference in this case, but I'd still like to straighten this out, as this is in foundational code we'll build a lot of other things on :-).

The - 1 handles the case where the subrange is at the very end of the index space, however this add is happening under the host usize index space, so the only way ptr + len could overflow where ptr + (len - 1) couldn't, without getting caught by the bounds check, is if memory is the same size as the entire host address space, and that's not possible :-).

So first suggestion: do the checked_add without the - 1. Then on the success path, save the checked_add result in a local variable and use that below, rather than doing a separate unchecked add, as that makes it more obvious that the actual value used below is the checked one.

(In theory, an exotic host could have a usize which isn't big enough to index the entire address space, and allow objects larger than isize::max_value() which is not trivial to do. If anyone ports this code to such a platform, the change I suggest here would cause the code to return EFAULT in a case where it technically shouldn't, so it'll "fail closed", which I posit is adequate for now.

Another option would be to use an inclusive range with a - 1 like memory.get_mut(ptr as usize..=ptr as usize + (len - 1)), however that would require special-casing when len is 0, and my sense is this issue is not worth that complexity.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch about the bounds check here. I should have noticed it but somehow slipped through! ;-)

Personally I also prefer the first suggestion over the second one. IMHO, less nesting means better code and most importantly readability.

@kubkon
Copy link
Member Author

kubkon commented May 13, 2019

Thanks for your patience here! My sense here is that having a simple and safe dec_ptr will be worth the effort :-).

Totally agree here, so no need to thank me -- always up for a clean and safe implementation especially when it comes to something as mission-critical as memory management! :-)

@sunfishcode
Copy link
Member

Cool, I'm excited about all these functions switching from unsafe to safe! Of course there still are some unsafe parts, but they're much easier to see and understand now.

@sunfishcode sunfishcode merged commit 8b09f32 into master May 13, 2019
@kubkon kubkon deleted the safe-mem branch May 14, 2019 05:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants