Rewrite for recursive safety

This commit rewrites the runtime crate to provide safety in the face
of recursive calls to the guest. The basic principle is that
`GuestMemory` is now a trait which dynamically returns the
pointer/length pair. This also has an implicit contract (hence the
`unsafe` trait) that the pointer/length pair point to a valid list of
bytes in host memory "until something is reentrant".
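
To make that contract a bit more concrete, here's a minimal sketch of what an
implementor might look like. This is not code from this commit: `HostMemory`
is a made-up stand-in for a real wasm linear memory, and the trait is pared
down to just `base`.

```rust
use std::cell::UnsafeCell;

/// Pared-down copy of the trait in this commit: only `base` is kept.
pub unsafe trait GuestMemory {
    /// Pointer/length pair describing guest memory as seen by the host.
    fn base(&self) -> (*mut u8, u32);
}

/// Hypothetical host-owned buffer standing in for a wasm linear memory.
/// `UnsafeCell` is what permits writes through a shared reference.
pub struct HostMemory {
    bytes: Box<[UnsafeCell<u8>]>,
}

impl HostMemory {
    pub fn new(len: u32) -> HostMemory {
        let bytes: Vec<UnsafeCell<u8>> = (0..len).map(|_| UnsafeCell::new(0)).collect();
        HostMemory {
            bytes: bytes.into_boxed_slice(),
        }
    }
}

// Safety (assumed reading of the contract): the boxed slice is never resized
// or freed while `self` is alive, so the returned pair stays valid for as
// long as nothing reenters the guest.
unsafe impl GuestMemory for HostMemory {
    fn base(&self) -> (*mut u8, u32) {
        let ptr = UnsafeCell::raw_get(self.bytes.as_ptr());
        (ptr, self.bytes.len() as u32)
    }
}
```

A real linear memory can grow or move when the guest runs, which is presumably
why the pair is re-fetched through `base` instead of being cached.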

After this change the suite of `Guest*` types was rewritten.
`GuestRef` and `GuestRefMut` were both removed since they cannot safely
exist. The `GuestPtrMut` type was removed for simplicity, and the final
`GuestPtr` type subsumes `GuestString` and `GuestArray`. This means
that there's only one guest pointer type, `GuestPtr<'a, T>`, where `'a`
is the borrow into host memory, basically borrowing the `GuestMemory`
trait object itself.
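
The way one pointer type can cover values, arrays, and strings is roughly the
following, simplified from the `Pointee` trait in the diff. The
`guest_pointer_size` helper is hypothetical and only there to show which
representation gets picked.

```rust
use std::mem;

/// Simplified version of the `Pointee` trait from this commit.
pub trait Pointee {
    type Pointer: Copy;
}

// Sized values are addressed by a single 32-bit guest offset.
impl<T> Pointee for T {
    type Pointer = u32;
}

// Slices carry a guest offset plus a length.
impl<T> Pointee for [T] {
    type Pointer = (u32, u32);
}

// Strings use the same offset/length pair as byte slices.
impl Pointee for str {
    type Pointer = (u32, u32);
}

/// Hypothetical helper: size in bytes of the guest-side pointer for `T`.
pub fn guest_pointer_size<T: ?Sized + Pointee>() -> usize {
    mem::size_of::<T::Pointer>()
}
```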

Some core utilities are exposed on `GuestPtr`, but they're all 100%
safe. Unsafety is now entirely contained within a few small locations:

* Implementations of the `GuestType` trait for primitive types (e.g. `i8`,
  `u8`, etc) use `unsafe` to read/write memory. The contract of the `unsafe`
  `GuestMemory` trait should be what makes these reads/writes sound.

* `GuestPtr<'_, str>` has a method which validates utf-8 contents, and
  this requires `unsafe` internally to read all the bytes. This is
  guaranteed to be safe however given the contract of `GuestMemory`.
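
As a rough sketch of the first bullet (not the actual `GuestType` impls), a
primitive read/write boils down to "validate, then touch memory". The
`HostMemory`, `SketchError`, `validate`, `read_u32`, and `write_u32` names are
all made up for illustration, and endianness handling is left out.

```rust
use std::cell::UnsafeCell;
use std::mem;

pub unsafe trait GuestMemory {
    fn base(&self) -> (*mut u8, u32);
}

/// Hypothetical 64-byte memory, over-aligned so that guest offsets and host
/// addresses agree on alignment.
#[repr(align(8))]
pub struct HostMemory {
    bytes: UnsafeCell<[u8; 64]>,
}

impl HostMemory {
    pub fn new() -> HostMemory {
        HostMemory {
            bytes: UnsafeCell::new([0; 64]),
        }
    }
}

// Safety: the array lives inside `self` and never moves while borrowed.
unsafe impl GuestMemory for HostMemory {
    fn base(&self) -> (*mut u8, u32) {
        (self.bytes.get() as *mut u8, 64)
    }
}

#[derive(Debug, PartialEq)]
pub enum SketchError {
    OutOfBounds,
    NotAligned,
}

/// Bounds check first, alignment check second, then hand back a host pointer.
fn validate(
    mem: &dyn GuestMemory,
    offset: u32,
    align: usize,
    len: u32,
) -> Result<*mut u8, SketchError> {
    let (base, base_len) = mem.base();
    let end = offset.checked_add(len).ok_or(SketchError::OutOfBounds)?;
    if end > base_len {
        return Err(SketchError::OutOfBounds);
    }
    let ptr = base.wrapping_add(offset as usize);
    if (ptr as usize) % align != 0 {
        return Err(SketchError::NotAligned);
    }
    Ok(ptr)
}

pub fn read_u32(mem: &dyn GuestMemory, offset: u32) -> Result<u32, SketchError> {
    let ptr = validate(mem, offset, mem::align_of::<u32>(), 4)?;
    // Safety: `validate` checked bounds and alignment, and the `GuestMemory`
    // contract says the bytes are valid because nothing has reentered the
    // guest since `base` was called.
    Ok(unsafe { (ptr as *const u32).read() })
}

pub fn write_u32(mem: &dyn GuestMemory, offset: u32, val: u32) -> Result<(), SketchError> {
    let ptr = validate(mem, offset, mem::align_of::<u32>(), 4)?;
    // Safety: same argument as `read_u32`.
    unsafe { (ptr as *mut u32).write(val) };
    Ok(())
}
```

The only `unsafe` blocks are the two one-line memory accesses; everything
around them is ordinary checked arithmetic.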

And that's it! Everything else is a bunch of safe combinators all built
up on the various utilities provided by `GuestPtr`. The general idioms
are roughly the same as before, with various tweaks here and there. A
summary of the expected idioms:

* For small values you'd `.read()` or `.write()` very quickly. You'd
  pass around the type itself.

* For strings, you'd pass `GuestPtr<'_, str>` down to the point where
  it's actually consumed. At that moment you'd either decide to copy it
  out (a safe operation) or you'd get a raw view to the string (an
  unsafe operation) and assert that you won't call back into wasm while
  you're holding that pointer.

* Arrays are similar to strings, passing around `GuestPtr<'_, [T]>`.
  Arrays also have an `iter()` method which yields an iterator of
  `GuestPtr<'_, T>` (each wrapped in a `Result`) for convenience.
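
For the array case, the reason each item is a `Result` is that computing an
element's offset can overflow 32 bits. Here's a tiny sketch of just that
arithmetic, with a hypothetical `element_offsets` function that uses `Option`
in place of `GuestError` and takes the element size as a plain argument:

```rust
/// Guest offsets of each element of an array starting at `base`.
/// An item is `None` when `base + i * elem_size` overflows a `u32`.
pub fn element_offsets(
    base: u32,
    len: u32,
    elem_size: u32,
) -> impl ExactSizeIterator<Item = Option<u32>> {
    (0..len).map(move |i| {
        i.checked_mul(elem_size)
            .and_then(|o| base.checked_add(o))
    })
}
```

Note that this only computes offsets. Whether an offset is actually in bounds
is still checked later, when the element is read or written.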

Overall there's still a lot of missing documentation on the runtime
crate, specifically around the safety of the `GuestMemory` trait as well
as how the utilities/methods are expected to be used. Additionally,
there are utilities which aren't currently implemented but would be easy
to add. For example there's no method to copy out a string or a slice,
although that would be pretty easy to add.
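
To give an idea of what a copy-out helper could look like, here's a
self-contained sketch. None of this exists in the commit: `copy_out_string`,
`HostMemory`, and `SketchError` are invented names, and a real version would
presumably be a method on `GuestPtr<'_, str>` returning `GuestError`.

```rust
use std::cell::UnsafeCell;
use std::slice;
use std::str;

pub unsafe trait GuestMemory {
    fn base(&self) -> (*mut u8, u32);
}

/// Hypothetical 32-byte memory pre-filled with some bytes for the demo.
pub struct HostMemory {
    bytes: UnsafeCell<[u8; 32]>,
}

impl HostMemory {
    pub fn with_bytes(offset: usize, data: &[u8]) -> HostMemory {
        let mut bytes = [0u8; 32];
        bytes[offset..offset + data.len()].copy_from_slice(data);
        HostMemory {
            bytes: UnsafeCell::new(bytes),
        }
    }
}

// Safety: the array lives inside `self` and never moves while borrowed.
unsafe impl GuestMemory for HostMemory {
    fn base(&self) -> (*mut u8, u32) {
        (self.bytes.get() as *mut u8, 32)
    }
}

#[derive(Debug, PartialEq)]
pub enum SketchError {
    OutOfBounds,
    InvalidUtf8,
}

/// Copies `len` bytes at `offset` out of guest memory into an owned `String`.
pub fn copy_out_string(
    mem: &dyn GuestMemory,
    offset: u32,
    len: u32,
) -> Result<String, SketchError> {
    let (base, base_len) = mem.base();
    let end = offset.checked_add(len).ok_or(SketchError::OutOfBounds)?;
    if end > base_len {
        return Err(SketchError::OutOfBounds);
    }
    // Safety: the range was bounds-checked above, and the borrowed slice is
    // dropped before this function returns, so nothing can call back into
    // the guest while it is alive.
    let bytes = unsafe { slice::from_raw_parts(base.add(offset as usize), len as usize) };
    match str::from_utf8(bytes) {
        Ok(s) => Ok(s.to_owned()),
        Err(_) => Err(SketchError::InvalidUtf8),
    }
}
```

Since the raw view never escapes the function, the caller gets a fully safe
API and only pays for one copy.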

In any case I'm curious to get feedback on this approach and see what
y'all think!
Alex Crichton
2020-03-04 10:21:34 -08:00
parent 3764204250
commit ca9f33b6d9
28 changed files with 751 additions and 2013 deletions

@@ -1,13 +1,217 @@
mod borrow;
use std::cell::Cell;
use std::slice;
use std::str;
use std::marker;
use std::fmt;
mod error;
mod guest_type;
mod memory;
mod region;
pub use error::GuestError;
pub use guest_type::{GuestErrorType, GuestType, GuestTypeTransparent};
pub use memory::{
GuestArray, GuestMemory, GuestPtr, GuestPtrMut, GuestRef, GuestRefMut, GuestString,
GuestStringRef,
};
pub use guest_type::{GuestErrorType, GuestType};
pub use region::Region;
pub unsafe trait GuestMemory {
    fn base(&self) -> (*mut u8, u32);

    fn validate_size_align(
        &self,
        offset: u32,
        align: usize,
        len: u32,
    ) -> Result<*mut u8, GuestError> {
        let (base_ptr, base_len) = self.base();
        let region = Region { start: offset, len };

        // Figure out our pointer to the start of memory
        let start = match (base_ptr as usize).checked_add(offset as usize) {
            Some(ptr) => ptr,
            None => return Err(GuestError::PtrOutOfBounds(region)),
        };
        // and use that to figure out the end pointer
        let end = match start.checked_add(len as usize) {
            Some(ptr) => ptr,
            None => return Err(GuestError::PtrOutOfBounds(region)),
        };
        // and then verify that our end doesn't reach past the end of our memory
        if end > (base_ptr as usize) + (base_len as usize) {
            return Err(GuestError::PtrOutOfBounds(region));
        }
        // and finally verify that the alignment is correct
        if start % align != 0 {
            return Err(GuestError::PtrNotAligned(region, align as u32));
        }
        Ok(start as *mut u8)
    }

    fn ptr<'a, T>(&'a self, offset: T::Pointer) -> GuestPtr<'a, T>
    where
        Self: Sized,
        T: ?Sized + Pointee,
    {
        GuestPtr::new(self, offset)
    }
}
unsafe impl<'a, T: ?Sized + GuestMemory> GuestMemory for &'a T {
    fn base(&self) -> (*mut u8, u32) {
        T::base(self)
    }
}

unsafe impl<'a, T: ?Sized + GuestMemory> GuestMemory for &'a mut T {
    fn base(&self) -> (*mut u8, u32) {
        T::base(self)
    }
}
pub struct GuestPtr<'a, T: ?Sized + Pointee> {
    mem: &'a (dyn GuestMemory + 'a),
    pointer: T::Pointer,
    _marker: marker::PhantomData<&'a Cell<T>>,
}
impl<'a, T: ?Sized + Pointee> GuestPtr<'a, T> {
    pub fn new(mem: &'a (dyn GuestMemory + 'a), pointer: T::Pointer) -> GuestPtr<'_, T> {
        GuestPtr {
            mem,
            pointer,
            _marker: marker::PhantomData,
        }
    }

    pub fn offset(&self) -> T::Pointer {
        self.pointer
    }

    pub fn mem(&self) -> &'a (dyn GuestMemory + 'a) {
        self.mem
    }

    pub fn cast<U>(&self) -> GuestPtr<'a, U>
    where
        T: Pointee<Pointer = u32>,
    {
        GuestPtr::new(self.mem, self.pointer)
    }

    pub fn read(&self) -> Result<T, GuestError>
    where
        T: GuestType<'a>,
    {
        T::read(self)
    }

    pub fn write(&self, val: T) -> Result<(), GuestError>
    where
        T: GuestType<'a>,
    {
        T::write(self, val)
    }

    pub fn add(&self, amt: u32) -> Result<GuestPtr<'a, T>, GuestError>
        where T: GuestType<'a> + Pointee<Pointer = u32>,
    {
        let offset = amt.checked_mul(T::guest_size())
            .and_then(|o| self.pointer.checked_add(o));
        let offset = match offset {
            Some(o) => o,
            None => return Err(GuestError::InvalidFlagValue("")),
        };
        Ok(GuestPtr::new(self.mem, offset))
    }
}
impl<'a, T> GuestPtr<'a, [T]> {
    pub fn offset_base(&self) -> u32 {
        self.pointer.0
    }

    pub fn len(&self) -> u32 {
        self.pointer.1
    }

    pub fn iter<'b>(&'b self) -> impl ExactSizeIterator<Item = Result<GuestPtr<'a, T>, GuestError>> + 'b
    where
        T: GuestType<'a>,
    {
        let base = GuestPtr::new(self.mem, self.offset_base());
        (0..self.len()).map(move |i| base.add(i))
    }
}
impl<'a> GuestPtr<'a, str> {
    pub fn offset_base(&self) -> u32 {
        self.pointer.0
    }

    pub fn len(&self) -> u32 {
        self.pointer.1
    }

    pub fn as_bytes(&self) -> GuestPtr<'a, [u8]> {
        GuestPtr::new(self.mem, self.pointer)
    }

    pub fn as_raw(&self) -> Result<*mut str, GuestError> {
        let ptr = self.mem.validate_size_align(self.pointer.0, 1, self.pointer.1)?;

        // TODO: doc unsafety here
        unsafe {
            let s = slice::from_raw_parts_mut(ptr, self.pointer.1 as usize);
            match str::from_utf8_mut(s) {
                Ok(s) => Ok(s),
                Err(e) => Err(GuestError::InvalidUtf8(e))
            }
        }
    }
}
impl<T: ?Sized + Pointee> Clone for GuestPtr<'_, T> {
    fn clone(&self) -> Self {
        *self
    }
}

impl<T: ?Sized + Pointee> Copy for GuestPtr<'_, T> {}

impl<T: ?Sized + Pointee> fmt::Debug for GuestPtr<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        T::debug(self.pointer, f)
    }
}
mod private {
    pub trait Sealed {}
    impl<T> Sealed for T {}
    impl<T> Sealed for [T] {}
    impl Sealed for str {}
}

pub trait Pointee: private::Sealed {
    #[doc(hidden)]
    type Pointer: Copy;
    #[doc(hidden)]
    fn debug(pointer: Self::Pointer, f: &mut fmt::Formatter) -> fmt::Result;
}

impl<T> Pointee for T {
    type Pointer = u32;
    fn debug(pointer: Self::Pointer, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "*guest {:#x}", pointer)
    }
}

impl<T> Pointee for [T] {
    type Pointer = (u32, u32);
    fn debug(pointer: Self::Pointer, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "*guest {:#x}/{}", pointer.0, pointer.1)
    }
}

impl Pointee for str {
    type Pointer = (u32, u32);
    fn debug(pointer: Self::Pointer, f: &mut fmt::Formatter) -> fmt::Result {
        <[u8]>::debug(pointer, f)
    }
}