This commit rewrites the runtime crate to provide safety in the face
of recursive calls to the guest. The basic principle is that
`GuestMemory` is now a trait which dynamically returns the
pointer/length pair. This also has an implicit contract (hence the
`unsafe` trait) that the pointer/length pair point to a valid list of
bytes in host memory "until something is reentrant".
After this changes the various suite of `Guest*` types were rewritten.
`GuestRef` and `GuestRefMut` were both removed since they cannot safely
exist. The `GuestPtrMut` type was removed for simplicity, and the final
`GuestPtr` type subsumes `GuestString` and `GuestArray`. This means
that there's only one guest pointer type, `GuestPtr<'a, T>`, where `'a`
is the borrow into host memory, basically borrowing the `GuestMemory`
trait object itself.
Some core utilities are exposed on `GuestPtr`, but they're all 100%
safe. Unsafety is now entirely contained within a few small locations:
* Implementations of the `GuestType` for primitive types (e.g. `i8`,
`u8`, etc) use `unsafe` to read/write memory. The `unsafe` trait of
`GuestMemory` though should prove that they're safe.
* `GuestPtr<'_, str>` has a method which validates utf-8 contents, and
this requires `unsafe` internally to read all the bytes. This is
guaranteed to be safe however given the contract of `GuestMemory`.
And that's it! Everything else is a bunch of safe combinators all built
up on the various utilities provided by `GuestPtr`. The general idioms
are roughly the same as before, with various tweaks here and there. A
summary of expected idioms are:
* For small values you'd `.read()` or `.write()` very quickly. You'd
pass around the type itself.
* For strings, you'd pass `GuestPtr<'_, str>` down to the point where
it's actually consumed. At that moment you'd either decide to copy it
out (a safe operation) or you'd get a raw view to the string (an
unsafe operation) and assert that you won't call back into wasm while
you're holding that pointer.
* Arrays are similar to strings, passing around `GuestPtr<'_, [T]>`.
Arrays also have a `iter()` method which yields an iterator of
`GuestPtr<'_, T>` for convenience.
Overall there's still a lot of missing documentation on the runtime
crate specifically around the safety of the `GuestMemory` trait as well
as how the utilities/methods are expected to be used. Additionally
there's utilities which aren't currently implemented which would be easy
to implement. For example there's no method to copy out a string or a
slice, although that would be pretty easy to add.
In any case I'm curious to get feedback on this approach and see what
y'all think!
This commit puts context object, i.e., the implementor of the
WASI snapshot, behind a reference `&self` rather than a mutable
reference `&mut self`. As suggested by @alexcrichton, this gives
the implementor the possibility to determine how it handles its
interior mutability.
Currently, we create an immutable `GuestPtr` from `self` and call
`as_raw` on it which correctly returns `*const u8`. However, since
we're dealing with `GuestPtrMut` I thought it might make more sense
to return `*mut u8` directly instead. This will be needed (and will
save us from silly casts `*const _ as *mut _`) in plugging in
`Iovec<'_>` into `std::io::IoSliceMut` in `fd_read` and `fd_pread` calls.
* Add WASI spec (minus unions)
* Fill in all WASI shims
* Clean up derives and fix noncopy struct write method
This commit does three things:
* it uses the full, current snapshot1 WASI spec as a compilation test
* it fixes noncopy struct write method (which was incorrectly resolved
in certain cases to the inherent method of the `GuestPtrMut` rather
than the interface method `GuestType::write`
* it cleans up derives for structs and unions which should not auto-derive
`PartialEq`, `Eq`, or `Hash` since their members are not guaranteed to
be compatible
While public might be an overkill, until we successfully merge
`wiggle` with `wasi-common` (and others), I suggest we just make
the modules fully public and work from there.
this allows us to reuse the code in wiggle-generate elsewhere, because
a proc-macro=true lib can only export a #[proc_macro] and not ordinary
functions.
In lucet, I will depend on wiggle-generate to define a proc macro that
glues wiggle to the specifics of the runtime.
This commit splits `generate/src/types.rs` into submodules, each
responsible for specifying a (compound) type. `generate/src/types.rs`
grew in size a lot, and IMHO this should yield readability and maintenance.
This commit refactors trait system for guest types. Namely, as
discussed offline on zulip, `GuestType` now includes `GuestTypeClone`,
whereas `GuestTypeCopy` has been renamed to `GuestTypeTransparent`.
While working on the full WASI spec, it turned out that we need two
tweaks to `GuestError`:
1. we need to support conversion from `TryFromIntError`, and
2. we need to invoke `e.into()` when unwrapping the result of `try_into()`
in auto-implementation of raw interface functions.
Both problems seem to originate for "transparent" builtin types since
for those we don't really provide a `TryFrom` implementation like for
compound types, e.g., enums, flags, etc.
these names were artifacts of some early confusion / bad design i made
in the traits. read and write are much simpler names!
also, change a ptr_mut to ptr where we just read the contents in the
argument marshalling for structs. this has no effect, but it is more
correct.
* Allow returning structs if copy
This commit does three things:
1. enables marshalling of structs as return args from interface funcs
but so far *only* for the case when the struct itself is copy
2. puts bits that use `std::convert::TryInto` in a local scope to avoid
multiple reimports
3. for added clarity, we now print for which `tref` type the marshalling
of results is unimplemented
The first case (1.) is required to make `fd_fdstat_get` WASI interface
func work which returns `Fdstat` struct (which is copy). The second
case (2.) caused me some grief somewhere along the lines when I was
playing with snapshot1. Putting the code that requires it inside a local
scope fixed all the issues
* Add proptests for returing struct if copyable
* Use write_ptr_to_guest to marshal value to guest
* Successfully return non-copy struct
This commit aligns `wiggle` a little bit closer with Rust proper in
the sense that now `GuestTypeCopy` implies `GuestTypeClone` which
in turn implies that any type implementing `GuestTypeCopy` will have
to provide `read_from_guest` and `write_to_guest` methods. As a result,
we can safely revert changes introduced in the previous commit.
This commit fixes stubs for struct members that are `GuestTypeCopy`
but are not `GuestTypeClone`. In this case, we cannot rely on methods
`T::read_from_guest` or `T::write_to_guest` since these are only
available if `T: GuestTypeClone`. In those cases, we can and should
dereference the location pointer to `T` and copy the result in/out
respectively.
* Implement fmt::Display for enums
`wasi_common` relies on `strerror` to nicely format error messages.
`strerror` is autoimplemented in `wig`. I thought it might be useful
to provide a Rust-idiomatic alternative which boils down to autoimplementing
`fmt::Display` for all enums.
* Implement fmt::Display for flags
* Implement fmt::Display for ints
* Draft out IntDatatype in wiggle-generate
This commit drafts out basic layout for `IntDatatype` structure in
`wiggle`. As it currently stands, an `Int` type is represented as
a one-element tuple struct much like `FlagDatatype`, however, with
this difference that we do not perform any checks on the input
underlying representation since any value for the prescribed type
is legal.
* Finish drafting IntDatatype support in wiggle
This commit adds necessary marshal stubs to properly pass `IntDatatype`
in and out of interface functions. It also adds a basic proptest.
* Add basic GuestString support to wiggle
This commit adds basic `GuestString` support to `wiggle`. `GuestString`
is a wrapper around `GuestArray<'_, u8>` array type which itself can
be made into either an owned (cloned) Rust `String` or borrowed as
a reference `&str`. In both cases, `GuestString` ensures that the
underlying bytes are valid Unicode code units, throwing a `InvalidUtf8`
error if not.
This commit adds support *only* for passing in strings as arguments
in WASI. Marshalling of the return arg has not yet been implemented.
I'm not even sure it's possible without multi-value return args
feature of Wasm. It's not a major setback especially since the WASI
spec (and this includes even the `ephemeral` snapshot) doesn't
return strings anywhere. They are only ever passed in as arguments
to interface functions.
It should be noted that error returned in case of invalid UTF-8
requires a lot more love as it doesn't include anything besides
flagging an event that the string contained an invalid Unicode code unit.
* Borrow all of string's memory including nul-byte
Borrow all of string's underlying memory including the nul-byte.
This perhaps might not have a tremendous impact on anything, but
since the nul-byte is technically part of the WASI string, we should
include it in the borrow as well.
* Fill in wiggle-generate blanks for strings
* Print to screen passed string in proptest
* Strings are PointerLengthPairs!
* Fix generation of strings in compound types
* Update test with simple string strategy
* Generate better test strings
* Finalise proptest for strings
* Fix formatting
* Update crates/runtime/src/memory/string.rs
Removes unnecessary comment in code
* Apply Pat's suggestion to wrap Utf8Error as error
* merge GuestTypePtr and GuestTypeClone into a single GuestTypeClone<'a> trait
* GuestArray can derive Clone (but not impl GuestTypeClone)
* fix array tests
* Fix GuestTypeClone for flags
Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
* Adds support for flags datatype
This commit adds support for `FlagsDatatype`. In WASI, flags are
represented as bitfields/bitflags, therefore, it is important
for the type to have bitwise operations implemented, and some
way of checking whether the current set of flags contains some
other set (i.e., they intersect). Thus, this commit automatically
derives `BitAnd`, etc. for the derived flags datatype. It also
automatically provides an `ALL_FLAGS` value which corresponds to
a bitwise-or of all flag values present and is provided for
convenience.
* Simplify read_from_guest
In preparation for landing `string` support in `wiggle`, I've
revisited `GuestArray` and the deref mechanics, and decided to
scrap the current approach in favour of the previous. Namely, for
`T: GuestTypeCopy` we provide `GuestArray::as_ref` which then can
be derefed directly to `&[T]` since the guest and host types have
matching representation, and for other types (thinking here of
our infamous arrays of pointers), I've instead added a way to
iterate over the pointers. Then, it is up to the `wiggle`'s client
to know whether they can deref whatever the pointer stored in array
is pointing at.
* Add array generation to wiggle-generate crate
This commit:
* adds array generation to `wiggle-generate` crate which implies
that we can now generate arrays from `witx` files
* introduces two test interface functions `foo::reduce_excuses` and
`foo::populate_excuses`, and adds matching prop-tests
* adds an out-of-boundary check to `HostMemory::mem_area_strat` since
now, given we're generating arrays for testing with an arbitrary
but bounded number of elements, it is possible to violate the boundary
* refactors `Region::extend` to a new signature `extend(times: u32)`
which multiplies the current pointer `len` by `times`
* fixes bug in `GuestArray::as_ref` and `GuestArrayMut::as_ref_mut` methods
where we were not validating the first element (we always started the
validation from the second element)
* Fix generation of arrays in witx
This commit fixes how `arrays` are auto-generated from `witx` files.
In particular, the changes include:
* Since by design every `array` in `witx` represents an immutable
slab of memory, we will only ever operate on `GuestArray` in which
case I've gone ahead and removed `GuestArrayMut` so as to unclutter
the code somewhat. If we find ourselves in need for it in the future,
I reckon it will be better to write it from scratch (since the codebase
will inevitably evolve by then considerably) rather than maintaining an
unused implementation.
* I've rewritten `GuestArrayRef<T>` to be a wrapper for `Vec<GuestRef<T>>`.
Also, `GuestArray::as_ref` now borrows each underlying "element" of the
array one-by-one rather than borrowing the entire chunk of memory at once.
This change is motivated by the inability to coerce type parameter `T` in
`GuestArray<T>` in more complicated cases such as arrays of guest pointers
`GuestPtr<T>` to `*const T` for reuse in `std::slice::from_raw_parts` call.
(In general, the size of Wasm32 pointer is 4 bytes, while
```
std::mem::size_of::<T>() == std::mem::size_of::<GuestPtr<S>>() == 16
```
which is problematic; i.e., I can't see how I could properly extract guest
pointers from slices of 4 bytes and at the same time not allocate.)
* I've augmented fuzz tests by (re-)defining two `array` types:
```
(typename $const_excuse_array (array (@witx const_pointer $excuse)))
(typename $excuse_array (array (@witx pointer $excuse)))
```
This should hopefully emulate and test the `iovec` and `ciovec` arrays
present in WASI spec.
* Refactor naming and crates info
This commit:
* changes workspace crates to have a `wiggle_` prefix in names
* rename `memory` module of `wiggle-memory` crate to `runtime`
* fixes authors of all crates
* Rename wiggle memory crate to runtime
This commit refactors and reorganises the `memory.rs` module. Now,
it consists of two submodules: `memory::ptr` which contains `GuestPtr`
and `GuestRef` (and their mutable versions), and `memory::array` which
contains `GuestArray` and `GuestArrayRef` (and their mutable versions).
This commit also adds basic unit sanity tests for the `memory::ptr`
submodule.
* Add first pass at GuestArray
* Return &'a [T] instead of Vec<Refs>
Also, add a sanity test for `GuestArray`.
* Change approach in GuestArray::as_ref
The new approach should avoid unnecessary copying (does it?) by
iterating through the memory and firstly validating the guest pointers,
to then extracting the slice `&'a [T]` using the unsafe `slice::from_raw_parts`
fn.
* Redo implementation of GuestArray and GuestArrayMut
This commit:
* redos the impl of `as_ref` and `as_ref_mut` for `GuestArray` and
`GuestArrayMut` structs in that they now return dynamically borrow
checked `GuestArrayRef` and `GuestArrayRefMut` structs
* introduces `GuestArrayRef` and `GuestArrayRefMut` structs which
perform dynamic borrow-checking of memory region at runtime, and
can be derefed to `&[T]` and `&mut [T]`
* adds a few sanity checks for the introduced types
* Rename r#ref to ref_
* Add constructors for GuestArray and GuestArrayMut
This commit:
* adds constructors for `GuestArray` and `GuestArrayMut` both of
which can now *only* be constructed from `GuestPtr::array` and
`GuestPtrMut::array_mut` respectively
* changes `Region::extend` to extend the region by adding to the
current `len` (avoids problem of asserting for > 0)
* implements `fmt::Debug` for most of memory types for easier testing
(implementation is *not* enforced on the generic parameter `T` in
the struct; rather, if `T` impls `fmt::Debug` then so does the
memory type such as `GuestPtr<'_, T>`)
* Add some basic sanity tests for Region
This commit adds some basic sanity tests for `overlap` method
of `Region`.
* Refactor overlaps method of Region struct
This commit refactors `Region::overlaps` method.
* Add some docs
* Assert Region's len is nonzero
* generator: take an &mut GuestMemory
rather than pass the owned GuestMemory in, just give exclusive access
to it. Makes testing easier.
* tests: start transforming tests to check abi-level generated code as well
* finish lowering of test funcs
* tests: rename variables to more sensible names
* proptesting: reliably finds that we dont allow stuff to be right against end of memory!
* memory: fix off-by-one calc in GuestMemory::contains(&self, Region)
ty proptest!
also, refactored the Region::overlaps to be the same code but easier to
read.
* generator: better location information in GuestError
* testing: proptest generates memory areas, tests everything
* Add some (incomplete set) basic sanity end-to-end tests
This commit adds some (an incomplete set of) basic sanity end-to-end
tests. It uses `test.witx` to autogenerate types and module interface
functions (aka the syscalls), and tests their implementation. For
the host memory, it uses simplistic `&mut [u8]` where we have full
control of the addressing and contents.
* Add sanity test for baz interface func
This commit adds a sanity test for the `Foo::baz` interface func.
* Upcast start/len for Region to avoid overflow
* Reenable alignment checking for memory
* use an array to implement hostmemory
Co-authored-by: Pat Hickey <pat@moreproductive.org>