tracing is already the dep that wiggle uses.
I used tracing structured arguments wherever I could, but I skipped over
it in all of the snapshot_0 code, because I'm going to delete that code
and replace it with wiggle-based stuff real soon.
* Rename `OFlag`/`AtFlag` to `OFlags`/`AtFlags`.
This makes them consistent with `PollFlags` and common usage of
bitflags types in Rust code in general.
POSIX does tend to use names like `oflag` and `flag`, so this is in mild
disagreement with POSIX style, however I find this particular aspects of
POSIX confusing because these values hold multiple flags.
* rustfmt
* Introduce strongly-typed system primitives
This commit does a lot of reshuffling and even some more. It introduces
strongly-typed system primitives which are: `OsFile`, `OsDir`, `Stdio`,
and `OsOther`. Those primitives are separate structs now, each implementing
a subset of `Handle` methods, rather than all being an enumeration of some
supertype such as `OsHandle`. To summarise the structs:
* `OsFile` represents a regular file, and implements fd-ops
of `Handle` trait
* `OsDir` represents a directory, and primarily implements path-ops, plus
`readdir` and some common fd-ops such as `fdstat`, etc.
* `Stdio` represents a stdio handle, and implements a subset of fd-ops
such as `fdstat` _and_ `read_` and `write_vectored` calls
* `OsOther` currently represents anything else and implements a set similar
to that implemented by `Stdio`
This commit is effectively an experiment and an excercise into better
understanding what's going on for each OS resource/type under-the-hood.
It's meant to give us some intuition in order to move on with the idea
of having strongly-typed handles in WASI both in the syscall impl as well
as at the libc level.
Some more minor changes include making `OsHandle` represent an OS-specific
wrapper for a raw OS handle (Unix fd or Windows handle). Also, since `OsDir`
is tricky across OSes, we also have a supertype of `OsHandle` called
`OsDirHandle` which may store a `DIR*` stream pointer (mainly BSD). Last but not
least, the `Filetype` and `Rights` are now computed when the resource is created,
rather than every time we call `Handle::get_file_type` and `Handle::get_rights`.
Finally, in order to facilitate the latter, I've converted `EntryRights` into
`HandleRights` and pushed them into each `Handle` implementor.
* Do not adjust rights on Stdio
* Clean up testing for TTY and escaping writes
* Implement AsFile for dyn Handle
This cleans up a lot of repeating boilerplate code todo with
dynamic dispatch.
* Delegate definition of OsDir to OS-specific modules
Delegates defining `OsDir` struct to OS-specific modules (BSD, Linux,
Emscripten, Windows). This way, `OsDir` can safely re-use `OsHandle`
for raw OS handle storage, and can store some aux data such as an
initialized stream ptr in case of BSD. As a result, we can safely
get rid of `OsDirHandle` which IMHO was causing unnecessary noise and
overcomplicating the design. On the other hand, delegating definition
of `OsDir` to OS-specific modules isn't super clean in and of itself
either. Perhaps there's a better way of handling this?
* Check if filetype of OS handle matches WASI filetype when creating
It seems prudent to check if the passed in `File` instance is of
type matching that of the requested WASI filetype. In other words,
we'd like to avoid situations where `OsFile` is created from a
pipe.
* Make AsFile fallible
Return `EBADF` in `AsFile` in case a `Handle` cannot be made into
a `std::fs::File`.
* Remove unnecessary as_file conversion
* Remove unnecessary check for TTY for Stdio handle type
* Fix incorrect stdio ctors on Unix
* Split Stdio into three separate types: Stdin, Stdout, Stderr
* Rename PendingEntry::File to PendingEntry::OsHandle to avoid confusion
* Rename OsHandle to RawOsHandle
Also, since `RawOsHandle` on *nix doesn't need interior mutability
wrt the inner raw file descriptor, we can safely swap the `RawFd`
for `File` instance.
* Add docs explaining what OsOther is
* Allow for stdio to be non-character-device (e.g., piped)
* Return error on bad preopen rather than panic
* Make Handle a trait required for any WASI-compatible handle
OK, so this PR is a bit of an experiment that came about somewhat itself
when I was looking at refactoring use of `Rc<RefCell<Descriptor>>` inside
`Entry` struct. I've noticed that since we've placed `VirtualFile` on the
same level as `OsHandle` and `Stdin` etc., we've ended up necessiitating
checks for different combinations such as "is a real OS resource being mixed
up with a virtual resource?", and if that was the case, we'd panic since
this was clearly not allowed (e.g., symlinking, or worse renaming).
Therefore, it seemed natural for virtual file to be on the same level
as _any_ OS handle (regardless of whether it's an actual file, socket,
or stdio handle). In other words, we should ideally envision the following
hierarchy:
```
\-- OsHandle \-- OsFile
-- Stdio
\-- Virtual
```
This way, we can deal with the mix up at a level above which cleans up
our logic significantly.
On the other hand, when looking through the `virtfs`, the trait approach
to some type that's a valid `Handle` grew on me, and I think this
is the way to go. And this is what this PR is proposing, a trait
`Handle` which features enough functionality to make both virtual and
OS ops to work. Now, inside `Entry` we can safely store something like
`Rc<dyn Handle>` where `Handle` can downcast to either `VirtualFile` or
`VirtualDir`, or `OsHandle` if its an actual OS resource. Note that
I've left `Handle` as one massive trait, but I reckon we could split
it up into several smaller traits, each dealing with some bit of WASI
functionality. I'm hoping this would perhaps make it easier to figure
out polyfilling between snapshots and the new upcoming ephemeral
snapshot since a lot of boilerplate functionality is now done as part
of the `Handle` trait implementation.
Next, I've redone the original `OsHandle` to be an `OsFile` which
now stores a raw descriptor/handle (`RawFd`/`RawHandle`) inside a
`Cell` so that we can handle interior mutability in an easy (read,
non-panicky) way. In order not to lose the perks of derefercing to
`std::fs::File`, I've added a convenience trait `AsFile` which
will take `OsFile` by reference (or the stdio handles) and create
a non-owned `ManuallyDrop<File>` resource which can be passed around
and acted upon the way we'd normally do on `&File`. This change of
course implies that we now have to worry about properly closing all
OS resources stored as part of `OsFile`, thus this type now implements
`Drop` trait which essentially speaking moves the raw descriptor/handle
into a `File` and drops it.
Finally, I've redone setting time info on relative paths on *nix using
the same approach as advocated in the virtual fs. Namely, we do an
`openat` followed by `filestat_set_times` on the obtained descriptor.
This effectively removes the need for custom `filetime` module in
`yanix`. However, this does probably incur additional cost of at least
one additional syscall, and I haven't checked whether this approach
performs as expected on platforms such as NixOS which as far as I remember
had some weirdness todo with linking `utimensat` symbols, etc. Still,
this change is worth considering given that the implementation of
`path_filestat_set_times` cleans up a lot, albeit with some additional
cost.
* Fix tests on Windows
* Address comments plus minor consistency cleanup
* Address comments
* Fix formatting
* Use wiggle in place of wig in wasi-common
This is a rather massive commit that introduces `wiggle` into the
picture. We still use `wig`'s macro in `old` snapshot and to generate
`wasmtime-wasi` glue, but everything else is now autogenerated by `wiggle`.
In summary, thanks to `wiggle`, we no longer need to worry about
serialising and deserialising to and from the guest memory, and
all guest (WASI) types are now proper idiomatic Rust types.
While we're here, in preparation for the ephemeral snapshot, I went
ahead and reorganised the internal structure of the crate. Instead of
modules like `hostcalls_impl` or `hostcalls_impl::fs`, the structure
now resembles that in ephemeral with modules like `path`, `fd`, etc.
Now, I'm not requiring we leave it like this, but I reckon it looks
cleaner this way after all.
* Fix wig to use new first-class access to caller's mem
* Ignore warning in proc_exit for the moment
* Group unsafes together in args and environ calls
* Simplify pwrite; more unsafe blocks
* Simplify fd_read
* Bundle up unsafes in fd_readdir
* Simplify fd_write
* Add comment to path_readlink re zero-len buffers
* Simplify unsafes in random_get
* Hide GuestPtr<str> to &str in path::get
* Rewrite pread and pwrite using SeekFrom and read/write_vectored
I've left the implementation of VirtualFs pretty much untouched
as I don't feel that comfortable in changing the API too much.
Having said that, I reckon `pread` and `pwrite` could be refactored
out, and `preadv` and `pwritev` could be entirely rewritten using
`seek` and `read_vectored` and `write_vectored`.
* Add comment about VirtFs unsafety
* Fix all mentions of FdEntry to Entry
* Fix warnings on Win
* Add aux struct EntryTable responsible for Fds and Entries
This commit adds aux struct `EntryTable` which is private to `WasiCtx`
and is basically responsible for `Fd` alloc/dealloc as well as storing
matching `Entry`s. This struct is entirely private to `WasiCtx` and
as such as should remain transparent to `WasiCtx` users.
* Remove redundant check for empty buffer in path_readlink
* Preserve and rewind file cursor in pread/pwrite
* Use GuestPtr<[u8]>::copy_from_slice wherever copying bytes directly
* Use GuestPtr<[u8]>::copy_from_slice in fd_readdir
* Clean up unsafes around WasiCtx accessors
* Fix bugs in args_get and environ_get
* Fix conflicts after rebase
* Rename FdEntry to Entry
* Add custom FdSet container for managing fd allocs/deallocs
This commit adds a custom `FdSet` container which is intended
for use in `wasi-common` to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handles
```rust
fd = fd.checked_add(1).ok_or(...)?;
```
and to make it possible to reuse unused/reclaimed handles
which currently is not done.
The struct offers 3 methods to manage its functionality:
* `FdSet::new` initialises the internal data structures,
and most notably, it preallocates an `FdSet::BATCH_SIZE`
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describing `allocate` method.)
* `FdSet::allocate` pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the next `FdSet::BATCH_SIZE` worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then return `None` for the client
to handle in some way (e.g., throwing an error such as `WasiError::EMFILE`
or whatnot).
* `FdSet::deallocate` returns the already allocated handle back to
the pool for further reuse.
When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
`O(1)~*` performance for both (if my maths is right, and it may very
well not be, so please verify!).
In order to keep `FdSet` fairly generic, I've made sure not to hard-code
it for the current type system generated by `wig` (i.e., `wasi::__wasi_fd_t`
representing WASI handle), but rather, any type which wants to be managed
by `FdSet` needs to conform to `Fd` trait. This trait is quite simple as
it only requires a couple of rudimentary traits (although `std:#️⃣:Hash`
is quite a powerful assumption here!), and a custom method
```rust
Fd::next(&self) -> Option<Self>;
```
which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simply `u32::checked_add(1)`.
When `wiggle` makes it way into the `wasi-common`, I'd imagine it being
similar to
```rust
fn next(&self) -> Option<Self> {
self.0.checked_add(1).map(Self::from)
}
```
Anyhow, I'd be happy to learn your thoughts about this design!
* Fix compilation on other targets
* Rename FdSet to FdPool
* Fix FdPool unit tests
* Skip preallocation step in FdPool
* Replace 'replace' calls with direct assignment
* Reuse FdPool from snapshot1 in snapshot0
* Refactor FdPool::allocate
* Remove entry before deallocating the fd
* Refactor the design to accommodate `u32` as underlying type
This commit refactors the design by ensuring that the underlying
type in `FdPool` which we use to track and represent raw file
descriptors is `u32`. As a result, the structure of `FdPool` is
simplified massively as we no longer need to track the claimed
descriptors; in a way, we trust the caller to return the handle
after it's done with it. In case the caller decides to be clever
and return a handle which was not yet legally allocated, we panic.
This should never be a problem in `wasi-common` unless we hit a
bug.
To make all of this work, `Fd` trait is modified to require two
methods: `as_raw(&self) -> u32` and `from_raw(raw_fd: u32) -> Self`
both of which are used to convert to and from the `FdPool`'s underlying
type `u32`.
* Introduce WasiCtxBuilderError error type
`WasiCtxBuilderError` is the `wasi-common` client-facing error type
which is exclusively thrown when building a new `WasiCtx` instance.
As such, building such an instance should not require the client to
understand different WASI errno values as was assumed until now.
This commit is a first step at streamlining error handling in
`wasi-common` and makes way for the `wiggle` crate.
When adding the `WasiCtxBuilderError`, I've had to do two things of
notable importance:
1. I've removed a couple of `ok_or` calls in `WasiCtxBuilder::build`
and replaced them with `unwrap`s, following the same pattern in
different builder methods above. This is fine since we _always_
operate on non-empty `Option`s in `WasiCtxBuilder` thus `unwrap`ing
will never fail. On the other hand, this might be a good opportunity
to rethink the structure of our builder, and how we good remove
the said `Option`s especially since we always populate them with
empty containers to begin with. I understand this is to make
chaining of builder methods easier which take and return `&mut self`
and the same applies to `WasiCtxBuilder::build(&mut self)` method,
but perhaps it would more cleanly signal the intentions if we simply
moved `WasiCtxBuilder` instance around. Food for thought!
2. Methods specific to determining rights of passed around `std::fs::File`
objects when populating `WasiCtx` `FdEntry` entities now return
`io::Error` directly so that we can reuse them in `WasiCtxBuilder` methods
(returning `WasiCtxBuilderError` error type), and in syscalls
(returning WASI errno).
* Return WasiError directly in syscalls
Also, removes `error::Error` type altogether. Now, `io::Error` and
related are automatically converted to their corresponding WASI
errno value encapsulated as `WasiError`.
While here, it made sense to me to move `WasiError` to `wasi` module
which will align itself well with the upcoming changes introduced
by `wiggle`. To different standard `Result` from WASI specific, I've
created a helper alias `WasiResult` also residing in `wasi` module.
* Update wig
* Add from ffi::NulError and pass context to NotADirectory
* Add dummy commit to test CI
* Move filetime module to yanix
I've noticed that we could replace every occurrence of `crate::Result`
in `filetime` mods with `io::Result`, so I thought why not move it
to `yanix` and get rid off a lot of unnecessary code duplication
within `wasi-common`. Now, ideally I'd have our `filetime` modifications
backported to Alex's [`filetime`] crate, but one step at a time
(apologies Alex, I was meant to backport this ages ago, just didn't
find the time yet... :-().
Anyway, this commit does just that; i.e., moves the `filetime` modules
into `yanix` which seems a better fit for this type of code.
[`filetime`]: https://github.com/alexcrichton/filetime
There is one caveat here. On Emscripten, converting between `filetime::Filetime`
and `libc::timespec` appears to be lossy, at least as far as the
types are concerned. Now, `filetime::Filetime`'s seconds field is
`i64` while nanoseconds field is `u32`, while Emscripten's
`libc::timespec` requires both to be `i32` width. This might actually
not be a problem since I don't think it's possible to fill `filetime::Filetime`
struct with values of width wider than `i32` since Emscripten is 32bit
but just to be on the safe side, we do a `TryInto` conversion, log
the error (if any), and return `libc::EOVERFLOW`.
* Run cargo fmt
* Use i64::from instead of as cast
* Compile wasi-common to Emscripten
This commit enables cross-compiling of `wasi-common` to Emscripten. To achieve
this, this commit does quite a bit reshuffling in the existing codebase. Namely,
* rename `linux` modules in `wasi-common` and `yanix` to `linux_like` -- this is
needed so that we can separate out logic specific to Linux and Emscripten out
* tweak `dir` module in `yanix` to support Emscripten -- in particular, the main
change involves `SeekLoc::from_raw` which has to be now host-specific, and is now
fallible
* tweak `filetime` so that in Emscripten we never check for existence of `utimensat`
at runtime since we are guaranteed for it to exist by design
* since `utimes` and `futimes` are not present in Emscripten, move them into a separate
module, `utimesat`, and tag it cfg-non-emscripten only
* finally, `to_timespec` is now fallible since on Emscripten we have to cast number of
seconds, `FileTime::seconds` from `i64` to `libc::c_long` which resolves to `i32`
unlike on other nixes
* Fix macos build
* Verify wasi-common compiles to Emscripten
This commit adds `emscripten` job to Github Actions which installs
`wasm32-unknown-emscripten` target, and builds `wasi-common` crate.
* Use #[path] to cherry-pick mods for Emscripten
This commit effectively reverses the reorg introduced in 145f4a5
in that it ditches `linux_like` mod for separate mods `linux` and
`emscripten` which are now on the same crate level, and instead,
pulls in common bits from `linux` using the `#[path = ..]` proc
macro.
* Add yanix crate
This commit adds `yanix` crate as a Unix dependency for `wasi-common`.
`yanix` stands for Yet Another Nix crate and is exactly what the name
suggests: a crate in the spirit of the `nix` crate, but which takes a different
approach, using lower-level interfaces with less abstraction, so that it fits
better with its main use case, implementation of WASI syscalls.
* Replace nix with yanix crate
Having introduced `yanix` crate as an in-house replacement for the
`nix` crate, this commit makes the necessary changes to `wasi-common`
to depend _only_ on `yanix` crate.
* Address review comments
* make `fd_dup` unsafe
* rename `get_fd` to `get_fd_flags`, etc.
* reuse `io::Error::last_os_error()` to get the last errno value
* Address more comments
* make all `fcntl` fns unsafe
* adjust `wasi-common` impl appropriately
* Make all fns operating on RawFd unsafe
* Fix linux build
* Address more comments
* Dynamically load utimensat if exists on the host
This commit introduces a change to file time management for *nix based
hosts in that it firstly tries to load `utimensat` symbol, and if it
doesn't exist, then falls back to `utimes` instead. This change is
borrowing very heavily from [filetime] crate, however, it introduces a
couple of helpers and methods specific to WASI use case (or more
generally, to a use case which requires modifying times of entities
specified by a pair `(DirFD, RelativePath)` rather than the typical
file time specification based only absolute path or raw file descriptor
as is the case with [filetime] crate. The trick here is, that on kernels
which do not have `utimensat` symbol, this implementation emulates this
behaviour by a combination of `openat` and `utimes`.
This commit also is meant to address #516.
[filetime]: https://github.com/alexcrichton/filetime
* Fix symlink NOFOLLOW flag setting
* Add docs and specify UTIME_NOW/OMIT on Linux
Previously, we relied on [libc] crate for `UTIME_NOW` and `UTIME_OMIT`
constants on Linux. However, following the convention assumed in
[filetime] crate, this is now changed to directly specified by us
in our crate.
[libc]: https://github.com/rust-lang/libc
[filetime]: https://github.com/alexcrichton/filetime
* Refactor UTIME_NOW/OMIT for BSD
* Address final discussion points