Commit Graph

14 Commits

Author SHA1 Message Date
Jakub Konka
1f6890e070 [wasi-common] Clean up the use of mutable Entry (#1395)
* Clean up the use of mutable Entry

Until now, several syscalls including `fd_pwrite` etc. were relying on
mutating `&mut Entry` by mutating its inner file handle. This is
unnecessary in almost all cases since all methods mutating `std::fs::File`
in Rust's libstd are also implemented for `&std::fs::File`.

While here, I've also modified `OsHandle` in BSD to include `RefCell<Option<Dir>>`
rather than `Option<Mutex<Dir>>` as was until now. While `RefCell`
could easily be replaced with `RefCell`, since going multithreading
will require a lot of (probably even) conceptual changes to `wasi-common`,
I thought it'd be best not to mix single- with multithreading contexts
and swap all places at once when it comes to it.

I've also had to make some modifications to virtual FS which mainly
swapped mutability for interior mutability in places.

* Use one-liners wherever convenient
2020-03-25 14:00:52 +01:00
Jakub Konka
32595faba5 It's wiggle time! (#1202)
* Use wiggle in place of wig in wasi-common

This is a rather massive commit that introduces `wiggle` into the
picture. We still use `wig`'s macro in `old` snapshot and to generate
`wasmtime-wasi` glue, but everything else is now autogenerated by `wiggle`.
In summary, thanks to `wiggle`, we no longer need to worry about
serialising and deserialising to and from the guest memory, and
all guest (WASI) types are now proper idiomatic Rust types.

While we're here, in preparation for the ephemeral snapshot, I went
ahead and reorganised the internal structure of the crate. Instead of
modules like `hostcalls_impl` or `hostcalls_impl::fs`, the structure
now resembles that in ephemeral with modules like `path`, `fd`, etc.
Now, I'm not requiring we leave it like this, but I reckon it looks
cleaner this way after all.

* Fix wig to use new first-class access to caller's mem

* Ignore warning in proc_exit for the moment

* Group unsafes together in args and environ calls

* Simplify pwrite; more unsafe blocks

* Simplify fd_read

* Bundle up unsafes in fd_readdir

* Simplify fd_write

* Add comment to path_readlink re zero-len buffers

* Simplify unsafes in random_get

* Hide GuestPtr<str> to &str in path::get

* Rewrite pread and pwrite using SeekFrom and read/write_vectored

I've left the implementation of VirtualFs pretty much untouched
as I don't feel that comfortable in changing the API too much.
Having said that, I reckon `pread` and `pwrite` could be refactored
out, and `preadv` and `pwritev` could be entirely rewritten using
`seek` and `read_vectored` and `write_vectored`.

* Add comment about VirtFs unsafety

* Fix all mentions of FdEntry to Entry

* Fix warnings on Win

* Add aux struct EntryTable responsible for Fds and Entries

This commit adds aux struct `EntryTable` which is private to `WasiCtx`
and is basically responsible for `Fd` alloc/dealloc as well as storing
matching `Entry`s. This struct is entirely private to `WasiCtx` and
as such as should remain transparent to `WasiCtx` users.

* Remove redundant check for empty buffer in path_readlink

* Preserve and rewind file cursor in pread/pwrite

* Use GuestPtr<[u8]>::copy_from_slice wherever copying bytes directly

* Use GuestPtr<[u8]>::copy_from_slice in fd_readdir

* Clean up unsafes around WasiCtx accessors

* Fix bugs in args_get and environ_get

* Fix conflicts after rebase
2020-03-20 21:54:44 +01:00
Jakub Konka
7228a248c1 [wasi-common] add custom FdPool container for managing fd allocs/deallocs (#1329)
* Rename FdEntry to Entry

* Add custom FdSet container for managing fd allocs/deallocs

This commit adds a custom `FdSet` container which is intended
for use in `wasi-common` to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handles

```rust
fd = fd.checked_add(1).ok_or(...)?;
```

and to make it possible to reuse unused/reclaimed handles
which currently is not done.

The struct offers 3 methods to manage its functionality:
* `FdSet::new` initialises the internal data structures,
  and most notably, it preallocates an `FdSet::BATCH_SIZE`
  worth of handles in such a way that we always start popping
  from the "smallest" handle (think of it as of reversed stack,
  I guess; it's not a binary heap since we don't really care
  whether internally the handles are sorted in some way, just that
  the "largets" handle is at the bottom. Why will become clear
  when describing `allocate` method.)
* `FdSet::allocate` pops the next available handle if one is available.
  The tricky bit here is that, if we run out of handles, we preallocate
  the next `FdSet::BATCH_SIZE` worth of handles starting from the
  latest popped handle (i.e., the "largest" handle). This
  works only because we make sure to only ever pop and push already
  existing handles from the back, and push _new_ handles (from the
  preallocation step) from the front. When we ultimately run out
  of _all_ available handles, we then return `None` for the client
  to handle in some way (e.g., throwing an error such as `WasiError::EMFILE`
  or whatnot).
* `FdSet::deallocate` returns the already allocated handle back to
  the pool for further reuse.

When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
`O(1)~*` performance for both (if my maths is right, and it may very
well not be, so please verify!).

In order to keep `FdSet` fairly generic, I've made sure not to hard-code
it for the current type system generated by `wig` (i.e., `wasi::__wasi_fd_t`
representing WASI handle), but rather, any type which wants to be managed
by `FdSet` needs to conform to `Fd` trait. This trait is quite simple as
it only requires a couple of rudimentary traits (although `std:#️⃣:Hash`
is quite a powerful assumption here!), and a custom method

```rust
Fd::next(&self) -> Option<Self>;
```

which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simply `u32::checked_add(1)`.
When `wiggle` makes it way into the `wasi-common`, I'd imagine it being
similar to

```rust
fn next(&self) -> Option<Self> {
    self.0.checked_add(1).map(Self::from)
}
```

Anyhow, I'd be happy to learn your thoughts about this design!

* Fix compilation on other targets

* Rename FdSet to FdPool

* Fix FdPool unit tests

* Skip preallocation step in FdPool

* Replace 'replace' calls with direct assignment

* Reuse FdPool from snapshot1 in snapshot0

* Refactor FdPool::allocate

* Remove entry before deallocating the fd

* Refactor the design to accommodate `u32` as underlying type

This commit refactors the design by ensuring that the underlying
type in `FdPool` which we use to track and represent raw file
descriptors is `u32`. As a result, the structure of `FdPool` is
simplified massively as we no longer need to track the claimed
descriptors; in a way, we trust the caller to return the handle
after it's done with it. In case the caller decides to be clever
and return a handle which was not yet legally allocated, we panic.
This should never be a problem in `wasi-common` unless we hit a
bug.

To make all of this work, `Fd` trait is modified to require two
methods: `as_raw(&self) -> u32` and `from_raw(raw_fd: u32) -> Self`
both of which are used to convert to and from the `FdPool`'s underlying
type `u32`.
2020-03-17 22:58:49 +01:00
Jakub Konka
773915b4bf [wasi-common]: clean up error handling (#1253)
* Introduce WasiCtxBuilderError error type

`WasiCtxBuilderError` is the `wasi-common` client-facing error type
which is exclusively thrown when building a new `WasiCtx` instance.
As such, building such an instance should not require the client to
understand different WASI errno values as was assumed until now.

This commit is a first step at streamlining error handling in
`wasi-common` and makes way for the `wiggle` crate.

When adding the `WasiCtxBuilderError`, I've had to do two things of
notable importance:
1. I've removed a couple of `ok_or` calls in `WasiCtxBuilder::build`
   and replaced them with `unwrap`s, following the same pattern in
   different builder methods above. This is fine since we _always_
   operate on non-empty `Option`s in `WasiCtxBuilder` thus `unwrap`ing
   will never fail. On the other hand, this might be a good opportunity
   to rethink the structure of our builder, and how we good remove
   the said `Option`s especially since we always populate them with
   empty containers to begin with. I understand this is to make
   chaining of builder methods easier which take and return `&mut self`
   and the same applies to `WasiCtxBuilder::build(&mut self)` method,
   but perhaps it would more cleanly signal the intentions if we simply
   moved `WasiCtxBuilder` instance around. Food for thought!
2. Methods specific to determining rights of passed around `std::fs::File`
   objects when populating `WasiCtx` `FdEntry` entities now return
   `io::Error` directly so that we can reuse them in `WasiCtxBuilder` methods
   (returning `WasiCtxBuilderError` error type), and in syscalls
   (returning WASI errno).

* Return WasiError directly in syscalls

Also, removes `error::Error` type altogether. Now, `io::Error` and
related are automatically converted to their corresponding WASI
errno value encapsulated as `WasiError`.

While here, it made sense to me to move `WasiError` to `wasi` module
which will align itself well with the upcoming changes introduced
by `wiggle`. To different standard `Result` from WASI specific, I've
created a helper alias `WasiResult` also residing in `wasi` module.

* Update wig

* Add from ffi::NulError and pass context to NotADirectory

* Add dummy commit to test CI
2020-03-09 22:58:55 +01:00
Jakub Konka
061390ee1b [wasi-common]: move filetime module to yanix (#1255)
* Move filetime module to yanix

I've noticed that we could replace every occurrence of `crate::Result`
in `filetime` mods with `io::Result`, so I thought why not move it
to `yanix` and get rid off a lot of unnecessary code duplication
within `wasi-common`. Now, ideally I'd have our `filetime` modifications
backported to Alex's [`filetime`] crate, but one step at a time
(apologies Alex, I was meant to backport this ages ago, just didn't
find the time yet... :-().

Anyway, this commit does just that; i.e., moves the `filetime` modules
into `yanix` which seems a better fit for this type of code.

[`filetime`]: https://github.com/alexcrichton/filetime

There is one caveat here. On Emscripten, converting between `filetime::Filetime`
and `libc::timespec` appears to be lossy, at least as far as the
types are concerned. Now, `filetime::Filetime`'s seconds field is
`i64` while nanoseconds field is `u32`, while Emscripten's
`libc::timespec` requires both to be `i32` width. This might actually
not be a problem since I don't think it's possible to fill `filetime::Filetime`
struct with values of width wider than `i32` since Emscripten is 32bit
but just to be on the safe side, we do a `TryInto` conversion, log
the error (if any), and return `libc::EOVERFLOW`.

* Run cargo fmt

* Use i64::from instead of as cast
2020-03-09 16:07:09 +01:00
iximeow
7e0d9decbf Virtual file support (#701)
* Add support for virtual files (eg, not backed by an OS file).

Virtual files are implemented through trait objects, with a default
implementation that tries to behave like on-disk files, but entirely
backed by in-memory structures.

Co-authored-by: Dan Gohman <sunfish@mozilla.com>
2020-03-06 11:08:13 -08:00
Alex Crichton
054b79427e Fix the path_filestat test on Linux (#706)
Only very recently in #700 did we actually start running wasi tests
again (they weren't running by accident). Just before that landed we
also landed #688 which had some refactorings. Unfortunately #688 had a
minor issue in it which wasn't caught because tests weren't run. This
means that the bug in #688 slipped in and is now being caught by #700
now that both are landed on master.

This commit fixes the small issue introduced and should get our CI green
again!
2019-12-12 15:19:58 -08:00
Jakub Konka
95c2addf15 Compile wasi-common to Emscripten (#688)
* Compile wasi-common to Emscripten

This commit enables cross-compiling of `wasi-common` to Emscripten. To achieve
this, this commit does quite a bit reshuffling in the existing codebase. Namely,
* rename `linux` modules in `wasi-common` and `yanix` to `linux_like` -- this is
  needed so that we can separate out logic specific to Linux and Emscripten out
* tweak `dir` module in `yanix` to support Emscripten -- in particular, the main
  change involves `SeekLoc::from_raw` which has to be now host-specific, and is now
  fallible
* tweak `filetime` so that in Emscripten we never check for existence of `utimensat`
  at runtime since we are guaranteed for it to exist by design
* since `utimes` and `futimes` are not present in Emscripten, move them into a separate
  module, `utimesat`, and tag it cfg-non-emscripten only
* finally, `to_timespec` is now fallible since on Emscripten we have to cast number of
  seconds, `FileTime::seconds` from `i64` to `libc::c_long` which resolves to `i32`
  unlike on other nixes

* Fix macos build

* Verify wasi-common compiles to Emscripten

This commit adds `emscripten` job to Github Actions which installs
`wasm32-unknown-emscripten` target, and builds `wasi-common` crate.

* Use #[path] to cherry-pick mods for Emscripten

This commit effectively reverses the reorg introduced in 145f4a5
in that it ditches `linux_like` mod for separate mods `linux` and
`emscripten` which are now on the same crate level, and instead,
pulls in common bits from `linux` using the `#[path = ..]` proc
macro.
2019-12-11 16:25:13 -08:00
Jakub Konka
51f880f625 Add yanix crate and replace nix with yanix in wasi-common (#649)
* Add yanix crate

This commit adds `yanix` crate as a Unix dependency for `wasi-common`.
`yanix` stands for Yet Another Nix crate and is exactly what the name
suggests: a crate in the spirit of the `nix` crate, but which takes a different
approach, using lower-level interfaces with less abstraction, so that it fits
better with its main use case, implementation of WASI syscalls.

* Replace nix with yanix crate

Having introduced `yanix` crate as an in-house replacement for the
`nix` crate, this commit makes the necessary changes to `wasi-common`
to depend _only_ on `yanix` crate.

* Address review comments

* make `fd_dup` unsafe
* rename `get_fd` to `get_fd_flags`, etc.
* reuse `io::Error::last_os_error()` to get the last errno value

* Address more comments

* make all `fcntl` fns unsafe
* adjust `wasi-common` impl appropriately

* Make all fns operating on RawFd unsafe

* Fix linux build

* Address more comments
2019-12-08 16:40:05 -08:00
Dan Gohman
1f9d764d5d Support fd_fdstat_get and fd_renumber on stdin/stdout/stderr (#631)
* Support fd_fdstat_get on stdin/stdout/stderr.

Add a routine for obtaining an `OsFile` containing a file descriptor for
stdin/stdout/stderr so that we can do fd_fdstat_get on them.

* Add a testcase for fd_fdstat_get etc. on stdin etc.

* Don't dup file descriptors in fd_renumber.

* Fix compilation on macOS

* Rename OsFile to OsHandle

This commits renames `OsFile` to `OsHandle` which seems to make
more sense semantically as it is permitted to hold a valid OS handle
to OS entities other than simply file/dir (e.g., socket, stream, etc.).
As such, this commit also renames methods on `Descriptor` struct
from `as_actual_file` to `as_file` as this in reality does pertain
ops on FS entities such as files/dirs, and `as_file` to `as_os_handle`
as in this case it can be anything, from file, through a socket, to
a stream.

* Fix compilation on Linux

* Introduce `OsHandleRef` for borrowing OS resources.

To prevent a `ManuallyDrop<OsHandleRef>` from outliving the resource it
holds on to, create an `OsHandleRef` class parameterized on the lifetime
of the `Descriptor`.

* Fix scoping to pub-priv and backport to snapshot_0
2019-11-28 14:36:18 +01:00
Jakub Konka
64f9cee842 Fix build errors on nightly
Workaround for a regression in upstream rust-lang/rust.
2019-11-25 23:53:02 +01:00
Jakub Konka
c45f70999a Unify fd_readdir impl between *nixes (#613)
* Unify fd_readdir impl between *nixes

This commit unifies the implementation of `fd_readdir` between Linux
and BSD hosts. In particular, it re-uses the `Dirent`, `Entry`, and
`Dir` (among others) building blocks introduced recently when
`fd_readdir` was being implemented on Windows.

Notable changes:
* on BSD, wraps `readdir` syscall in an `Iterator` of the mutex-locked
  `Dir` struct
* on BSD, removes `DirStream` struct from `OsFile`; `OsFile` now holds a
  mutex to `Dir`
* makes `Dir` iterators implementation specific (Linux has its own,
  and so does BSD)

* Lock mutex once only; explain dir in OsFile

* Add more comments
2019-11-24 10:29:55 +01:00
Jakub Konka
0006a2af95 Dynamically load utimensat if exists on the host (#535)
* Dynamically load utimensat if exists on the host

This commit introduces a change to file time management for *nix based
hosts in that it firstly tries to load `utimensat` symbol, and if it
doesn't exist, then falls back to `utimes` instead. This change is
borrowing very heavily from [filetime] crate, however, it introduces a
couple of helpers and methods specific to WASI use case (or more
generally, to a use case which requires modifying times of entities
specified by a pair `(DirFD, RelativePath)` rather than the typical
file time specification based only absolute path or raw file descriptor
as is the case with [filetime] crate. The trick here is, that on kernels
which do not have `utimensat` symbol, this implementation emulates this
behaviour by a combination of `openat` and `utimes`.

This commit also is meant to address #516.

[filetime]: https://github.com/alexcrichton/filetime

* Fix symlink NOFOLLOW flag setting

* Add docs and specify UTIME_NOW/OMIT on Linux

Previously, we relied on [libc] crate for `UTIME_NOW` and `UTIME_OMIT`
constants on Linux. However, following the convention assumed in
[filetime] crate, this is now changed to directly specified by us
in our crate.

[libc]: https://github.com/rust-lang/libc
[filetime]: https://github.com/alexcrichton/filetime

* Refactor UTIME_NOW/OMIT for BSD

* Address final discussion points
2019-11-11 11:42:28 -06:00
Dan Gohman
22641de629 Initial reorg.
This is largely the same as #305, but updated for the current tree.
2019-11-08 06:35:40 -08:00