I don't think this has happened in awhile but I've run a `cargo update`
as well as trimming some of the duplicate/older dependencies in
`Cargo.lock` by updating some of our immediate dependencies as well.
Currently the runtime needs to acquire the current stack pointer so it
can set a limit for where if the wasm stack goes below that point it
will abort the wasm code. Acquiring the stack pointer is done in a
brittle way right now which involves looking at the address of what we
hope is an on-stack structure. This turns out to not work at all with
ASan as well.
Instead this commit switches to the `psm` crate which is used by the
Rust compiler team for stack manipulation, namely a coarse version of
segmented stacks to avoid stack overflow in the compiler. We don't need
most of the implementation of `psm`, just the `stack_pointer` function,
but it shouldn't be a burden to bring in!
Closes#2344
This commit refactors where trampolines and signature information is
stored within a `Store`, namely moving them from
`wasmtime_runtime::Instance` instead to `Store` itself. The goal here is
to remove an allocation inside of an `Instance` and make them a bit
cheaper to create. Additionally this should open up future possibilities
like not creating duplicate trampolines for signatures already in the
`Store` when using `Func::new`.
Similar to other data structures owned by the `Store` there's no need
for `Instance` to have a strong `Arc` reference, instead it's sufficient
for `Store` to have the owning reference.
This commit adds initial (gated) support for the multi-memory wasm
proposal. This was actually quite easy since almost all of wasmtime
already expected multi-memory to be implemented one day. The only real
substantive change is the `memory.copy` intrinsic changes, which now
accounts for the source/destination memories possibly being different.
According to wasm's spec, nearest must do the following, for NaN inputs:
- when the input is a canonical NaN, return a canonical NaN;
- when the input is a non-canonical NaN, return an arithmetic NaN.
This patch adds checks when the exponent is all ones if the input was a
NaN, and will set the significand's most significant bit in that case.
It works both for canonical inputs (which already had the bit set) and
makes other NaN inputs canonical.
use approach with copysign for handling negative zero
format
refactor for better branch prediction
move copysign back to internal branch
format
fix
use abs instead branches
better comments
switch arms for better branch prediction
Although the string description for `TableOutOfBounds` isn't quite
matching what this error case is, it's a bit more descriptive than
`HeapOutOfBounds` anyway.
This commit removes all import resolution handling from the
`wasmtime-jit` crate, instead moving the logic to the `wasmtime` crate.
Previously `wasmtime-jit` had a generic `Resolver` trait and would do
all the import type matching itself, but with the upcoming
module-linking implementation this is going to get much trickier.
The goal of this commit is to centralize all meaty "preparation" logic
for instantiation into one location, probably the `wasmtime` crate
itself. Instantiation will soon involve recursive instantiation and
management of alias definitions as well. Having everything in one
location, especially with access to `Store` so we can persist
instances for safety, will be quite convenient.
Additionally the `Resolver` trait isn't really necessary any more since
imports are, at the lowest level, provided as a list rather than a map
of some kind. More generic resolution functionality is provided via
`Linker` or user layers on top of `Instance::new` itself. This makes
matching up provided items to expected imports much easier as well.
Overall this is largely just moving code around, but most of the code
in the previous `resolve_imports` phase can be deleted since a lot of it
is handled by surrounding pieces of `wasmtime` as well.
This was added long ago at this point to assist with caching, but
caching has moved to a different level such that this wonky second level
of a `Module` isn't necessary. This commit removes the `ModuleLocal`
type to simplify accessors and generally make it easier to work with.
* wasmtime: Implement `global.{get,set}` for externref globals
We use libcalls to implement these -- unlike `table.{get,set}`, for which we
create inline JIT fast paths -- because no known toolchain actually uses
externref globals.
Part of #929
* wasmtime: Enable `{extern,func}ref` globals in the API
This new fuzz target exercises sequences of `table.get`s, `table.set`s, and
GCs.
It already found a couple bugs:
* Some leaks due to ref count cycles between stores and host-defined functions
closing over those stores.
* If there are no live references for a PC, Cranelift can avoid emiting an
associated stack map. This was running afoul of a debug assertion.
These instructions have fast, inline JIT paths for the common cases, and only
call out to host VM functions for the slow paths. This required some changes to
`cranelift-wasm`'s `FuncEnvironment`: instead of taking a `FuncCursor` to insert
an instruction sequence within the current basic block,
`FuncEnvironment::translate_table_{get,set}` now take a `&mut FunctionBuilder`
so that they can create whole new basic blocks. This is necessary for
implementing GC read/write barriers that involve branching (e.g. checking for
null, or whether a store buffer is at capacity).
Furthermore, it required that the `load`, `load_complex`, and `store`
instructions handle loading and storing through an `r{32,64}` rather than just
`i{32,64}` addresses. This involved making `r{32,64}` types acceptable
instantiations of the `iAddr` type variable, plus a few new instruction
encodings.
Part of #929
Also add configuration to CI to fail doc generation if any links are
broken. Unfortunately we can't blanket deny all warnings in rustdoc
since some are unconditional warnings, but for now this is hopefully
good enough.
Closes#1947
`funcref`s are implemented as `NonNull<VMCallerCheckedAnyfunc>`.
This should be more efficient than using a `VMExternRef` that points at a
`VMCallerCheckedAnyfunc` because it gets rid of an indirection, dynamic
allocation, and some reference counting.
Note that the null function reference is *NOT* a null pointer; it is a
`VMCallerCheckedAnyfunc` that has a null `func_ptr` member.
Part of #929
This commit enables `wasmtime_runtime::Table` to internally hold elements of
either `funcref` (all that is currently supported) or `externref` (newly
introduced in this commit).
This commit updates `Table`'s API, but does NOT generally propagate those
changes outwards all the way through the Wasmtime embedding API. It only does
enough to get everything compiling and the current test suite passing. It is
expected that as we implement more of the reference types spec, we will bubble
these changes out and expose them to the embedding API.
This allows us to detect when stack walking has failed to walk the whole stack,
and we are potentially missing on-stack roots, and therefore it would be unsafe
to do a GC because we could free objects too early, leading to use-after-free.
When we detect this scenario, we skip the GC.
For host VM code, we use plain reference counting, where cloning increments
the reference count, and dropping decrements it. We can avoid many of the
on-stack increment/decrement operations that typically plague the
performance of reference counting via Rust's ownership and borrowing system.
Moving a `VMExternRef` avoids mutating its reference count, and borrowing it
either avoids the reference count increment or delays it until if/when the
`VMExternRef` is cloned.
When passing a `VMExternRef` into compiled Wasm code, we don't want to do
reference count mutations for every compiled `local.{get,set}`, nor for
every function call. Therefore, we use a variation of **deferred reference
counting**, where we only mutate reference counts when storing
`VMExternRef`s somewhere that outlives the activation: into a global or
table. Simultaneously, we over-approximate the set of `VMExternRef`s that
are inside Wasm function activations. Periodically, we walk the stack at GC
safe points, and use stack map information to precisely identify the set of
`VMExternRef`s inside Wasm activations. Then we take the difference between
this precise set and our over-approximation, and decrement the reference
count for each of the `VMExternRef`s that are in our over-approximation but
not in the precise set. Finally, the over-approximation is replaced with the
precise set.
The `VMExternRefActivationsTable` implements the over-approximized set of
`VMExternRef`s referenced by Wasm activations. Calling a Wasm function and
passing it a `VMExternRef` moves the `VMExternRef` into the table, and the
compiled Wasm function logically "borrows" the `VMExternRef` from the
table. Similarly, `global.get` and `table.get` operations clone the gotten
`VMExternRef` into the `VMExternRefActivationsTable` and then "borrow" the
reference out of the table.
When a `VMExternRef` is returned to host code from a Wasm function, the host
increments the reference count (because the reference is logically
"borrowed" from the `VMExternRefActivationsTable` and the reference count
from the table will be dropped at the next GC).
For more general information on deferred reference counting, see *An
Examination of Deferred Reference Counting and Cycle Detection* by Quinane:
https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf
cc #929Fixes#1804
These libcalls are useful for 32-bit platforms.
On x86_32 in particular, commit 4ec16fa0 added support for legalizing
64-bit shifts through SIMD operations. However, that legalization
requires SIMD to be enabled and SSE 4.1 to be supported, which is not
acceptable as a hard requirement.
* Moves CodeMemory, VMInterrupts and SignatureRegistry from Compiler
* CompiledModule holds CodeMemory and GdbJitImageRegistration
* Store keeps track of its JIT code
* Makes "jit_int.rs" stuff Send+Sync
* Adds the threads example.
If you aren't expecting `VMExternRef`'s pointer-equality semantics, then these
trait implementations can be foot guns. Instead of implementing the trait, make
free functions in the `VMExternRef` namespace. This way, callers have to be a
little more explicit.
This is enough to get an `externref -> externref` identity function
passing.
However, `externref`s that are dropped by compiled Wasm code are (safely)
leaked. Follow up work will leverage cranelift's stack maps to resolve this
issue.
`VMExternRef` is a reference-counted box for any kind of data that is
external and opaque to running Wasm. Sometimes it might hold a Wasmtime
thing, other times it might hold something from a Wasmtime embedder and is
opaque even to us. It is morally equivalent to `Rc<dyn Any>` in Rust, but
additionally always fits in a pointer-sized word. `VMExternRef` is
non-nullable, but `Option<VMExternRef>` is a null pointer.
The one part of `VMExternRef` that can't ever be opaque to us is the
reference count. Even when we don't know what's inside an `VMExternRef`, we
need to be able to manipulate its reference count as we add and remove
references to it. And we need to do this from compiled Wasm code, so it must
be `repr(C)`!
`VMExternRef` itself is just a pointer to an `VMExternData`, which holds the
opaque, boxed value, its reference count, and its vtable pointer.
The `VMExternData` struct is *preceded* by the dynamically-sized value boxed
up and referenced by one or more `VMExternRef`s:
```ignore
,-------------------------------------------------------.
| |
V |
+----------------------------+-----------+-----------+ |
| dynamically-sized value... | ref_count | value_ptr |---'
+----------------------------+-----------+-----------+
| VMExternData |
+-----------------------+
^
+-------------+ |
| VMExternRef |-------------------+
+-------------+ |
|
+-------------+ |
| VMExternRef |-------------------+
+-------------+ |
|
... ===
|
+-------------+ |
| VMExternRef |-------------------'
+-------------+
```
The `value_ptr` member always points backwards to the start of the
dynamically-sized value (which is also the start of the heap allocation for
this value-and-`VMExternData` pair). Because it is a `dyn` pointer, it is
fat, and also points to the value's `Any` vtable.
The boxed value and the `VMExternRef` footer are held a single heap
allocation. The layout described above is used to make satisfying the
value's alignment easy: we just need to ensure that the heap allocation used
to hold everything satisfies its alignment. It also ensures that we don't
need a ton of excess padding between the `VMExternData` and the value for
values with large alignment.
* Reactor support.
This implements the new WASI ABI described here:
https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md
It adds APIs to `Instance` and `Linker` with support for running
WASI programs, and also simplifies the process of instantiating
WASI API modules.
This currently only includes Rust API support.
* Add comments and fix a typo in a comment.
* Fix a rustdoc warning.
* Tidy an unneeded `mut`.
* Factor out instance initialization with `NewInstance`.
This also separates instantiation from initialization in a manner
similar to https://github.com/bytecodealliance/lucet/pull/506.
* Update fuzzing oracles for the API changes.
* Remove `wasi_linker` and clarify that Commands/Reactors aren't connected to WASI.
* Move Command/Reactor semantics into the Linker.
* C API support.
* Fix fuzzer build.
* Update usage syntax from "::" to "=".
* Remove `NewInstance` and `start()`.
* Elaborate on Commands and Reactors and add a spec link.
* Add more comments.
* Fix wat syntax.
* Fix wat.
* Use the `Debug` formatter to format an anyhow::Error.
* Fix wat.