* Bump the wasm-tools crates
Pulls in some updates here and there, mostly for updating crates to the
latest version to prepare for later memory64 work.
* Update lightbeam
* Start a high-level architecture document for Wasmtime
This commit cleands up some existing documentation by removing a number
of "noop README files" and starting a high-level overview of the
architecture of Wasmtime. I've placed this documentation under the
contributing section of the book since it seems most useful for possible
contributors.
I've surely left some things out in this pass, and am happy to add more!
* Review comments
* More rewording
* typos
* Update wasm-tools crates
This brings in recent updates, notably including more improvements to
wasm-smith which will hopefully help exercise non-trapping wasm more.
* Fix some wat
This adds benchmarks around module instantiation using criterion.
Both the default (i.e. on-demand) and pooling allocators are tested
sequentially and in parallel using a thread pool.
Instantiation is tested with an empty module, a module with a single page
linear memory, a larger linear memory with a data initializer, and a "hello
world" Rust WASI program.
* Upgrade to the latest versions of gimli, addr2line, object
And adapt to API changes. New gimli supports wasm dwarf, resulting in
some simplifications in the debug crate.
* upgrade gimli usage in linux-specific profiling too
* Add "continue" statement after interpreting a wasm local dwarf opcode
* Make `FunctionInfo` public and `CompiledModule::func_info` return it.
* Make the `StackMapLookup` trait unsafe.
* Add comments for the purpose of `EngineHostFuncs`.
* Rework ownership model of shared signatures: `SignatureCollection` in
conjunction with `SignatureRegistry` is now used so that the `Engine`,
`Store`, and `Module` don't need to worry about unregistering shared
signatures.
* Implement `Func::param_arity` and `Func::result_arity` in terms of
`Func::ty`.
* Make looking up a trampoline with the module registry more efficient by doing
a binary search on the function's starting PC value for the owning module and
then looking up the trampoline with only that module.
* Remove reference to the shared signatures from `GlobalRegisteredModule`.
This commit removes the stack map registry and instead uses the existing
information from the store's module registry to lookup stack maps.
A trait is now used to pass the lookup context to the runtime, implemented by
`Store` to do the lookup.
With this change, module registration in `Store` is now entirely limited to
inserting the module into the module registry.
This commit is intended to be a perf improvement for instantiation of
modules with lots of functions. Previously the `lookup_shared_signature`
callback was showing up quite high in profiles as part of instantiation.
As some background, this callback is used to translate from a module's
`SignatureIndex` to a `VMSharedSignatureIndex` which the instance
stores. This callback is called for two reasons, one is to translate all
of the module's own types into `VMSharedSignatureIndex` for the purposes
of `call_indirect` (the translation of that loads from this table to
compare indices). The second reason is that a `VMCallerCheckedAnyfunc`
is prepared for all functions and this embeds a `VMSharedSignatureIndex`
inside of it.
The slow part today is that the lookup callback was called
once-per-function and each lookup involved hashing a full
`WasmFuncType`. Albeit our hash algorithm is still Rust's default
SipHash algorithm which is quite slow, but we also shouldn't need to
re-hash each signature if we see it multiple times anyway.
The fix applied in this commit is to change this lookup callback to an
`enum` where one variant is that there's a table to lookup from. This
table is a `PrimaryMap` which means that lookup is quite fast. The only
thing we need to do is to prepare the table ahead of time. Currently
this happens on the instantiation path because in my measurments the
creation of the table is quite fast compared to the rest of
instantiation. If this becomes an issue, though, we can look into
creating the table as part of `SigRegistry::register_module` and caching
it somewhere (I'm not entirely sure where but I'm sure we can figure it
out).
There's in generally not a ton of efficiency around the `SigRegistry`
type. I'm hoping though that this fixes the next-lowest-hanging-fruit in
terms of performance without complicating the implementation too much. I
tried a few variants and this change seemed like the best balance
between simplicity and still a nice performance gain.
Locally I measured an improvement in instantiation time for a large-ish
module by reducing the time from ~3ms to ~2.6ms per instance.
* Remove `once-cell` dependency.
* Remove function address `BTreeMap` from `CompiledModule` in favor of binary
searching finished functions directly.
* Use `with_capacity` when populating `CompiledModule` finished functions and
trampolines.
This commit refactors the store frame information to eliminate the copying of
data out from `CompiledModule`.
It also moves the population of a `BTreeMap` out of the frame information and
into `CompiledModule` where it is only ever calculated once rather than at
every new module instantiation into a `Store`. The map is also lazy-initialized
so the cost of populating the map is incurred only when a trap occurs.
This should help improve instantiation time of modules with a large number of
functions and functions with lots of instructions.
* Fully support multiple returns in Wasmtime
For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.
The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.
This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.
These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!
Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.
Closes#1178
* Tweak some names and add docs
* "fix" lightbeam compile
* Fix TODO with dummy environ
* Unwind info is a property of the target, not the ABI
* Remove lightbeam unused imports
* Attempt to fix arm64
* Document new ABIs aren't stable
* Fix filetests to use the right target
* Don't always do 64-bit stores with cranelift
This was overwriting upper bits when 32-bit registers were being stored
into return values, so fix the code inline to do a sized store instead
of one-size-fits-all store.
* At least get tests passing on the old backend
* Fix a typo
* Add some filetests with mixed abi calls
* Get `multi` example working
* Fix doctests on old x86 backend
* Add a mixture of wasmtime/system_v tests
This PR switches the default backend on x86, for both the
`cranelift-codegen` crate and for Wasmtime, to the new
(`MachInst`-style, `VCode`-based) backend that has been under
development and testing for some time now.
The old backend is still available by default in builds with the
`old-x86-backend` feature, or by requesting `BackendVariant::Legacy`
from the appropriate APIs.
As part of that switch, it adds some more runtime-configurable plumbing
to the testing infrastructure so that tests can be run using the
appropriate backend. `clif-util test` is now capable of parsing a
backend selector option from filetests and instantiating the correct
backend.
CI has been updated so that the old x86 backend continues to run its
tests, just as we used to run the new x64 backend separately.
At some point, we will remove the old x86 backend entirely, once we are
satisfied that the new backend has not caused any unforeseen issues and
we do not need to revert.
This commit adds a `compile` command to the Wasmtime CLI.
The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules.
With the `all-arch` feature enabled, AOT compilation can be performed for
non-native architectures (i.e. cross-compilation).
The `Module::compile` method has been added to perform AOT compilation.
A few of the CLI flags relating to "on by default" Wasm features have been
changed to be "--disable-XYZ" flags.
A simple example of using the `wasmtime compile` command:
```text
$ wasmtime compile input.wasm
$ wasmtime input.cwasm
```
Previously each module in a module-linking-using-module would compile
all the trampolines for all signatures for all modules. In forest-like
situations with lots of modules this would cause quite a few trampolines
to get compiled. The original intention was to have one global list of
trampolines for all modules in the module-linking graph that they could
all share. With the current design of module linking, however, the
intention is for modules to be relatively isolated from one another
which would make achieving this difficult.
In lieu of total sharing (which would be good for the global scope
anyway but we also don't do that right now) this commit implements an
alternative strategy where each module simply compiles its own
trampolines that it itself can reach. This should mean that
module-linking modules behave more similarly to standalone modules in
terms of trampoline duplication. If we ever do global trampoline
deduplication we can likely batch this all together into one, but for
now this should fix the performance issues seen in fuzzing.
Closes#2525
* Use stable Rust on CI to test the x64 backend
This commit leverages the newly-released 1.51.0 compiler to test the
new backend on Windows and Linux with a stable compiler instead of a
nightly compiler. This isolates the nightly build to just the nightly
documentation generation and fuzzing, both of which rely on nightly for
the best results right now.
* Use updated stable in book build job
* Run rustfmt for new stable
* Silence new warnings for wasi-nn
* Allow some dead code in the x64 backend
Looks like new rustc is better about emitting some dead-code warnings
* Update rust in peepmatic job
* Fix a test in the pooling allocator
* Remove `package.metdata.docs.rs` temporarily
Needs resolution of https://github.com/rust-lang/cargo/pull/9300 first
* Fix a warning in a wasi-nn example
This bumps target-lexicon and adds support for the AppleAarch64 calling
convention. Specifically for WebAssembly support, we only have to worry
about the new stack slots convention. Stack slots don't need to be at
least 8-bytes, they can be as small as the data type's size. For
instance, if we need stack slots for (i32, i32), they can be located at
offsets (+0, +4). Note that they still need to be properly aligned on
the data type they're containing, though, so if we need stack slots for
(i32, i64), we can't start the i64 slot at the +4 offset (it must start
at the +8 offset).
Added one test that was failing on the Mac M1, as well as other tests
stressing different yet similar situations.
* More use of `anyhow`.
* Change `make_accessible` into `protect_linear_memory` to better demonstrate
what it is used for; this will make the uffd implementation make a little
more sense.
* Remove `create_memory_map` in favor of just creating the `Mmap` instances in
the pooling allocator. This also removes the need for `MAP_NORESERVE` in the
uffd implementation.
* Moar comments.
* Remove `BasePointerIterator` in favor of `impl Iterator`.
* The uffd implementation now only monitors linear memory pages and will only
receive faults on pages that could potentially be accessible and never on a
statically known guard page.
* Stop allocating memory or table pools if the maximum limit of the memory or
table is 0.
* Add `anyhow` dependency to `wasmtime-runtime`.
* Revert `get_data` back to `fn`.
* Remove `DataInitializer` and box the data in `Module` translation instead.
* Improve comments on `MemoryInitialization`.
* Remove `MemoryInitialization::OutOfBounds` in favor of proper bulk memory
semantics.
* Use segmented memory initialization except for when the uffd feature is
enabled on Linux.
* Validate modules with the allocator after translation.
* Updated various functions in the runtime to return `anyhow::Result`.
* Use a slice when copying pages instead of `ptr::copy_nonoverlapping`.
* Remove unnecessary casts in `OnDemandAllocator::deallocate`.
* Better document the `uffd` feature.
* Use WebAssembly page-sized pages in the paged initialization.
* Remove the stack pool from the uffd handler and simply protect just the guard
pages.
This commit implements copying paged initialization data upon a fault of a
linear memory page.
If the initialization data is "paged", then the appropriate pages are copied
into the Wasm page (or zeroed if the page is not present in the
initialization data).
If the initialization data is not "paged", the Wasm page is zeroed so that
module instantiation can initialize the pages.
This commit introduces two new methods on `InstanceAllocator`:
* `validate_module` - this method is used to validate a module after
translation but before compilation. It will be used for the upcoming pooling
allocator to ensure a module being compiled adheres to the limits of the
allocator.
* `adjust_tunables` - this method is used to adjust the `Tunables` given the
JIT compiler. The pooling allocator will use this to force all memories to
be static during compilation.
This commit refactors module instantiation in the runtime to allow for
different instance allocation strategy implementations.
It adds an `InstanceAllocator` trait with the current implementation put behind
the `OnDemandInstanceAllocator` struct.
The Wasmtime API has been updated to allow a `Config` to have an instance
allocation strategy set which will determine how instances get allocated.
This change is in preparation for an alternative *pooling* instance allocator
that can reserve all needed host process address space in advance.
This commit also makes changes to the `wasmtime_environ` crate to represent
compiled modules in a way that reduces copying at instantiation time.
* Update wasm-tools crates
* Update Wasm SIMD spec tests
* Invert 'experimental_x64_should_panic' logic
By doing this, it is easier to see which spec tests currently panic. The new tests correspond to recently-added instructions.
* Fix: ignore new spec tests for all backends
With `Module::{serialize,deserialize}` it should be possible to share
wasmtime modules across machines or CPUs. Serialization, however, embeds
a hash of all configuration values, including cranelift compilation
settings. By default wasmtime's selection of the native ISA would enable
ISA flags according to CPU features available on the host, but the same
CPU features may not be available across two machines.
This commit adds a `Config::cranelift_clear_cpu_flags` method which
allows clearing the target-specific ISA flags that are automatically
inferred by default for the native CPU. Options can then be
incrementally built back up as-desired with teh `cranelift_other_flag`
method.
This commit goes through the dependencies that wasmtime has and updates
versions where possible. This notably brings in a wasmparser/wast update
which has some simd spec changes with new instructions. Otherwise most
of these are just routine updates.
This commit fully implements outer aliases of the module linking
proposal. Outer aliases can now handle multiple-level-up aliases and now
properly also handle closed-over-values of modules that are either
imported or defined.
The structure of `wasmtime::Module` was altered as part of this commit.
It is now a compiled module plus two lists of "upvars", or closed over
values used when instantiating the module. One list of upvars is
compiled artifacts which are submodules that could be used. Another is
module values that are injected via outer aliases. Serialization and
such have been updated as appropriate to handle this.
This commit updates the various tooling used by wasmtime which has new
updates to the module linking proposal. This is done primarily to sync
with WebAssembly/module-linking#26. The main change implemented here is
that wasmtime now supports creating instances from a set of values, nott
just from instantiating a module. Additionally subtyping handling of
modules with respect to imports is now properly handled by desugaring
two-level imports to imports of instances.
A number of small refactorings are included here as well, but most of
them are in accordance with the changes to `wasmparser` and the updated
binary format for module linking.
This commit updates all the wasm-tools crates that we use and enables
fuzzing of the module linking proposal in our various fuzz targets. This
also refactors some of the dummy value generation logic to not be
fallible and to always succeed, the thinking being that we don't want to
accidentally hide errors while fuzzing. Additionally instantiation is
only allowed to fail with a `Trap`, other failure reasons are unwrapped.
* Implement imported/exported modules/instances
This commit implements the final piece of the module linking proposal
which is to flesh out the support for importing/exporting instances and
modules. This ended up having a few changes:
* Two more `PrimaryMap` instances are now stored in an `Instance`. The value
for instances is `InstanceHandle` (pretty easy) and for modules it's
`Box<dyn Any>` (less easy).
* The custom host state for `InstanceHandle` for `wasmtime` is now
`Arc<TypeTables` to be able to fully reconstruct an instance's types
just from its instance.
* Type matching for imports now has been updated to take
instances/modules into account.
One of the main downsides of this implementation is that type matching
of imports is duplicated between wasmparser and wasmtime, leading to
posssible bugs especially in the subtelties of module linking. I'm not
sure how best to unify these two pieces of validation, however, and it
may be more trouble than it's worth.
cc #2094
* Update wat/wast/wasmparser
* Review comments
* Fix a bug in publish script to vendor the right witx
Currently there's two witx binaries in our repository given the two wasi
spec submodules, so this updates the publication script to vendor the
right one.