* Delete historical interruptable support in Wasmtime
This commit removes the `Config::interruptable` configuration along with
the `InterruptHandle` type from the `wasmtime` crate. The original
support for adding interruption to WebAssembly was added pretty early on
in the history of Wasmtime when there was no other method to prevent an
infinite loop from the host. Nowadays, however, there are alternative
methods for interruption such as fuel or epoch-based interruption.
One of the major downsides of `Config::interruptable` is that even when
it's not enabled it forces an atomic swap to happen when entering
WebAssembly code. This technically could be a non-atomic swap if the
configuration option isn't enabled but that produces even more branch-y
code on entry into WebAssembly which is already something we try to
optimize. Calling into WebAssembly is on the order of a dozens of
nanoseconds at this time and an atomic swap, even uncontended, can add
up to 5ns on some platforms.
The main goal of this PR is to remove this atomic swap on entry into
WebAssembly. This is done by removing the `Config::interruptable` field
entirely, moving all existing consumers to epochs instead which are
suitable for the same purposes. This means that the stack overflow check
is no longer entangled with the interruption check and perhaps one day
we could continue to optimize that further as well.
Some consequences of this change are:
* Epochs are now the only method of remote-thread interruption.
* There are no more Wasmtime traps that produces the `Interrupted` trap
code, although we may wish to move future traps to this so I left it
in place.
* The C API support for interrupt handles was also removed and bindings
for epoch methods were added.
* Function-entry checks for interruption are a tiny bit less efficient
since one check is performed for the stack limit and a second is
performed for the epoch as opposed to the `Config::interruptable`
style of bundling the stack limit and the interrupt check in one. It's
expected though that this is likely to not really be measurable.
* The old `VMInterrupts` structure is renamed to `VMRuntimeLimits`.
When using the pooling instance allocator for running spec tests we have
to make sure that the configured limits of the allocator aren't so low
as to cause the spec tests to fail due to resource exhaustion issues or
similar. This commit adds in minimum thresholds for instance size as
well as instance count. While here this goes ahead and refactors
everything here to look similar.
This commit removes the currently existing laziness-via-`OnceCell` when
a `Module` is created for creating a `ModuleMemoryImages` data
structure. Processing of data is now already shifted to compile time for
the wasm module which means that creating a `ModuleMemoryImages` is
either cheap because the module is backed by a file on disk, it's a
single `write` into the kernel to a memfd, or it's cheap as it's not
supported. This should help make module instantiation time more
deterministic, even for the first instantiation of a module.
Currently, the use of the downcast method means that you have to use one
of the hard-coded types. But Enarx needs to define its own `WasiFile`
implementations. This works fine, except the resulting files cannot be
used in poll because they aren't part of the hard-coded list.
Replace this with an accessor method for the pollable type in
`WasiFile`. Because we provide a default implementation of the method
and manually implement it on all the hard-coded types, this is backwards
compatible.
Signed-off-by: Nathaniel McCallum <nathaniel@profian.com>
1. This makes it easier for implementors to deal with internal APIs.
2. This matches the signatures of the WASI Snapshot traits.
Although it is likely true that these methods would have to become
immutable in order to implement threading efficiently, threading will
impact a large number of existing traits. So this change is practical
for now with an already-unavoidable change required for threading.
Signed-off-by: Nathaniel McCallum <nathaniel@profian.com>
Previously (as in an hour ago) #3905 landed a new ability for fuzzing to
arbitrarily insert padding between functions. Running some fuzzers
locally though this instantly hit a lot of problems on AArch64 because
the arbitrary padding isn't aligned to 4 bytes like all other functions
are. To fix this issue appending functions now correctly aligns the
output as appropriate for the platform. The alignment argument for
appending was switched to `None` where `None` means "use the platform
default" and otherwise and explicit alignment can be specified for
inserting other data (like arbitrary padding or Windows unwind tables).
In #3849, I moved uextend over to ISLE in the x64 backend. Unfortunately, the lowering patterns had a bug in the i32-to-i64 special case (when we know the generating instruction zeroes the upper 32 bits): it wasn't actually special casing for an i32 source! This meant that e.g. zero extends of the results of i8 adds did not work properly.
This PR fixes the bug and updates the runtest for extends significantly to cover the narrow-value cases.
No security impact to Wasm as Wasm does not use narrow integer types.
Thanks @bjorn3 for reporting!
* fuzz: Fuzz padding between compiled functions
This commit hooks up the custom
`wasmtime_linkopt_padding_between_functions` configuration option to the
cranelift compiler into the fuzz configuration, enabling us to ensure
that randomly inserting a moderate amount of padding between functions
shouldn't tamper with any results.
* fuzz: Fuzz the `Config::generate_address_map` option
This commit adds fuzz configuration where `generate_address_map` is
either enabled or disabled, unlike how it's always enabled for fuzzing
today.
* Remove unnecessary handling of relocations
This commit removes a number of bits and pieces all related to handling
relocations in JIT code generated by Wasmtime. None of this is necessary
nowadays that the "old backend" has been removed (quite some time ago)
and relocations are no longer expected to be in the JIT code at all.
Additionally with the minimum x86_64 features required to run wasm code
it should be expected that no libcalls are required either for
Wasmtime-based JIT code.
A recent fuzz failure showed that at least `call.wast` requires a memory
larger than 10 pages, so increase the minimum number of pages that can
be used for executing spec tests.
A fuzz bug was hit last night where the root of the fuzz bug appears to
be exhaustion of the virtual address space. The specific case in
question instantiated a module with ~100 memories ~100 times, and each
memory reserved ~8gb of the virtual address space. This takes around 47
bits of addressable memory which is mighty close to the limit of what
can be done on x86_64, so this commit reduces the number of memories
that an instance may have when coming out of `wasm-smith`.
* Implement runtime checks for compilation settings
This commit fills out a few FIXME annotations by implementing run-time
checks that when a `Module` is created it has compatible codegen
settings for the current host (as `Module` is proof of "this code can
run"). This is done by implementing new `Engine`-level methods which
validate compiler settings. These settings are validated on
`Module::new` as well as when loading serialized modules.
Settings are split into two categories, one for "shared" top-level
settings and one for ISA-specific settings. Both categories now have
allow-lists hardcoded into `Engine` which indicate the acceptable values
for each setting (if applicable). ISA-specific settings are checked with
the Rust standard library's `std::is_x86_feature_detected!` macro. Other
macros for other platforms are not stable at this time but can be added
here if necessary.
Closes#3897
* Fix fall-through logic to actually be correct
* Use a `OnceCell`, not an `AtomicBool`
* Fix some broken tests
In some use cases it is desirable to provide a custom snapshot WASI
context. Facilitate this by depending on a combination of traits
required rather than concrete type in the signature.
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
After adding the `call`-oriented benchmark recently I just noticed that
running benchmarks on CI is taking 30+ minutes which is not intended.
Instead of running a full benchmark run on CI (which I believe we're not
looking at anyway) instead only run the benchmarks for a single
iteration to ensure they still work but otherwise don't collect
statistics about them.
Additionally cap the number of parallel instantiations to 16 to avoid
running tons of tests for machines with lots of cpus.
This avoids a situation such as for the last point release where we had
different point releases all use the same branch name which forced us
make one release at a time. By putting the version number in the branch
name that should hopefully fix this.
If either end stack overflows we can't validate the other side since the
other side, depending on codegen settings, may have been successful, hit
a different trap, or also stack overflowed.
The goal of this new benchmark, `call`, is to help us measure the
overhead of both calling into WebAssembly from the host as well as
calling the host from WebAssembly. There's lots of various ways to
measure this so this benchmark is a bit large but should hopefully be
pretty thorough. It's expected that this benchmark will rarely be run in
its entirety but rather only a subset of the benchmarks will be run at
any one given time.
Some metrics measured here are:
* Typed vs Untyped vs Unchecked - testing the cost of both calling wasm
with these various methods as well as having wasm call the host where
the host function is defined with these various methods.
* With and without `call_hook` - helps to measure the overhead of the
`Store::call_hook` API.
* Synchronous and Asynchronous - measures the overhead of calling into
WebAssembly asynchronously (with and without the pooling allocator) in
addition to defining host APIs in various methods when wasm is called
asynchronously.
Currently all the numbers are as expected, notably:
* Host calling WebAssembly is ~25ns of overhead
* WebAssembly calling the host is ~3ns of overhead
* "Unchecked" is a bit slower than "typed", and "Untyped" is slower than
unchecked.
* Asynchronous wasm calling a synchronous host function has ~3ns of
overhead (nothing more than usual).
* Asynchronous calls are much slower, on the order of 2-3us due to
`madvise`.
Lots of other fiddly bits that can be measured here, but this will
hopefully help establish a benchmark through which we can measure in the
future in addition to measuring changes such as #3876
This commit fixes calling the call hooks configured in a store for host
functions defined with `Func::new_unchecked` or similar. I believe that
this was just an accidental oversight and there's no fundamental reason
to not support this.
Another instance similar to #3879 where when doing differential tests
the pooling allocator configuration needs to be updated to allow for a
possible table.
* Move spec interpreter fuzzing behind a Cargo feature
Building the spec interpreter requires a local installation of Ocaml and
now libgmp which isn't always available, so this enables the ability to
disable building the spec interpreter by using `cargo +nightly fuzz
build --no-default-features`. The spec interpreter is still built by
default but if fuzzers are being built locally and the spec interpreter
isn't needed then this should enable it to be relatively easily
opted-out of.
* Tweak manifest directives
This commit removes some `.unwrap()` annotations around casts between
integers to either be infallible or handle errors. This fixes a panic in
a fuzz test case that popped out for memory64-using modules. The actual
issue here is pretty benign, we were just too eager about assuming
things fit into 32-bit.
The recent removal of `ModuleLimits` meant that the update to the
fuzzers could quickly fail where the instance size limit was set to
something small (like 0) and then nothing would succeed in compilation.
This allows the modules to fail to compile and then continues to the
next fuzz input in these situations.
I frequently notice that the fuzz build of `cranelift-codegen` takes an
extremely long time and recently realized that one issue is that when
fuzzers are built we enable all of the backends in `cranelift-codegen`
but AFAIK only the native backend is actually fuzzed. I traced the
inclusion of `all-arch` back to #2323, specifically [this comment][1]
and it looks like now that the old backend is removed this should be
able to be removed as well.
[1]: https://github.com/bytecodealliance/wasmtime/pull/2323#discussion_r515228552
This seems to have intended to allow overrides but the specific Makefile
syntax used didn't actually allow overrides, so update that to allow env
vars from the outside world to override the variable (needed locally on
AArch64 I'm building on which has a different path to libgmp)
Spec tests require multi-value to be enabled and wasm-smith recently
made this a fuzz-input option, so override the fuzz input as we do for
other features and force-enable multi-value.
This commit updates the build script which clones the spec interpreter
for fuzzing to specifically pin at a hardcoded revision. This is
intended at improving reproducibility if we hit any issues while fuzzing
to ensure that the same wasmtime revision is always using the same spec
interpreter revision.