This commit is an effort to reduce the amount of complexity around
managing the size/alignment calculations of types in the canonical ABI.
Previously the logic for the size/alignment of a type was spread out
across a number of locations. While each individual calculation is not
really the most complicated thing in the world having the duplication in
so many places was constantly worrying me.
I've opted in this commit to centralize all of this within the runtime
at least, and now there's only one "duplicate" of this information in
the fuzzing infrastructure which is to some degree less important to
deduplicate. This commit introduces a new `CanonicalAbiInfo` type to
house all abi size/align information for both memory32 and memory64.
This new type is then used pervasively throughout fused adapter
compilation, dynamic `Val` management, and typed functions. This type
was also able to reduce the complexity of the macro-generated code
meaning that even `wasmtime-component-macro` is performing less math
than it was before.
One other major feature of this commit is that this ABI information is
now saved within a `ComponentTypes` structure. This avoids recursive
querying of size/align information frequently and instead effectively
caching it. This was a worry I had for the fused adapter compiler which
frequently sought out size/align information and would recursively
descend each type tree each time. The `fact-valid-module` fuzzer is now
nearly 10x faster in terms of iterations/s which I suspect is due to
this caching.
* Improve the `component_api` fuzzer on a few dimensions
* Update the generated component to use an adapter module. This involves
two core wasm instances communicating with each other to test that
data flows through everything correctly. The intention here is to fuzz
the fused adapter compiler. String encoding options have been plumbed
here to exercise differences in string encodings.
* Use `Cow<'static, ...>` and `static` declarations for each static test
case to try to cut down on rustc codegen time.
* Add `Copy` to derivation of fuzzed enums to make `derive(Clone)`
smaller.
* Use `Store<Box<dyn Any>>` to try to cut down on codegen by
monomorphizing fewer `Store<T>` implementation.
* Add debug logging to print out what's flowing in and what's flowing
out for debugging failures.
* Improve `Debug` representation of dynamic value types to more closely
match their Rust counterparts.
* Fix a variant issue with adapter trampolines
Previously the offset of the payload was calculated as the discriminant
aligned up to the alignment of a singular case, but instead this needs
to be aligned up to the alignment of all cases to ensure all cases start
at the same location.
* Fix a copy/paste error when copying masked integers
A 32-bit load was actually doing a 16-bit load by accident since it was
copied from the 16-bit load-and-mask case.
* Fix f32/i64 conversions in adapter modules
The adapter previously erroneously converted the f32 to f64 and then to
i64, where instead it should go from f32 to i32 to i64.
* Fix zero-sized flags in adapter modules
This commit corrects the size calculation for zero-sized flags in
adapter modules.
cc #4592
* Fix a variant size calculation bug in adapters
This fixes the same issue found with variants during normal host-side
fuzzing earlier where the size of a variant needs to align up the
summation of the discriminant and the maximum case size.
* Implement memory growth in libc bump realloc
Some fuzz-generated test cases are copying lists large enough to exceed
one page of memory so bake in a `memory.grow` to the bump allocator as
well.
* Avoid adapters of exponential size
This commit is an attempt to avoid adapters being exponentially sized
with respect to the type hierarchy of the input. Previously all
adaptation was done inline within each adapter which meant that if
something was structured as `tuple<T, T, T, T, ...>` the translation of
`T` would be inlined N times. For very deeply nested types this can
quickly create an exponentially sized adapter with types of the form:
(type $t0 (list u8))
(type $t1 (tuple $t0 $t0))
(type $t2 (tuple $t1 $t1))
(type $t3 (tuple $t2 $t2))
;; ...
where the translation of `t4` has 8 different copies of translating
`t0`.
This commit changes the translation of types through memory to almost
always go through a helper function. The hope here is that it doesn't
lose too much performance because types already reside in memory.
This can still lead to exponentially sized adapter modules to a lesser
degree where if the translation all happens on the "stack", e.g. via
`variant`s and their flat representation then many copies of one
translation could still be made. For now this commit at least gets the
problem under control for fuzzing where fuzzing doesn't trivially find
type hierarchies that take over a minute to codegen the adapter module.
One of the main tricky parts of this implementation is that when a
function is generated the index that it will be placed at in the final
module is not known at that time. To solve this the encoded form of the
`Call` instruction is saved in a relocation-style format where the
`Call` isn't encoded but instead saved into a different area for
encoding later. When the entire adapter module is encoded to wasm these
pseudo-`Call` instructions are encoded as real instructions at that
time.
* Fix some memory64 issues with string encodings
Introduced just before #4623 I had a few mistakes related to 64-bit
memories and mixing 32/64-bit memories.
* Actually insert into the `translate_mem_funcs` map
This... was the whole point of having the map!
* Assert memory growth succeeds in bump allocator
* Implement strings in adapter modules
This commit is a hefty addition to Wasmtime's support for the component
model. This implements the final remaining type (in the current type
hierarchy) unimplemented in adapter module trampolines: strings. Strings
are the most complicated type to implement in adapter trampolines
because they are highly structured chunks of data in memory (according
to specific encodings). Additionally each lift/lower operation can
choose its own encoding for strings meaning that Wasmtime, the host, may
have to convert between any pairwise ordering of string encodings.
The `CanonicalABI.md` in the component-model repo in general specifies
all the fiddly bits of string encoding so there's not a ton of wiggle
room for Wasmtime to get creative. This PR largely "just" implements
that. The high-level architecture of this implementation is:
* Fused adapters are first identified to determine src/dst string
encodings. This statically fixes what transcoding operation is being
performed.
* The generated adapter will be responsible for managing calls to
`realloc` and performing bounds checks. The adapter itself does not
perform memory copies or validation of string contents, however.
Instead each transcoding operation is modeled as an imported function
into the adapter module. This means that the adapter module
dynamically, during compile time, determines what string transcoders
are needed. Note that an imported transcoder is not only parameterized
over the transcoding operation but additionally which memory is the
source and which is the destination.
* The imported core wasm functions are modeled as a new
`CoreDef::Transcoder` structure. These transcoders end up being small
Cranelift-compiled trampolines. The Cranelift-compiled trampoline will
load the actual base pointer of memory and add it to the relative
pointers passed as function arguments. This trampoline then calls a
transcoder "libcall" which enters Rust-defined functions for actual
transcoding operations.
* Each possible transcoding operation is implemented in Rust with a
unique name and a unique signature depending on the needs of the
transcoder. I've tried to document inline what each transcoder does.
This means that the `Module::translate_string` in adapter modules is by
far the largest translation method. The main reason for this is due to
the management around calling the imported transcoder functions in the
face of validating string pointer/lengths and performing the dance of
`realloc`-vs-transcode at the right time. I've tried to ensure that each
individual case in transcoding is documented well enough to understand
what's going on as well.
Additionally in this PR is a full implementation in the host for the
`latin1+utf16` encoding which means that both lifting and lowering host
strings now works with this encoding.
Currently the implementation of each transcoder function is likely far
from optimal. Where possible I've leaned on the standard library itself
and for latin1-related things I'm leaning on the `encoding_rs` crate. I
initially tried to implement everything with `encoding_rs` but was
unable to uniformly do so easily. For now I settled on trying to get a
known-correct (even in the face of endianness) implementation for all of
these transcoders. If an when performance becomes an issue it should be
possible to implement more optimized versions of each of these
transcoding operations.
Testing this commit has been somewhat difficult and my general plan,
like with the `(list T)` type, is to rely heavily on fuzzing to cover
the various cases here. In this PR though I've added a simple test that
pushes some statically known strings through all the pairs of encodings
between source and destination. I've attempted to pick "interesting"
strings that one way or another stress the various paths in each
transcoding operation to ideally get full branch coverage there.
Additionally a suite of "negative" tests have also been added to ensure
that validity of encoding is actually checked.
* Fix a temporarily commented out case
* Fix wasmtime-runtime tests
* Update deny.toml configuration
* Add `BSD-3-Clause` for the `encoding_rs` crate
* Remove some unused licenses
* Add an exemption for `encoding_rs` for now
* Split up the `translate_string` method
Move out all the closures and package up captured state into smaller
lists of arguments.
* Test out-of-bounds for zero-length strings
This addresses #4307.
For the static API we generate 100 arbitrary test cases at build time, each of
which includes 0-5 parameter types, a result type, and a WAT fragment containing
an imported function and an exported function. The exported function calls the
imported function, which is implemented by the host. At runtime, the fuzz test
selects a test case at random and feeds it zero or more sets of arbitrary
parameters and results, checking that values which flow host-to-guest and
guest-to-host make the transition unchanged.
The fuzz test for the dynamic API follows a similar pattern, the only difference
being that test cases are generated at runtime.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* Add cmake compatibility to c-api
* Add CMake documentation to wasmtime.h
* Add CMake instructions in examples
* Modify CI for CMake support
* Use correct rust in CI
* Trigger build
* Refactor run-examples
* Reintroduce example_to_run in run-examples
* Replace run-examples crate with cmake
* Fix markdown formatting in examples readme
* Fix cmake test quotes
* Build rust wasm before cmake tests
* Pass CTEST_OUTPUT_ON_FAILURE
* Another cmake test
* Handle os differences in cmake test
* Fix bugs in memory and multimemory examples
* implement wasmtime::component::flags! per #4308
This is the last macro needed to complete #4308. It supports generating a Rust
type that represents a `flags` component type, analogous to how the [bitflags
crate](https://crates.io/crates/bitflags) operates.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* wrap `format_flags` output in parens
This ensures we generate non-empty output even when no flags are set. Empty
output for a `Debug` implementation would be confusing.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* unconditionally derive `Lift` and `Lower` in wasmtime::component::flags!
Per feedback on #4414, we now derive impls for those traits unconditionally,
which simplifies the syntax of the macro.
Also, I happened to notice an alignment bug in `LowerExpander::expand_variant`,
so I fixed that and cleaned up some related code.
Finally, I used @jameysharp's trick to calculate bit masks without looping.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* fix shift overflow regression in previous commit
Jamey pointed out my mistake: I didn't consider the case when the flag count was
evenly divisible by the representation size. This fixes the problem and adds
test cases to cover it.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* support enums with more than 256 variants in derive macro
This addresses #4361. Technically, we now support up to 2^32 variants, which is
the maximum for the canonical ABI. In practice, though, the derived code for
enums with even just 2^16 variants takes a prohibitively long time to compile.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* simplify `LowerExpander::expand_variant` code
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* Upgrade all crates to the Rust 2021 edition
I've personally started using the new format strings for things like
`panic!("some message {foo}")` or similar and have been upgrading crates
on a case-by-case basis, but I think it probably makes more sense to go
ahead and blanket upgrade everything so 2021 features are always
available.
* Fix compile of the C API
* Fix a warning
* Fix another warning
This PR introduces a new way of performing cooperative timeslicing that
is intended to replace the "fuel" mechanism. The tradeoff is that this
mechanism interrupts with less precision: not at deterministic points
where fuel runs out, but rather when the Engine enters a new epoch. The
generated code instrumentation is substantially faster, however, because
it does not need to do as much work as when tracking fuel; it only loads
the global "epoch counter" and does a compare-and-branch at backedges
and function prologues.
This change has been measured as ~twice as fast as fuel-based
timeslicing for some workloads, especially control-flow-intensive
workloads such as the SpiderMonkey JS interpreter on Wasm/WASI.
The intended interface is that the embedder of the `Engine` performs an
`engine.increment_epoch()` call periodically, e.g. once per millisecond.
An async invocation of a Wasm guest on a `Store` can specify a number of
epoch-ticks that are allowed before an async yield back to the
executor's event loop. (The initial amount and automatic "refills" are
configured on the `Store`, just as for fuel.) This call does only
signal-safe work (it increments an `AtomicU64`) so could be invoked from
a periodic signal, or from a thread that wakes up once per period.
I had no idea this was still in the repository, much less building!
There are much different ways to use wasmtime in Rust nowadays, such as
the `wasmtime` crate!
* wasmtime-wasi: re-exporting this WasiCtxBuilder was shadowing the right one
wasi-common's WasiCtxBuilder is really only useful wasi_cap_std_sync and
wasi_tokio to implement their own Builder on top of.
This re-export of wasi-common's is 1. not useful and 2. shadow's the
re-export of the right one in sync::*.
* wasi-common: eliminate WasiCtxBuilder, make the builder methods on WasiCtx instead
* delete wasi-common::WasiCtxBuilder altogether
just put those methods directly on &mut WasiCtx.
As a bonus, the sync and tokio WasiCtxBuilder::build functions
are no longer fallible!
* bench fixes
* more test fixes
* Add support for the experimental wasi-crypto APIs
The sole purpose of the implementation is to allow bindings and
application developers to test the proposed APIs.
Rust and AssemblyScript bindings are also available as examples.
Like `wasi-nn`, it is currently disabled by default, and requires
the `wasi-crypto` feature flag to be compiled in.
* Rename the wasi-crypto/spec submodule
* Add a path dependency into the submodule for wasi-crypto
* Tell the publish script to vendor wasi-crypto
We were accidentally always running the `fib-debug/main` example because of
shenanigans with alphabetical ordering and hard coding "main.exe" as the command
we run. Now we properly detect which example we built and run the appropriate
executable.
* Moves CodeMemory, VMInterrupts and SignatureRegistry from Compiler
* CompiledModule holds CodeMemory and GdbJitImageRegistration
* Store keeps track of its JIT code
* Makes "jit_int.rs" stuff Send+Sync
* Adds the threads example.
The `wasmtime` crate currently lives in `crates/api` for historical
reasons, because we once called it `wasmtime-api` crate. This creates a
stumbling block for new contributors.
As discussed on Zulip, rename the directory to `crates/wasmtime`.
This commit removes the .NET implementation from Wasmtime.
It now exists at https://github.com/bytecodealliance/wasmtime-dotnet.
Also updates the Wasmtime book to include information about using Wasmtime from
.NET.
* Add Wasmtime-specific C API functions to return errors
This commit adds new `wasmtime_*` symbols to the C API, many of which
mirror the existing counterparts in the `wasm.h` header. These APIs are
enhanced in a number of respects:
* Detailed error information is now available through a
`wasmtime_error_t`. Currently this only exposes one function which is
to extract a string version of the error.
* There is a distinction now between traps and errors during
instantiation and function calling. Traps only happen if wasm traps,
and errors can happen for things like runtime type errors when
interacting with the API.
* APIs have improved safety with respect to embedders where the lengths
of arrays are now taken as explicit parameters rather than assumed
from other parameters.
* Handle trap updates
* Update C examples
* Fix memory.c compile on MSVC
* Update test assertions
* Refactor C slightly
* Bare-bones .NET update
* Remove bogus nul handling
* Wasmtime 0.15.0 and Cranelift 0.62.0. (#1398)
* Bump more ad-hoc versions.
* Add build.rs to wasi-common's Cargo.toml.
* Update the env var name in more places.
* Remove a redundant echo.
* Remove the wasmtime Python extension from this repo
This commit removes the `crates/misc/py` folder and all associated
doo-dads like CI. This module has been rewritten to use the C API
natively and now lives at
https://github.com/bytecodealliance/wasmtime-py as discussed on #1390
* Bump Wasmtime to 0.14.0.
* Update the publish script for the wiggle crate wiggle.
* More fixes.
* Fix lightbeam depenency version.
* cargo update
* Cargo update wasi-tests too.
And add cargo update to the version-bump scripts.
This commit reimplements the C# API in terms of a Wasmtime linker.
It removes the custom binding implementation that was based on reflection in
favor of the linker's implementation.
This should make the C# API a little closer to the Rust API.
The `Engine` and `Store` types have been hidden behind the `Host` type which is
responsible for hosting WebAssembly module instances.
Documentation and tests have been updated.
This commit adds support for snapshot0 in the WASI C API.
A name parameter was added to `wasi_instance_new` to accept which WASI module
is being instantiated.
Additionally, the C# API now supports constructing a WASI instance based on the
WASI module name.
Fixes#1221.
* Fix instructions for building rust modules in python examples
When I ran `rustc +nightly ...` the compiler just looked for a source file
called `+nightly`. I changed these instructions to use rustup + rustc instead.
* Initial documentation for python users
Added documentation for using the Wasmtime loader in python, and explained the
first two examples in the repo. Changed the import example to demonstrate
working with module linear memory.
* Fix include in python guide
* Wording
* Clarify memory usage
* Flow through the example better
* More word choice
* Make rustup a prereq
* Fix source code paths in python guide
* Fix rustup example in python guide
Co-Authored-By: Samrat Man Singh <samratmansingh@gmail.com>
* Replace command examples with preformat blocks
* Revert "Fix instructions for building rust modules in python examples"
This reverts commit 1738888a2df4e15aba1e26c8ef42058e7a2053bb.
* Left a block quote in a preformat example
Co-authored-by: Samrat Man Singh <samratmansingh@gmail.com>
* Add a .gitattributes file specifying LF-style line endings.
This is similar to [Rust's .gitattributes file] though simplified.
Most of our source and documentation files already used LF-style line
endings, including *.cs files, so this makes things more consistent.
[Rust's .gitattributes file]: https://github.com/rust-lang/rust/blob/master/.gitattributes
* Remove UTF-8 BOMs in *.cs files.
Most of our *.cs files don't have UTF-8 BOMs, so this makes things more
consistent.