This commit builds on bytecodealliance/wasm-tools#690 to add support to
testing of the component model to execute functions when running
`*.wast` files. This support is all built on #4442 as functions are
invoked through a "dynamic" API. Right now the testing and integration
is fairly crude but I'm hoping that we can try to improve it over time
as necessary. For now this should provide a hopefully more convenient
syntax for unit tests and the like.
* Implement variant translation in fused adapters
This commit implements the most general case of variants for fused
adapter trampolines. Additionally a number of other primitive types are
filled out here to assist with testing variants. The implementation
internally was relatively straightforward given the shape of variants,
but there's room for future optimization as necessary especially around
converting locals to various types.
This commit also introduces a "one off" fuzzer for adapters to ensure
that the generated adapter is valid. I hope to extend this fuzz
generator as more types are implemented to assist in various corner
cases that might arise. For now the fuzzer simply tests that the output
wasm module is valid, not that it actually executes correctly. I hope to
integrate with a fuzzer along the lines of #4307 one day to test the
run-time-correctness of the generated adapters as well, at which point
this fuzzer would become obsolete.
Finally this commit also fixes an issue with `u8` translation where
upper bits weren't zero'd out and were passed raw across modules.
Instead smaller-than-32 types now all mask out their upper bits and do
sign-extension as appropriate for unsigned/signed variants.
* Fuzz memory64 in the new trampoline fuzzer
Currently memory64 isn't supported elsewhere in the component model
implementation of Wasmtime but the trampoline compiler seems as good a
place as any to ensure that it at least works in isolation. This plumbs
through fuzz input into a `memory64` boolean which gets fed into
compilation. Some miscellaneous bugs were fixed as a result to ensure
that memory64 trampolines all validate correctly.
* Tweak manifest for doc build
* Remove dependency on `more-asserts`
In my recent adventures to do a bit of gardening on our dependencies I
noticed that there's a new major version for the `more-asserts` crate.
Instead of updating to this though I've opted to instead remove the
dependency since I don't think we heavily lean on this crate and
otherwise one-off prints are probably sufficient to avoid the need for
pulling in a whole crate for this.
* Remove exemption for `more-asserts`
* Add initial support for fused adapter trampolines
This commit lands a significant new piece of functionality to Wasmtime's
implementation of the component model in the form of the implementation
of fused adapter trampolines. Internally within a component core wasm
modules can communicate with each other by having their exports
`canon lift`'d to get `canon lower`'d into a different component. This
signifies that two components are communicating through a statically
known interface via the canonical ABI at this time. Previously Wasmtime
was able to identify that this communication was happening but it simply
panicked with `unimplemented!` upon seeing it. This commit is the
beginning of filling out this panic location with an actual
implementation.
The implementation route chosen here for fused adapters is to use a
WebAssembly module itself for the implementation. This means that, at
compile time of a component, Wasmtime is generating core WebAssembly
modules which then get recursively compiled within Wasmtime as well. The
choice to use WebAssembly itself as the implementation of fused adapters
stems from a few motivations:
* This does not represent a significant increase in the "trusted
compiler base" of Wasmtime. Getting the Wasm -> CLIF translation
correct once is hard enough much less for an entirely different IR to
CLIF. By generating WebAssembly no new interactions with Cranelift are
added which drastically reduces the possibilities for mistakes.
* Using WebAssembly means that component adapters are insulated from
miscompilations and mistakes. If something goes wrong it's defined
well within the WebAssembly specification how it goes wrong and what
happens as a result. This means that the "blast zone" for a wrong
adapter is the component instance but not the entire host itself.
Accesses to linear memory are guaranteed to be in-bounds and otherwise
handled via well-defined traps.
* A fully-finished fused adapter compiler is expected to be a
significant and quite complex component of Wasmtime. Functionality
along these lines is expected to be needed for Web-based polyfills of
the component model and by using core WebAssembly it provides the
opportunity to share code between Wasmtime and these polyfills for the
component model.
* Finally the runtime implementation of managing WebAssembly modules is
already implemented and quite easy to integrate with, so representing
fused adapters with WebAssembly results in very little extra support
necessary for the runtime implementation of instantiating and managing
a component.
The compiler added in this commit is dubbed Wasmtime's Fused Adapter
Compiler of Trampolines (FACT) because who doesn't like deriving a name
from an acronym. Currently the trampoline compiler is limited in its
support for interface types and only supports a few primitives. I plan
on filing future PRs to flesh out the support here for all the variants
of `InterfaceType`. For now this PR is primarily focused on all of the
other infrastructure for the addition of a trampoline compiler.
With the choice to use core WebAssembly to implement fused adapters it
means that adapters need to be inserted into a module. Unfortunately
adapters cannot all go into a single WebAssembly module because adapters
themselves have dependencies which may be provided transitively through
instances that were instantiated with other adapters. This means that a
significant chunk of this PR (`adapt.rs`) is dedicated to determining
precisely which adapters go into precisely which adapter modules. This
partitioning process attempts to make large modules wherever it can to
cut down on core wasm instantiations but is likely not optimal as
it's just a simple heuristic today.
With all of this added together it's now possible to start writing
`*.wast` tests that internally have adapted modules communicating with
one another. A `fused.wast` test suite was added as part of this PR
which is the beginning of tests for the support of the fused adapter
compiler added in this PR. Currently this is primarily testing some
various topologies of adapters along with direct/indirect modes. This
will grow many more tests over time as more types are supported.
Overall I'm not 100% satisfied with the testing story of this PR. When a
test fails it's very difficult to debug since everything is written in
the text format of WebAssembly meaning there's no "conveniences" to
print out the state of the world when things go wrong and easily debug.
I think this will become even more apparent as more tests are written
for more types in subsequent PRs. At this time though I know of no
better alternative other than leaning pretty heavily on fuzz-testing to
ensure this is all exercised.
* Fix an unused field warning
* Fix tests in `wasmtime-runtime`
* Add some more tests for compiled trampolines
* Remap exports when injecting adapters
The exports of a component were accidentally left unmapped which meant
that they indexed the instance indexes pre-adapter module insertion.
* Fix typo
* Rebase conflicts
* Components: ignore type exports (for now).
This commit updates component translation to ignore type exports for now.
Components generated with `wit-component` contain type exports to give names to
types used within the component's functions based on the component's wit
definition.
The intention is to allow bindings to be generated with meaningful names
directly from a component. In the future, type exports (and imports) may be
used for more than this purpose to support things like resource types.
This commit effectively ignores type exports when translating the component as
they are not useful to executing a component at this time.
Closes#4415.
* Code review feedback.
This commit augments the current translation phase of components with
extra machinery to track the type information of component items such as
instances, components, and functions. The end goal of this commit is to
enable the `Lower` instruction to know the type of the component
function being lowered. Currently during the inlining pass where
component fusion is detected the type of the lifted function is known,
but to implement fusion entirely the type of the lowered function must
be known. Note that these two types are expected to be different to
allow for the subtyping rules specified by the component model.
For now nothing is actually done with this information other than noting
its presence in the face of a lifted-then-lowered function. My hope
though was to split this out for a separate review to avoid making a
future component-adapter-compiler-containing-PR too large.
* wasmtime: Rename host->wasm trampolines
As we introduce new types of trampolines, having clear names for our existing
trampolines will be helpful.
* Fix typo in docs for `VMCOMPONENT_MAGIC`
* wasmtime: Add criterion micro benchmarks for traps
* Bump versions of wasm-tools crates
Note that this leaves new features in the component model, outer type
aliases for core wasm types, unimplemented for now.
* Move to crates.io-based versions of tools
This commit adds support to Wasmtime for components which themselves
export instances. The support here adds new APIs for how instance
exports are accessed in the embedding API. For now this is mostly just a
first-pass where the API is somewhat confusing and has a lot of
lifetimes. I'm hoping that over time we can figure out how to simplify
this but for now it should at least be expressive enough for exploring
the exports of an instance.
* Implement `canon lower` of a `canon lift` function in the same component
This commit implements the "degenerate" logic for implementing a
function within a component that is lifted and then immediately lowered
again. In this situation the lowered function will immediately generate
a trap and doesn't need to implement anything else.
The implementation in this commit is somewhat heavyweight but I think is
probably justified moreso in future additions to the component model
rather than what exactly is here right now. It's not expected that this
"always trap" functionality will really be used all that often since it
would generally mean a buggy component, but the functionality plumbed
through here is hopefully going to be useful for implementing
component-to-component adapter trampolines.
Specifically this commit implements a strategy where the `canon.lower`'d
function is generated by Cranelift and simply has a single trap
instruction when called, doing nothing else. The main complexity comes
from juggling around all the data associated with these functions,
primarily plumbing through the traps into the `ModuleRegistry` to
ensure that the global `is_wasm_trap_pc` function returns `true` and at
runtime when we lookup information about the trap it's all readily
available (e.g. translating the trapping pc to a `TrapCode`).
* Fix non-component build
* Fix some offset calculations
* Only create one "always trap" per signature
Use an internal map to deduplicate during compilation.
Currently I don't know how we can reasonably implement this. Given all
the signatures of how we call functions and how functions are called on
the host there's no real feasible way that I know of to hook these two
up "seamlessly". This means that a component which reexports an imported
function can't be run in Wasmtime.
One of the main reasons for this is that when calling a component
function Wasmtime wants to lower arguments first and then have them
lifted when the host is called. With a reexport though there's not
actually anything to lower into so we'd sort of need something similar
to a table on the side or maybe a linear memory and that seems like it'd
get quite complicated quite quickly for not really all that much
benefit. As-such for now this simply returns a first-class error (rather
than the current panic) in situations like this.
* Implement lowered-then-lifted functions
This commit is a few features bundled into one, culminating in the
implementation of lowered-then-lifted functions for the component model.
It's probably not going to be used all that often but this is possible
within a valid component so Wasmtime needs to do something relatively
reasonable. The main things implemented in this commit are:
* Component instances are now assigned a `RuntimeComponentInstanceIndex`
to differentiate each one. This will be used in the future to detect
fusion (one instance lowering a function from another instance). For
now it's used to allocate separate `VMComponentFlags` for each
internal component instance.
* The `CoreExport<FuncIndex>` of lowered functions was changed to a
`CoreDef` since technically a lowered function can use another lowered
function as the callee. This ended up being not too difficult to plumb
through as everything else was already in place.
* A need arose to compile host-to-wasm trampolines which weren't already
present. Currently wasm in a component is always entered through a
host-to-wasm trampoline but core wasm modules are the source of all
the trampolines. In the case of a lowered-then-lifted function there
may not actually be any core wasm modules, so component objects now
contain necessary trampolines not otherwise provided by the core wasm
objects. This feature required splitting a new function into the
`Compiler` trait for creating a host-to-wasm trampoline. After doing
this core wasm compilation was also updated to leverage this which
further enabled compiling trampolines in parallel as opposed to the
previous synchronous compilation.
* Review comments
This commit implements the `post-return` feature of the canonical ABI in
the component model. This attribute is an optionally-specified function
which is to be executed after the return value has been processed by the
caller to optionally clean-up the return value. This enables, for
example, returning an allocated string and the host then knows how to
clean it up to prevent memory leaks in the original module.
The API exposed in this PR changes the prior `TypedFunc::call` API in
behavior but not in its signature. Previously the `TypedFunc::call`
method would set the `may_enter` flag on the way out, but now that
operation is deferred until a new `TypedFunc::post_return` method is
called. This means that once a method on an instance is invoked then
nothing else can be done on the instance until the `post_return` method
is called. Note that the method must be called irrespective of whether
the `post-return` canonical ABI option was specified or not. Internally
wasm will be invoked if necessary.
This is a pretty wonky and unergonomic API to work with. For now I
couldn't think of a better alternative that improved on the ergonomics.
In the theory that the raw Wasmtime bindings for a component may not be
used all that heavily (instead `wit-bindgen` would largely be used) I'm
hoping that this isn't too much of an issue in the future.
cc #4185
* Add support for nested components
This commit is an implementation of a number of features of the
component model including:
* Defining nested components
* Outer aliases to components and modules
* Instantiating nested components
The implementation here is intended to be a foundational pillar of
Wasmtime's component model support since recursion and nested components
are the bread-and-butter of the component model. At a high level the
intention for the component model implementation in Wasmtime has long
been that the recursive nature of components is "erased" at compile time
to something that's more optimized and efficient to process. This commit
ended up exemplifying this quite well where the vast majority of the
internal changes here are in the "compilation" phase of a component
rather than the runtime instantiation phase. The support in the
`wasmtime` crate, the runtime instantiation support, only had minor
updates here while the internals of translation have seen heavy updates.
The `translate` module was greatly refactored here in this commit.
Previously it would, as a component is parsed, create a final
`Component` to hand off to trampoline compilation and get persisted at
runtime. Instead now it's a thin layer over `wasmparser` which simply
records a list of `LocalInitializer` entries for how to instantiate the
component and its index spaces are built. This internal representation
of the instantiation of a component is pretty close to the binary format
intentionally.
Instead of performing dataflow legwork the `translate` phase of a
component is now responsible for two primary tasks:
1. All components and modules are discovered within a component. They're
assigned `Static{Component,Module}Index` depending on where they're
found and a `{Module,}Translation` is prepared for each one. This
"flattens" the recursive structure of the binary into an indexed list
processable later.
2. The lexical scope of components is managed here to implement outer
module and component aliases. This is a significant design
implementation because when closing over an outer component or module
that item may actually be imported or something like the result of a
previous instantiation. This means that the capture of
modules and components is both a lexical concern as well as a runtime
concern. The handling of the "runtime" bits are handled in the next
phase of compilation.
The next and currently final phase of compilation is a new pass where
much of the historical code in `translate.rs` has been moved to (but
heavily refactored). The goal of compilation is to produce one "flat"
list of initializers for a component (as happens prior to this PR) and
to achieve this an "inliner" phase runs which runs through the
instantiation process at compile time to produce a list of initializers.
This `inline` module is the main addition as part of this PR and is now
the workhorse for dataflow analysis and tracking what's actually
referring to what.
During the `inline` phase the local initializers recorded in the
`translate` phase are processed, in sequence, to instantiate a
component. Definitions of items are tracked to correspond to their root
definition which allows seeing across instantiation argument boundaries
and such. Handling "upvars" for component outer aliases is handled in
the `inline` phase as well by creating state for a component whenever a
component is defined as was recorded during the `translate` phase.
Finally this phase is chiefly responsible for doing all string-based
name resolution at compile time that it can. This means that at runtime
no string maps will need to be consulted for item exports and such.
The final result of inlining is a list of "global initializers" which is
a flat list processed during instantiation time. These are almost
identical to the initializers that were processed prior to this PR.
There are certainly still more gaps of the component model to implement
but this should be a major leg up in terms of functionality that
Wasmtime implements. This commit, however leaves behind a "hole" which
is not intended to be filled in at this time, namely importing and
exporting components at the "root" level from and to the host. This is
tracked and explained in more detail as part of #4283.
cc #4185 as this completes a number of items there
* Tweak code to work on stable without warning
* Review comments
This commit refactored `Config` to use a seperate `CompilerConfig` field instead
of operating on `CompilerBuilder` directly to make all its methods idempotent.
Fixes#4189
When adding shared memory, memories owned by the module were added to a
`owned_memories` array placed immediately after the `defined_memories`
array. When checking the size of each array with `region_sizes`, the
size of `defined_memories` and `owned_memories` were checked in this
order. But `region_sizes` is iterating through the fields in the reverse
order. This change reverses the field order to fix the associated fuzz
bug.
This commit updates the wasm-tools family of crates, notably pulling in
the refactorings and updates from bytecodealliance/wasm-tools#621 for
the latest iteration of the component model. This commit additionally
updates all support for the component model for these changes, notably:
* Many bits and pieces of type information was refactored. Many
`FooTypeIndex` namings are now `TypeFooIndex`. Additionally there is
now `TypeIndex` as well as `ComponentTypeIndex` for the two type index
spaces in a component.
* A number of new sections are now processed to handle the core and
component variants.
* Internal maps were split such as the `funcs` map into
`component_funcs` and `funcs` (same for `instances`).
* Canonical options are now processed individually instead of one bulk
`into` definition.
Overall this was not a major update to the internals of handling the
component model in Wasmtime. Instead this was mostly a surface-level
refactoring to make sure that everything lines up with the new binary
format for components.
* All text syntax used in tests was updated to the new syntax.
* Add shared memories
This change adds the ability to use shared memories in Wasmtime when the
[threads proposal] is enabled. Shared memories are annotated as `shared`
in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are
protected from concurrent access during `memory.size` and `memory.grow`.
[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md
In order to implement this in Wasmtime, there are two main cases to
cover:
- a program may simply create a shared memory and possibly export it;
this means that Wasmtime itself must be able to create shared
memories
- a user may create a shared memory externally and pass it in as an
import during instantiation; this is the case when the program
contains code like `(import "env" "memory" (memory 1 1
shared))`--this case is handled by a new Wasmtime API
type--`SharedMemory`
Because of the first case, this change allows any of the current
memory-creation mechanisms to work as-is. Wasmtime can still create
either static or dynamic memories in either on-demand or pooling modes,
and any of these memories can be considered shared. When shared, the
`Memory` runtime container will lock appropriately during `memory.size`
and `memory.grow` operations; since all memories use this container, it
is an ideal place for implementing the locking once and once only.
The second case is covered by the new `SharedMemory` structure. It uses
the same `Mmap` allocation under the hood as non-shared memories, but
allows the user to perform the allocation externally to Wasmtime and
share the memory across threads (via an `Arc`). The pointer address to
the actual memory is carefully wired through and owned by the
`SharedMemory` structure itself. This means that there are differing
views of where to access the pointer (i.e., `VMMemoryDefinition`): for
owned memories (the default), the `VMMemoryDefinition` is stored
directly by the `VMContext`; in the `SharedMemory` case, however, this
`VMContext` must point to this separate structure.
To ensure that the `VMContext` can always point to the correct
`VMMemoryDefinition`, this change alters the `VMContext` structure.
Since a `SharedMemory` owns its own `VMMemoryDefinition`, the
`defined_memories` table in the `VMContext` becomes a sequence of
pointers--in the shared memory case, they point to the
`VMMemoryDefinition` owned by the `SharedMemory` and in the owned memory
case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a
new table, `owned_memories`.
This change adds an additional indirection (through the `*mut
VMMemoryDefinition` pointer) that could add overhead. Using an imported
memory as a proxy, we measured a 1-3% overhead of this approach on the
`pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will
special-case the owned memory access (i.e., load a pointer directly to
the `owned_memories` entry) for `memory.size` so that only
shared memories (and imported memories, as before) incur the indirection
cost.
* review: remove thread feature check
* review: swap wasmtime-types dependency for existing wasmtime-environ use
* review: remove unused VMMemoryUnion
* review: reword cross-engine error message
* review: improve tests
* review: refactor to separate prevent Memory <-> SharedMemory conversion
* review: into_shared_memory -> as_shared_memory
* review: remove commented out code
* review: limit shared min/max to 32 bits
* review: skip imported memories
* review: imported memories are not owned
* review: remove TODO
* review: document unsafe send + sync
* review: add limiter assertion
* review: remove TODO
* review: improve tests
* review: fix doc test
* fix: fixes based on discussion with Alex
This changes several key parts:
- adds memory indexes to imports and exports
- makes `VMMemoryDefinition::current_length` an atomic usize
* review: add `Extern::SharedMemory`
* review: remove TODO
* review: atomically load from VMMemoryDescription in JIT-generated code
* review: add test probing the last available memory slot across threads
* fix: move assertion to new location due to rebase
* fix: doc link
* fix: add TODOs to c-api
* fix: broken doc link
* fix: modify pooling allocator messages in tests
* review: make owned_memory_index panic instead of returning an option
* review: clarify calculation of num_owned_memories
* review: move 'use' to top of file
* review: change '*const [u8]' to '*mut [u8]'
* review: remove TODO
* review: avoid hard-coding memory index
* review: remove 'preallocation' parameter from 'Memory::_new'
* fix: component model memory length
* review: check that shared memory plans are static
* review: ignore growth limits for shared memory
* review: improve atomic store comment
* review: add FIXME for memory growth failure
* review: add comment about absence of bounds-checked 'memory.size'
* review: make 'current_length()' doc comment more precise
* review: more comments related to memory.size non-determinism
* review: make 'vmmemory' unreachable for shared memory
* review: move code around
* review: thread plan through to 'wrap()'
* review: disallow shared memory allocation with the pooling allocator
* Add a `VMComponentContext` type and create it on instantiation
This commit fills out the `wasmtime-runtime` crate's support for
`VMComponentContext` and creates it as part of the instantiation
process. This moves a few maps that were temporarily allocated in an
`InstanceData` into the `VMComponentContext` and additionally reads the
canonical options data from there instead.
This type still won't be used in its "full glory" until the lowering of
host functions is completely implemented, however, which will be coming
in a future commit.
* Remove `DerefMut` implementation
* Rebase conflicts
* Add trampoline compilation support for lowered imports
This commit adds support to the component model implementation for
compiling trampolines suitable for calling host imports. Currently this
is purely just the compilation side of things, modifying the
wasmtime-cranelift crate and additionally filling out a new
`VMComponentOffsets` type (similar to `VMOffsets`). The actual creation
of a `VMComponentContext` is still not performed and will be a
subsequent PR.
Internally though some tests are actually possible with this where we at
least assert that compilation of a component and creation of everything
in-memory doesn't panic or trip any assertions, so some tests are added
here for that as well.
* Fix some test errors
* Implement module imports into components
As a step towards implementing function imports into a component this
commit implements importing modules into a component. This fills out
missing pieces of functionality such as exporting modules as well. The
previous translation code had initial support for translating imported
modules but some of the AST type information was restructured with
feedback from this implementation, namely splitting the
`InstantiateModule` initializer into separate upvar/import variants to
clarify that the item orderings for imports are resolved differently at
runtime.
Much of this commit is also adding infrastructure for any imports at all
into a component. For example a `Linker` type (analagous to
`wasmtime::Linker`) was added here as well. For now this type is quite
limited due to the inability to define host functions (it can only work
with instances and instances-of-modules) but it's enough to start
writing `*.wast` tests which exercise lots of module-related functionality.
* Fix a warning
* Fix double-counting imports in `VMOffsets` calculations
This fixes an oversight in the initial creation of `VMOffsets` for a
module to avoid double-counting imported globals, tables, and memories
for calculating the size of the `VMContext`. Prior to this PR imported
items are accidentally also counted as defined items for sizing
calculations meaning that when a memory is imported but not defined, for
example, the `VMContext` will have a space for an inline
`VMMemoryDefinition` when it doesn't need to.
Auditing where all this relates to it appears that the only issue from
this mistake is that `VMContext` is a bit larger than it would otherwise
need to be. Extra slots are uninitialized memory but nothing in Wasmtime
ever actually accesses the memory either, so it should be harmless to
have extra space here. Nevertheless it seems better to shrink the size
as much as possible to avoid wasting space where we can.
* Fix tests
This commit enhances the processing of components to track all the
dataflow for the processing of `canon.lower`'d functions. At the same
time this fills out a few other missing details to component processing
such as aliasing from some kinds of component instances and similar.
The major changes contained within this are the updates the `info`
submodule which has the AST of component type information. This has been
significantly refactored to prepare for representing lowered functions
and implementing those. The major change is from an `Instantiation` list
to an `Initializer` list which abstractly represents a few other
initialization actions.
This work is split off from my main work to implement component imports
of host functions. This is incomplete in the sense that it doesn't
actually finish everything necessary to define host functions and import
them into components. Instead this is only the changes necessary at the
translation layer (so far). Consequently this commit does not have tests
and also namely doesn't actually include the `VMComponentContext`
initialization and usage. The full body of work is still a bit too messy
to PR just yet so I'm hoping that this is a slimmed-down-enough piece to
adequately be reviewed.
* Split `wasm_to_host_trampoline` into pieces
In the upcoming component model supoprt for imports my plan is to reuse
some of these pieces but not the entirety of the current
`wasm_to_host_trampoline`. In an effort to make that diff smaller this
commit splits up the function preemptively into pieces to get reused
later.
* Delete unused `for_each_libcall` macros
Came across this when working in the object support for cranelift.
* Refactor some object creation details
This commit refactors some of the internals around creating an object
file in the wasmtime-cranelift integration. The old `ObjectBuilder` is
now named `ModuleTextBuilder` and is only used to create the text
section rather than other sections too. This helps maintain the
invariant that the unwind information section is placed directly after
the text section without having an odd API for doing this.
Additionally the unwind information creation is moved externally from
the `ModuleTextBuilder` to a standalone structure. This separate
structure is currently in use in the component model work I'm doing
although I may change that to using the `ModuleTextBuilder` instead. In
any case it seemed nice to encapsulate all of the unwinding information
into one standalone structure.
Finally, the insertion of native debug information has been refactored
to happen in a new `append_dwarf` method to keep all the dwarf-related
stuff together in one place as much as possible.
* Fix a doctest
* Fix a typo
* Change some `VMContext` pointers to `()` pointers
This commit is motivated by my work on the component model
implementation for imported functions. Currently all context pointers in
wasm are `*mut VMContext` but with the component model my plan is to
make some pointers instead along the lines of `*mut VMComponentContext`.
In doing this though one worry I have is breaking what has otherwise
been a core invariant of Wasmtime for quite some time, subtly
introducing bugs by accident.
To help assuage my worry I've opted here to erase knowledge of
`*mut VMContext` where possible. Instead where applicable a context
pointer is simply known as `*mut ()` and the embedder doesn't actually
know anything about this context beyond the value of the pointer. This
will help prevent Wasmtime from accidentally ever trying to interpret
this context pointer as an actual `VMContext` when it might instead be a
`VMComponentContext`.
Overall this was a pretty smooth transition. The main change here is
that the `VMTrampoline` (now sporting more docs) has its first argument
changed to `*mut ()`. The second argument, the caller context, is still
configured as `*mut VMContext` though because all functions are always
called from wasm still. Eventually for component-to-component calls I
think we'll probably "fake" the second argument as the same as the first
argument, losing track of the original caller, as an intentional way of
isolating components from each other.
Along the way there are a few host locations which do actually assume
that the first argument is indeed a `VMContext`. These are valid
assumptions that are upheld from a correct implementation, but I opted
to add a "magic" field to `VMContext` to assert this in debug mode. This
new "magic" field is inintialized during normal vmcontext initialization
and it's checked whenever a `VMContext` is reinterpreted as an
`Instance` (but only in debug mode). My hope here is to catch any future
accidental mistakes, if ever.
* Use a VMOpaqueContext wrapper
* Fix typos
* components: Implement the ability to call component exports
This commit is an implementation of the typed method of calling
component exports. This is intended to represent the most efficient way
of calling a component in Wasmtime, similar to what `TypedFunc`
represents today for core wasm.
Internally this contains all the traits and implementations necessary to
invoke component exports with any type signature (e.g. arbitrary
parameters and/or results). The expectation is that for results we'll
reuse all of this infrastructure except in reverse (arguments and
results will be swapped when defining imports).
Some features of this implementation are:
* Arbitrary type hierarchies are supported
* The Rust-standard `Option`, `Result`, `String`, `Vec<T>`, and tuple
types all map down to the corresponding type in the component model.
* Basic utf-16 string support is implemented as proof-of-concept to show
what handling might look like. This will need further testing and
benchmarking.
* Arguments can be behind "smart pointers", so for example
`&Rc<Arc<[u8]>>` corresponds to `list<u8>` in interface types.
* Bulk copies from linear memory never happen unless explicitly
instructed to do so.
The goal of this commit is to create the ability to actually invoke wasm
components. This represents what is expected to be the performance
threshold for these calls where it ideally should be optimal how
WebAssembly is invoked. One major missing piece of this is a `#[derive]`
of some sort to generate Rust types for arbitrary `*.wit` types such as
custom records, variants, flags, unions, etc. The current trait impls
for tuples and `Result<T, E>` are expected to have fleshed out most of
what such a derive would look like.
There are some downsides and missing pieces to this commit and method of
calling components, however, such as:
* Passing `&[u8]` to WebAssembly is currently not optimal. Ideally this
compiles down to a `memcpy`-equivalent somewhere but that currently
doesn't happen due to all the bounds checks of copying data into
memory. I have been unsuccessful so far at getting these bounds checks
to be removed.
* There is no finalization at this time (the "post return" functionality
in the canonical ABI). Implementing this should be relatively
straightforward but at this time requires `wasmparser` changes to
catch up with the current canonical ABI.
* There is no guarantee that results of a wasm function will be
validated. As results are consumed they are validated but this means
that if function returns an invalid string which the host doesn't look
at then no trap will be generated. This is probably not the intended
semantics of hosts in the component model.
* At this time there's no support for memory64 memories, just a bunch of
`FIXME`s to get around to. It's expected that this won't be too
onerous, however. Some extra care will need to ensure that the various
methods related to size/alignment all optimize to the same thing they
do today (e.g. constants).
* The return value of a typed component function is either `T` or
`Value<T>`, and it depends on the ABI details of `T` and whether it
takes up more than one return value slot or not. This is an
ABI-implementation detail which is being forced through to the API
layer which is pretty unfortunate. For example if you say the return
value of a function is `(u8, u32)` then it's a runtime type-checking
error. I don't know of a great way to solve this at this time.
Overall I'm feeling optimistic about this trajectory of implementing
value lifting/lowering in Wasmtime. While there are a number of
downsides none seem completely insurmountable. There's naturally still a
good deal of work with the component model but this should be a
significant step up towards implementing and testing the component model.
* Review comments
* Write tests for calling functions
This commit adds a new test file for actually executing functions and
testing their results. This is not written as a `*.wast` test yet since
it's not 100% clear if that's the best way to do that for now (given
that dynamic signatures aren't supported yet). The tests themselves
could all largely be translated to `*.wast` testing in the future,
though, if supported.
Along the way a number of minor issues were fixed with lowerings with
the bugs exposed here.
* Fix an endian mistake
* Fix a typo and the `memory.fill` instruction
* Initial skeleton of some component model processing
This commit is the first of what will likely be many to implement the
component model proposal in Wasmtime. This will be structured as a
series of incremental commits, most of which haven't been written yet.
My hope is to make this incremental and over time to make this easier to
review and easier to test each step in isolation.
Here much of the skeleton of how components are going to work in
Wasmtime is sketched out. This is not a complete implementation of the
component model so it's not all that useful yet, but some things you can
do are:
* Process the type section into a representation amenable for working
with in Wasmtime.
* Process the module section and register core wasm modules.
* Process the instance section for core wasm modules.
* Process core wasm module imports.
* Process core wasm instance aliasing.
* Ability to compile a component with core wasm embedded.
* Ability to instantiate a component with no imports.
* Ability to get functions from this component.
This is already starting to diverge from the previous module linking
representation where a `Component` will try to avoid unnecessary
metadata about the component and instead internally only have the bare
minimum necessary to instantiate the module. My hope is we can avoid
constructing most of the index spaces during instantiation only for it
to all ge thrown away. Additionally I'm predicting that we'll need to
see through processing where possible to know how to generate adapters
and where they are fused.
At this time you can't actually call a component's functions, and that's
the next PR that I would like to make.
* Add tests for the component model support
This commit uses the recently updated wasm-tools crates to add tests for
the component model added in the previous commit. This involved updating
the `wasmtime-wast` crate for component-model changes. Currently the
component support there is quite primitive, but enough to at least
instantiate components and verify the internals of Wasmtime are all
working correctly. Additionally some simple tests for the embedding API
have also been added.
* Update the wasm-tools family of crates
This commit updates these crates as used by Wasmtime for the recently
published versions to pull in changes necessary to support the component
model. I've split this out from #4005 to make it clear what's impacted
here and #4005 can simply rebase on top of this to pick up the necessary
changes.
* More test fixes
This commit fixes an issue introduced in #4046 where the checks for
ensuring that the memory initialization image for a module was
constrained in its size failed to trigger and a very small module could
produce an arbitrarily large memory image.
The bug in question was that if a module only had empty data segments at
arbitrarily small and large addresses then the loop which checks whether
or not the image is allowed was skipped entirely since it was seen that
the memory had no data size. The fix here is to skip segments that are
empty to ensure that if the validation loop is skipped then no data
segments will be processed to create the image (and the module won't end
up having an image in the end).
* Remove the `Paged` memory initialization variant
This commit simplifies the `MemoryInitialization` enum by removing the
`Paged` variant. The `Paged` variant was originally added for uffd, but
that support has now been removed in #4040. This is no longer necessary
but is still used as an intermediate step of becoming a `Static` variant
of initialized memory (which copy-on-write uses). As a result this
commit largely modifies the static initialization of memory steps and
folds the two methods together.
* Apply suggestions from code review
Co-authored-by: Peter Huene <peter@huene.dev>
Co-authored-by: Peter Huene <peter@huene.dev>
* Run a `cargo update` over our dependencies
This'll notably fix a `cargo audit` error where we have a pinned version
of the `regex` crate which has a CVE assigned to it.
* Update to `object` and `hashbrown` crates
Prune some duplicate versions showing up from the previous `cargo update`
* Update wasm-tools crates
This commit updates the wasm-tools family of crates as used in Wasmtime.
Notably this brings in the update which removes module linking support
as well as a number of internal refactorings around names and such
within wasmparser itself. This updates all of the wasm translation
support which binds to wasmparser as appropriate.
Other crates all had API-compatible changes for at least what Wasmtime
used so no further changes were necessary beyond updating version
requirements.
* Update a test expectation
* Upgrade all crates to the Rust 2021 edition
I've personally started using the new format strings for things like
`panic!("some message {foo}")` or similar and have been upgrading crates
on a case-by-case basis, but I think it probably makes more sense to go
ahead and blanket upgrade everything so 2021 features are always
available.
* Fix compile of the C API
* Fix a warning
* Fix another warning
* Bump to 0.36.0
* Add a two-week delay to Wasmtime's release process
This commit is a proposal to update Wasmtime's release process with a
two-week delay from branching a release until it's actually officially
released. We've had two issues lately that came up which led to this proposal:
* In #3915 it was realized that changes just before the 0.35.0 release
weren't enough for an embedding use case, but the PR didn't meet the
expectations for a full patch release.
* At Fastly we were about to start rolling out a new version of Wasmtime
when over the weekend the fuzz bug #3951 was found. This led to the
desire internally to have a "must have been fuzzed for this long"
period of time for Wasmtime changes which we felt were better
reflected in the release process itself rather than something about
Fastly's own integration with Wasmtime.
This commit updates the automation for releases to unconditionally
create a `release-X.Y.Z` branch on the 5th of every month. The actual
release from this branch is then performed on the 20th of every month,
roughly two weeks later. This should provide a period of time to ensure
that all changes in a release are fuzzed for at least two weeks and
avoid any further surprises. This should also help with any last-minute
changes made just before a release if they need tweaking since
backporting to a not-yet-released branch is much easier.
Overall there are some new properties about Wasmtime with this proposal
as well:
* The `main` branch will always have a section in `RELEASES.md` which is
listed as "Unreleased" for us to fill out.
* The `main` branch will always be a version ahead of the latest
release. For example it will be bump pre-emptively as part of the
release process on the 5th where if `release-2.0.0` was created then
the `main` branch will have 3.0.0 Wasmtime.
* Dates for major versions are automatically updated in the
`RELEASES.md` notes.
The associated documentation for our release process is updated and the
various scripts should all be updated now as well with this commit.
* Add notes on a security patch
* Clarify security fixes shouldn't be previewed early on CI
* Remove duplicate `TypeTables` type
This was once needed historically but it is no longer needed.
* Make the internals of `TypeTables` private
Instead of reaching internally for the `wasm_signatures` map an `Index`
implementation now exists to indirect accesses through the type of the
index being accessed. For the component model this table of types will
grow a number of other tables and this'll assist in consuming sites not
having to worry so much about which map they're reaching into.
* Remove the module linking implementation in Wasmtime
This commit removes the experimental implementation of the module
linking WebAssembly proposal from Wasmtime. The module linking is no
longer intended for core WebAssembly but is instead incorporated into
the component model now at this point. This means that very large parts
of Wasmtime's implementation of module linking are no longer applicable
and would change greatly with an implementation of the component model.
The main purpose of this is to remove Wasmtime's reliance on the support
for module-linking in `wasmparser` and tooling crates. With this
reliance removed we can move over to the `component-model` branch of
`wasmparser` and use the updated support for the component model.
Additionally given the trajectory of the component model proposal the
embedding API of Wasmtime will not look like what it looks like today
for WebAssembly. For example the core wasm `Instance` will not change
and instead a `Component` is likely to be added instead.
Some more rationale for this is in #3941, but the basic idea is that I
feel that it's not going to be viable to develop support for the
component model on a non-`main` branch of Wasmtime. Additionaly I don't
think it's viable, for the same reasons as `wasm-tools`, to support the
old module linking proposal and the new component model at the same
time.
This commit takes a moment to not only delete the existing module
linking implementation but some abstractions are also simplified. For
example module serialization is a bit simpler that there's only one
module. Additionally instantiation is much simpler since the only
initializer we have to deal with are imports and nothing else.
Closes#3941
* Fix doc link
* Update comments
* Delete historical interruptable support in Wasmtime
This commit removes the `Config::interruptable` configuration along with
the `InterruptHandle` type from the `wasmtime` crate. The original
support for adding interruption to WebAssembly was added pretty early on
in the history of Wasmtime when there was no other method to prevent an
infinite loop from the host. Nowadays, however, there are alternative
methods for interruption such as fuel or epoch-based interruption.
One of the major downsides of `Config::interruptable` is that even when
it's not enabled it forces an atomic swap to happen when entering
WebAssembly code. This technically could be a non-atomic swap if the
configuration option isn't enabled but that produces even more branch-y
code on entry into WebAssembly which is already something we try to
optimize. Calling into WebAssembly is on the order of a dozens of
nanoseconds at this time and an atomic swap, even uncontended, can add
up to 5ns on some platforms.
The main goal of this PR is to remove this atomic swap on entry into
WebAssembly. This is done by removing the `Config::interruptable` field
entirely, moving all existing consumers to epochs instead which are
suitable for the same purposes. This means that the stack overflow check
is no longer entangled with the interruption check and perhaps one day
we could continue to optimize that further as well.
Some consequences of this change are:
* Epochs are now the only method of remote-thread interruption.
* There are no more Wasmtime traps that produces the `Interrupted` trap
code, although we may wish to move future traps to this so I left it
in place.
* The C API support for interrupt handles was also removed and bindings
for epoch methods were added.
* Function-entry checks for interruption are a tiny bit less efficient
since one check is performed for the stack limit and a second is
performed for the epoch as opposed to the `Config::interruptable`
style of bundling the stack limit and the interrupt check in one. It's
expected though that this is likely to not really be measurable.
* The old `VMInterrupts` structure is renamed to `VMRuntimeLimits`.
* Implement runtime checks for compilation settings
This commit fills out a few FIXME annotations by implementing run-time
checks that when a `Module` is created it has compatible codegen
settings for the current host (as `Module` is proof of "this code can
run"). This is done by implementing new `Engine`-level methods which
validate compiler settings. These settings are validated on
`Module::new` as well as when loading serialized modules.
Settings are split into two categories, one for "shared" top-level
settings and one for ISA-specific settings. Both categories now have
allow-lists hardcoded into `Engine` which indicate the acceptable values
for each setting (if applicable). ISA-specific settings are checked with
the Rust standard library's `std::is_x86_feature_detected!` macro. Other
macros for other platforms are not stable at this time but can be added
here if necessary.
Closes#3897
* Fix fall-through logic to actually be correct
* Use a `OnceCell`, not an `AtomicBool`
* Fix some broken tests
* Shrink the size of the anyfunc table in `VMContext`
This commit shrinks the size of the `VMCallerCheckedAnyfunc` table
allocated into a `VMContext` to be the size of the number of "escaped"
functions in a module rather than the number of functions in a module.
Escaped functions include exports, table elements, etc, and are
typically an order of magnitude smaller than the number of functions in
general. This should greatly shrink the `VMContext` for some modules
which while we aren't necessarily having any problems with that today
shouldn't cause any problems in the future.
The original motivation for this was that this came up during the recent
lazy-table-initialization work and while it no longer has a direct
performance benefit since tables aren't initialized at all on
instantiation it should still improve long-running instances
theoretically with smaller `VMContext` allocations as well as better
locality between anyfuncs.
* Fix some tests
* Remove redundant hash set
* Use a helper for pushing function type information
* Use a more descriptive `is_escaping` method
* Clarify a comment
* Fix condition
* Remove the `ModuleLimits` pooling configuration structure
This commit is an attempt to improve the usability of the pooling
allocator by removing the need to configure a `ModuleLimits` structure.
Internally this structure has limits on all forms of wasm constructs but
this largely bottoms out in the size of an allocation for an instance in
the instance pooling allocator. Maintaining this list of limits can be
cumbersome as modules may get tweaked over time and there's otherwise no
real reason to limit the number of globals in a module since the main
goal is to limit the memory consumption of a `VMContext` which can be
done with a memory allocation limit rather than fine-tuned control over
each maximum and minimum.
The new approach taken in this commit is to remove `ModuleLimits`. Some
fields, such as `tables`, `table_elements` , `memories`, and
`memory_pages` are moved to `InstanceLimits` since they're still
enforced at runtime. A new field `size` is added to `InstanceLimits`
which indicates, in bytes, the maximum size of the `VMContext`
allocation. If the size of a `VMContext` for a module exceeds this value
then instantiation will fail.
This involved adding a few more checks to `{Table, Memory}::new_static`
to ensure that the minimum size is able to fit in the allocation, since
previously modules were validated at compile time of the module that
everything fit and that validation no longer happens (it happens at
runtime).
A consequence of this commit is that Wasmtime will have no built-in way
to reject modules at compile time if they'll fail to be instantiated
within a particular pooling allocator configuration. Instead a module
must attempt instantiation see if a failure happens.
* Fix benchmark compiles
* Fix some doc links
* Fix a panic by ensuring modules have limited tables/memories
* Review comments
* Add back validation at `Module` time instantiation is possible
This allows for getting an early signal at compile time that a module
will never be instantiable in an engine with matching settings.
* Provide a better error message when sizes are exceeded
Improve the error message when an instance size exceeds the maximum by
providing a breakdown of where the bytes are all going and why the large
size is being requested.
* Try to fix test in qemu
* Flag new test as 64-bit only
Sizes are all specific to 64-bit right now
* fuzzing: Add a custom mutator based on `wasm-mutate`
* fuzz: Add a version of the `compile` fuzz target that uses `wasm-mutate`
* Update `wasmparser` dependencies
In #3820 we see an issue with the new heuristics that control use of
memfd: it's entirely possible for a reasonable Wasm module produced by a
snapshotting system to have a relatively sparse heap (less than 50%
filled). A system that avoids memfd because of this would have an
undesirable performance reduction on such modules.
Ultimately we should try to implement a hybrid scheme where we support
outlier/leftover initializers, but for now this PR makes the "always
allow dense" limit configurable. This way, embedders that want to ensure
that memfd is used can do so, if they have other knowledge about the
maximum heap size allowed in their system.
(Partially addresses #3820 but let's leave it open to track the hybrid
idea)