* Move `CompiledFunction` into wasmtime-cranelift
This commit moves the `wasmtime_environ::CompiledFunction` type into the
`wasmtime-cranelift` crate. This type has lots of Cranelift-specific
pieces of compilation and doesn't need to be generated by all Wasmtime
compilers. This replaces the usage in the `Compiler` trait with a
`Box<Any>` type that each compiler can select. Each compiler must still
produce a `FunctionInfo`, however, which is shared information we'll
deserialize for each module.
The `wasmtime-debug` crate is also folded into the `wasmtime-cranelift`
crate as a result of this commit. One possibility was to move the
`CompiledFunction` commit into its own crate and have `wasmtime-debug`
depend on that, but since `wasmtime-debug` is Cranelift-specific at this
time it didn't seem like it was too too necessary to keep it separate.
If `wasmtime-debug` supports other backends in the future we can
recreate a new crate, perhaps with it refactored to not depend on
Cranelift.
* Move wasmtime_environ::reference_type
This now belongs in wasmtime-cranelift and nowhere else
* Remove `Type` reexport in wasmtime-environ
One less dependency on `cranelift-codegen`!
* Remove `types` reexport from `wasmtime-environ`
Less cranelift!
* Remove `SourceLoc` from wasmtime-environ
Change the `srcloc`, `start_srcloc`, and `end_srcloc` fields to a custom
`FilePos` type instead of `ir::SourceLoc`. These are only used in a few
places so there's not much to lose from an extra abstraction for these
leaf use cases outside of cranelift.
* Remove wasmtime-environ's dep on cranelift's `StackMap`
This commit "clones" the `StackMap` data structure in to
`wasmtime-environ` to have an independent representation that that
chosen by Cranelift. This allows Wasmtime to decouple this runtime
dependency of stack map information and let the two evolve
independently, if necessary.
An alternative would be to refactor cranelift's implementation into a
separate crate and have wasmtime depend on that but it seemed a bit like
overkill to do so and easier to clone just a few lines for this.
* Define code offsets in wasmtime-environ with `u32`
Don't use Cranelift's `binemit::CodeOffset` alias to define this field
type since the `wasmtime-environ` crate will be losing the
`cranelift-codegen` dependency soon.
* Commit to using `cranelift-entity` in Wasmtime
This commit removes the reexport of `cranelift-entity` from the
`wasmtime-environ` crate and instead directly depends on the
`cranelift-entity` crate in all referencing crates. The original reason
for the reexport was to make cranelift version bumps easier since it's
less versions to change, but nowadays we have a script to do that.
Otherwise this encourages crates to use whatever they want from
`cranelift-entity` since we'll always depend on the whole crate.
It's expected that the `cranelift-entity` crate will continue to be a
lean crate in dependencies and suitable for use at both runtime and
compile time. Consequently there's no need to avoid its usage in
Wasmtime at runtime, since "remove Cranelift at compile time" is
primarily about the `cranelift-codegen` crate.
* Remove most uses of `cranelift-codegen` in `wasmtime-environ`
There's only one final use remaining, which is the reexport of
`TrapCode`, which will get handled later.
* Limit the glob-reexport of `cranelift_wasm`
This commit removes the glob reexport of `cranelift-wasm` from the
`wasmtime-environ` crate. This is intended to explicitly define what
we're reexporting and is a transitionary step to curtail the amount of
dependencies taken on `cranelift-wasm` throughout the codebase. For
example some functions used by debuginfo mapping are better imported
directly from the crate since they're Cranelift-specific. Note that
this is intended to be a temporary state affairs, soon this reexport
will be gone entirely.
Additionally this commit reduces imports from `cranelift_wasm` and also
primarily imports from `crate::wasm` within `wasmtime-environ` to get a
better sense of what's imported from where and what will need to be
shared.
* Extract types from cranelift-wasm to cranelift-wasm-types
This commit creates a new crate called `cranelift-wasm-types` and
extracts type definitions from the `cranelift-wasm` crate into this new
crate. The purpose of this crate is to be a shared definition of wasm
types that can be shared both by compilers (like Cranelift) as well as
wasm runtimes (e.g. Wasmtime). This new `cranelift-wasm-types` crate
doesn't depend on `cranelift-codegen` and is the final step in severing
the unconditional dependency from Wasmtime to `cranelift-codegen`.
The final refactoring in this commit is to then reexport this crate from
`wasmtime-environ`, delete the `cranelift-codegen` dependency, and then
update all `use` paths to point to these new types.
The main change of substance here is that the `TrapCode` enum is
mirrored from Cranelift into this `cranelift-wasm-types` crate. While
this unfortunately results in three definitions (one more which is
non-exhaustive in Wasmtime itself) it's hopefully not too onerous and
ideally something we can patch up in the future.
* Get lightbeam compiling
* Remove unnecessary dependency
* Fix compile with uffd
* Update publish script
* Fix more uffd tests
* Rename cranelift-wasm-types to wasmtime-types
This reflects the purpose a bit more where it's types specifically
intended for Wasmtime and its support.
* Fix publish script
* Bump the wasm-tools crates
Pulls in some updates here and there, mostly for updating crates to the
latest version to prepare for later memory64 work.
* Update lightbeam
This commit refactors module instantiation in the runtime to allow for
different instance allocation strategy implementations.
It adds an `InstanceAllocator` trait with the current implementation put behind
the `OnDemandInstanceAllocator` struct.
The Wasmtime API has been updated to allow a `Config` to have an instance
allocation strategy set which will determine how instances get allocated.
This change is in preparation for an alternative *pooling* instance allocator
that can reserve all needed host process address space in advance.
This commit also makes changes to the `wasmtime_environ` crate to represent
compiled modules in a way that reduces copying at instantiation time.
This commit updates the various tooling used by wasmtime which has new
updates to the module linking proposal. This is done primarily to sync
with WebAssembly/module-linking#26. The main change implemented here is
that wasmtime now supports creating instances from a set of values, nott
just from instantiating a module. Additionally subtyping handling of
modules with respect to imports is now properly handled by desugaring
two-level imports to imports of instances.
A number of small refactorings are included here as well, but most of
them are in accordance with the changes to `wasmparser` and the updated
binary format for module linking.
This commit is intended to do almost everything necessary for processing
the alias section of module linking. Most of this is internal
refactoring, the highlights being:
* Type contents are now stored separately from a `wasmtime_env::Module`.
Given that modules can freely alias types and have them used all over
the place, it seemed best to have one canonical location to type
storage which everywhere else points to (with indices). A new
`TypeTables` structure is produced during compilation which is shared
amongst all member modules in a wasm blob.
* Instantiation is heavily refactored to account for module linking. The
main gotcha here is that imports are now listed as "initializers". We
have a sort of pseudo-bytecode-interpreter which interprets the
initialization of a module. This is more complicated than just
matching imports at this point because in the module linking proposal
the module, alias, import, and instance sections may all be
interleaved. This means that imports aren't guaranteed to show up at
the beginning of the address space for modules/instances.
Otherwise most of the changes here largely fell out from these two
design points. Aliases are recorded as initializers in this scheme.
Copying around type information and/or just knowing type information
during compilation is also pretty easy since everything is just a
pointer into a `TypeTables` and we don't have to actually copy any types
themselves. Lots of various refactorings were necessary to accomodate
these changes.
Tests are hoped to cover a breadth of functionality here, but not
necessarily a depth. There's still one more piece of the module linking
proposal missing which is exporting instances/modules, which will come
in a future PR.
It's also worth nothing that there's one large TODO which isn't
implemented in this change that I plan on opening an issue for.
With module linking when a set of modules comes back from compilation
each modules has all the trampolines for the entire set of modules. This
is quite a lot of duplicate trampolines across module-linking modules.
We'll want to refactor this at some point to instead have only one set
of trampolines per set of module linking modules and have them shared
from there. I figured it was best to separate out this change, however,
since it's purely related to resource usage, and doesn't impact
non-module-linking modules at all.
cc #2094
This commit implements the interpretation necessary of the instance
section of the module linking proposal. Instantiating a module which
itself has nested instantiated instances will now instantiate the nested
instances properly. This isn't all that useful without the ability to
alias exports off the result, but we can at least observe the side
effects of instantiation through the `start` function.
cc #2094
This commit is intended to be the first of many in implementing the
module linking proposal. At this time this builds on #2059 so it
shouldn't land yet. The goal of this commit is to compile bare-bones
modules which use module linking, e.g. those with nested modules.
My hope with module linking is that almost everything in wasmtime only
needs mild refactorings to handle it. The goal is that all per-module
structures are still per-module and at the top level there's just a
`Vec` containing a bunch of modules. That's implemented currently where
`wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of
which one it's pointing to. This should enable
serialization/deserialization of any module in a nested modules
scenario, no matter how you got it.
Tons of features of the module linking proposal are missing from this
commit. For example instantiation flat out doesn't work, nor does
import/export of modules or instances. That'll be coming as future
commits, but the purpose here is to start laying groundwork in Wasmtime
for handling lots of modules in lots of places.
* Validate modules while translating
This commit is a change to cranelift-wasm to validate each function body
as it is translated. Additionally top-level module translation functions
will perform module validation. This commit builds on changes in
wasmparser to perform module validation interwtwined with parsing and
translation. This will be necessary for future wasm features such as
module linking where the type behind a function index, for example, can
be far away in another module. Additionally this also brings a nice
benefit where parsing the binary only happens once (instead of having an
up-front serial validation step) and validation can happen in parallel
for each function.
Most of the changes in this commit are plumbing to make sure everything
lines up right. The major functional change here is that module
compilation should be faster by validating in parallel (or skipping
function validation entirely in the case of a cache hit). Otherwise from
a user-facing perspective nothing should be that different.
This commit does mean that cranelift's translation now inherently
validates the input wasm module. This means that the Spidermonkey
integration of cranelift-wasm will also be validating the function as
it's being translated with cranelift. The associated PR for wasmparser
(bytecodealliance/wasmparser#62) provides the necessary tools to create
a `FuncValidator` for Gecko, but this is something I'll want careful
review for before landing!
* Read function operators until EOF
This way we can let the validator take care of any issues with
mismatched `end` instructions and/or trailing operators/bytes.
* Don't re-parse wasm for debuginfo
This commit updates debuginfo parsing to happen during the main
translation of the original wasm module. This avoid re-parsing the wasm
module twice (at least the section-level headers). Additionally this
ties debuginfo directly to a `ModuleTranslation` which makes it easier
to process debuginfo for nested modules in the upcoming module linking
proposal.
The changes here are summarized by taking the `read_debuginfo` function
and merging it with the main module translation that happens which is
driven by cranelift. Some new hooks were added to the module environment
trait to support this, but most of it was integrating with existing hooks.
* Fix tests in debug crate
This commit is intended to update wasmparser to 0.59.0. This primarily
includes bytecodealliance/wasm-tools#40 which is a large update to how
parsing and validation works. The impact on Wasmtime is pretty small at
this time, but over time I'd like to refactor the internals here to lean
more heavily on that upstream wasmparser refactoring.
For now, though, the intention is to get on the train of wasmparser's
latest `main` branch to ensure we get bug fixes and such.
As part of this update a few other crates and such were updated. This is
primarily to handle the new encoding of `ref.is_null` where the type is
not part of the instruction encoding any more.
* cranelift-wasm: replace `WasmTypesMap` with `ModuleTranslationState`
The `ModuleTranslationState` contains information decoded from the Wasm module
that must be referenced during each Wasm function's translation.
This is only for data that is maintained by `cranelift-wasm` itself, as opposed
to being maintained by the embedder. Data that is maintained by the embedder is
represented with `ModuleEnvironment`.
A `ModuleTranslationState` is returned by `translate_module`, and can then be
used when translating functions from that module.
* cranelift-wasm: rename `TranslationState` to `FuncTranslationState`
To disambiguate a bit with the new `ModuleTranslationState`.
* cranelift-wasm: Reorganize the internal `state` module into submodules
One module for the `ModuleTranslationState` and another for the
`FuncTranslationState`.
* cranelift-wasm: replace `FuncTranslator` with methods on `ModuleTranslationState`
`FuncTranslator` was two methods that always took ownership of `self`, so it
didn't really make sense as an object as opposed to two different functions, or
in this case methods on the object that actually persists for a longer time.
I think this improves ergonomics nicely.
Before:
```rust
let module_translation = translate_module(...)?;
for body in func_bodies {
let mut translator = FuncTranslator::new();
translator.translate(body, ...)?;
}
```
After:
```rust
let module_translation = translate_module(...)?;
for body in func_bodies {
module_translation.translate_func(body, ...)?;
}
```
Note that this commit does not remove `FuncTranslator`. It still exists, but is
just a wrapper over the `ModuleTranslationState` methods, and it is marked
deprecated, so that downstream users get a heads up. This should make the
transition easier.
* Revert "cranelift-wasm: replace `FuncTranslator` with methods on `ModuleTranslationState`"
This reverts commit 075f9ae933bcaae39348b61287c8f78a4009340d.
This commit introduces initial support for multi-value Wasm. Wasm blocks and
calls can now take and return an arbitrary number of values.
The encoding for multi-value blocks means that we need to keep the contents of
the "Types" section around when translating function bodies. To do this, we
introduce a `WasmTypesMap` type that maps the type indices to their parameters
and returns, construct it when parsing the "Types" section, and shepherd it
through a bunch of functions and methods when translating function bodies.
This commit adds a hook to the `ModuleEnvironment` trait to learn when a
custom section in a wasm file is read. This hook can in theory be used
to parse and handle custom sections as they appear in the wasm file
without having to re-iterate over the wasm file after cranelift has
already parsed the wasm file.
The `translate_module` function is now less strict in that it doesn't
require sections to be in a particular order, but it's figured that the
wasm file is already validated elsewhere to verify the section order.