* Fix libcall relocations for precompiled modules
This commit fixes some asserts and support for relocation libcalls in
precompiled modules loaded from disk. In doing so this reworks how mmaps
are managed for files from disk. All non-file-backed `Mmap` entries are
read/write but file-backed versions were readonly. This commit changes
this such that all `Mmap` objects, even if they're file-backed, start as
read/write. The file-based versions all use copy-on-write to preserve
the private-ness of the mapping.
This is not functionally intended to change anything. Instead this
should have some more memory writable after a module is loaded but the
text section, for example, is still left as read/execute when loading is
finished. Additionally this makes modules compiled in memory more
consistent with modules loaded from disk.
* Update a comment
* Force images to become readonly during publish
This marks compiled images as entirely readonly during the
`CodeMemory::publish` step which happens just before the text section
becomes executable. This ensures that all images, no matter where they
come from, are guaranteed frozen before they start executing.
* Resolve libcall relocations for older CPUs
Long ago Wasmtime used to have logic for resolving relocations
post-compilation for libcalls which I ended up removing during
refactorings last year. As #5563 points out, however, it's possible to
get Wasmtime to panic by disabling SSE features which forces Cranelift
to use libcalls for some floating-point operations instead. Note that
this also requires disabling SIMD because SIMD support has a baseline of
SSE 4.2.
This commit pulls back the old implementations of various libcalls and
reimplements logic necessary to have them work on CPUs without SSE 4.2
Closes#5563
* Fix log message in `wast` support
* Fix offset listed in relocations
Be sure to factor in the offset of the function itself
* Review comments
This commit fixes a bug with components by changing how DWARF
information from a wasm binary is copied over to the final compiled
artifact. Note that this is not the Wasmtime-generated DWARF but rather
the native wasm DWARF itself used in backtraces.
Previously the wasm dwarf was inserted into sections `.*.wasm` where `*`
was `debug_info`, `debug_str`, etc -- one per `gimli::SectionId` as
found in the original wasm module. This does not work with components,
however, where modules did not correctly separate their debug
information into separate sections or otherwise disambiguate. The fix in
this commit is to instead smash all the debug information together into
one large section and store offsets into that giant section. This is
similar to the `name`-section scraping or the trap metadata section
where one section contains all the data for all the modules in a component.
This simplifies the object file parsing by only looking for one section
name and doesn't add all that much complexity to serializing and looking
up dwarf information as well.
* Pull `Module` out of `ModuleTextBuilder`
This commit is the first in what will likely be a number towards
preparing for serializing a compiled component to bytes, a precompiled
artifact. To that end my rough plan is to merge all of the compiled
artifacts for a component into one large object file instead of having
lots of separate object files and lots of separate mmaps to manage. To
that end I plan on eventually using `ModuleTextBuilder` to build one
large text section for all core wasm modules and trampolines, meaning
that `ModuleTextBuilder` is no longer specific to one module. I've
extracted out functionality such as function name calculation as well as
relocation resolving (now a closure passed in) in preparation for this.
For now this just keeps tests passing, and the trajectory for this
should become more clear over the following commits.
* Remove component-specific object emission
This commit removes the `ComponentCompiler::emit_obj` function in favor
of `Compiler::emit_obj`, now renamed `append_code`. This involved
significantly refactoring code emission to take a flat list of functions
into `append_code` and the caller is responsible for weaving together
various "families" of functions and un-weaving them afterwards.
* Consolidate ELF parsing in `CodeMemory`
This commit moves the ELF file parsing and section iteration from
`CompiledModule` into `CodeMemory` so one location keeps track of
section ranges and such. This is in preparation for sharing much of this
code with components which needs all the same sections to get tracked
but won't be using `CompiledModule`. A small side benefit from this is
that the section parsing done in `CodeMemory` and `CompiledModule` is no
longer duplicated.
* Remove separately tracked traps in components
Previously components would generate an "always trapping" function
and the metadata around which pc was allowed to trap was handled
manually for components. With recent refactorings the Wasmtime-standard
trap section in object files is now being generated for components as
well which means that can be reused instead of custom-tracking this
metadata. This commit removes the manual tracking for the `always_trap`
functions and plumbs the necessary bits around to make components look
more like modules.
* Remove a now-unnecessary `Arc` in `Module`
Not expected to have any measurable impact on performance, but
complexity-wise this should make it a bit easier to understand the
internals since there's no longer any need to store this somewhere else
than its owner's location.
* Merge compilation artifacts of components
This commit is a large refactoring of the component compilation process
to produce a single artifact instead of multiple binary artifacts. The
core wasm compilation process is refactored as well to share as much
code as necessary with the component compilation process.
This method of representing a compiled component necessitated a few
medium-sized changes internally within Wasmtime:
* A new data structure was created, `CodeObject`, which represents
metadata about a single compiled artifact. This is then stored as an
`Arc` within a component and a module. For `Module` this is always
uniquely owned and represents a shuffling around of data from one
owner to another. For a `Component`, however, this is shared amongst
all loaded modules and the top-level component.
* The "module registry" which is used for symbolicating backtraces and
for trap information has been updated to account for a single region
of loaded code holding possibly multiple modules. This involved adding
a second-level `BTreeMap` for now. This will likely slow down
instantiation slightly but if it poses an issue in the future this
should be able to be represented with a more clever data structure.
This commit additionally solves a number of longstanding issues with
components such as compiling only one host-to-wasm trampoline per
signature instead of possibly once-per-module. Additionally the
`SignatureCollection` registration now happens once-per-component
instead of once-per-module-within-a-component.
* Fix compile errors from prior commits
* Support AOT-compiling components
This commit adds support for AOT-compiled components in the same manner
as `Module`, specifically adding:
* `Engine::precompile_component`
* `Component::serialize`
* `Component::deserialize`
* `Component::deserialize_file`
Internally the support for components looks quite similar to `Module`.
All the prior commits to this made adding the support here
(unsurprisingly) easy. Components are represented as a single object
file as are modules, and the functions for each module are all piled
into the same object file next to each other (as are areas such as data
sections). Support was also added here to quickly differentiate compiled
components vs compiled modules via the `e_flags` field in the ELF
header.
* Prevent serializing exported modules on components
The current representation of a module within a component means that the
implementation of `Module::serialize` will not work if the module is
exported from a component. The reason for this is that `serialize`
doesn't actually do anything and simply returns the underlying mmap as a
list of bytes. The mmap, however, has `.wasmtime.info` describing
component metadata as opposed to this module's metadata. While rewriting
this section could be implemented it's not so easy to do so and is
otherwise seen as not super important of a feature right now anyway.
* Fix windows build
* Fix an unused function warning
* Update crates/environ/src/compilation.rs
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Refactor metadata storage in AOT artifacts
This commit is a reorganization of how metadata is stored in Wasmtime's
compiled artifacts. Currently Wasmtime's ELF artifacts have data
appended after them to contain metadata about the `Engine` as well as
type information for the module itself. This extra data at the end of
the file is ignored by ELF-related utilities generally and is assembled
during the module serialization process.
In working on AOT-compiling components, though, I've discovered a number
of issues with this:
* Primarily it's possible to mistakenly change an artifact if it's
deserialized and then serialized again. This issue is probably
theoretical but the deserialized artifact records the `Engine`
configuration at time of creation but when re-serializing that it
serializes the current `Engine` state, not the original `Engine`
state.
* Additionally the serialization strategy here is tightly coupled to
`Module` and its serialization format. While this makes sense it is
not conducive for future refactorings to use a similar serialization
format for components. The engine metadata, for example, does not
necessarily need to be tied up with type information.
* The storage for this extra metadata is a bit wonky by shoving it at
the end of the ELF file. The original reason for this was to have a
compiled artifact be multiple objects concatenated with each other to
support serializing module-linking-using modules. Module linking is no
longer a thing and I have since decided that for the component model
all compilation artifacts will go into one object file to assist
debugability. This means that the extra stick-it-at-the-end is no
longer necessary.
To solve these issues this commit splits up the
`module/serialization.rs` file in two, mostly moving the logic to
`engine/serialization.rs`. The engine serialization logic now handles
everything related to `Engine` compatibility such as targets, compiler
flags, wasm features, etc. The module serialization logic is now
exclusively interested in type information.
The engine metadata and serialized type information additionally live in
sections of the final file now instead of at the end. This means that
there are three primary `bincode`-encoded sections that are parsed on
deserializing a file:
1. The `Engine`-specific metadata. This will be the same for both
modules and components.
2. The `CompiledModuleInfo` structure. For core wasm there's just one of
these but for the component model there will be multiple, one per
core wasm module.
3. The type information. For core wasm this is a `ModuleTypes` but for a
component this will be a `ComponentTypes`.
No true functional change is expected from this commit. Binary artifacts
might get inflated by a small handful of bytes due to using ELF sections
to represent this now.
A related change I made during this commit as well was the plumbing of
the `is_branch_protection_enabled` flag. This is technically
`Engine`-level metadata but I didn't want to plumb it all over the place
as was done now, so instead a new section was added to the final binary
just for this bti information. This means that it no longer needs to be
a parameter to `CodeMemory::publish` and additionally is more amenable
to a `Component`-is-just-one-object world where no single module owns
this piece of metadata.
* Exclude some functions in a cranelift-less build
* cranelift: Add FlushInstructionCache for AArch64 on Windows
This was previously done on #3426 for linux.
* wasmtime: Add FlushInstructionCache for AArch64 on Windows
This was previously done on #3426 for linux.
* cranelift: Add MemoryUse flag to JIT Memory Manager
This allows us to keep the icache flushing code self-contained and not leak implementation details.
This also changes the windows icache flushing code to only flush pages that were previously unflushed.
* Add jit-icache-coherence crate
* cranelift: Use `jit-icache-coherence`
* wasmtime: Use `jit-icache-coherence`
* jit-icache-coherence: Make rustix feature additive
Mutually exclusive features cause issues.
* wasmtime: Remove rustix from wasmtime-jit
We now use it via jit-icache-coherence
* Rename wasmtime-jit-icache-coherency crate
* Use cfg-if in wasmtime-jit-icache-coherency crate
* Use inline instead of inline(always)
* Add unsafe marker to clear_cache
* Conditionally compile all rustix operations
membarrier does not exist on MacOS
* Publish `wasmtime-jit-icache-coherence`
* Remove explicit windows check
This is implied by the target_os = "windows" above
* cranelift: Remove len != 0 check
This is redundant as it is done in non_protected_allocations_iter
* Comment cleanups
Thanks @akirilov-arm!
* Make clear_cache safe
* Rename pipeline_flush to pipeline_flush_mt
* Revert "Make clear_cache safe"
This reverts commit 21165d81c9030ed9b291a1021a367214d2942c90.
* More docs!
* Fix pipeline_flush reference on clear_cache
* Update more docs!
* Move pipeline flush after `mprotect` calls
Technically the `clear_cache` operation is a lie in AArch64, so move the pipeline flush after the `mprotect` calls so that it benefits from the implicit cache cleaning done by it.
* wasmtime: Remove rustix backend from icache crate
* wasmtime: Use libc for macos
* wasmtime: Flush icache on all arch's for windows
* wasmtime: Add flags to membarrier call
* Initial forward-edge CFI implementation
Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.
Copyright (c) 2022, Arm Limited.
* Refactor `from_artifacts` to avoid second `make_executable` (#1)
This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.
* Address the code review feedback
Copyright (c) 2022, Arm Limited.
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
* fuzz: Fuzz padding between compiled functions
This commit hooks up the custom
`wasmtime_linkopt_padding_between_functions` configuration option to the
cranelift compiler into the fuzz configuration, enabling us to ensure
that randomly inserting a moderate amount of padding between functions
shouldn't tamper with any results.
* fuzz: Fuzz the `Config::generate_address_map` option
This commit adds fuzz configuration where `generate_address_map` is
either enabled or disabled, unlike how it's always enabled for fuzzing
today.
* Remove unnecessary handling of relocations
This commit removes a number of bits and pieces all related to handling
relocations in JIT code generated by Wasmtime. None of this is necessary
nowadays that the "old backend" has been removed (quite some time ago)
and relocations are no longer expected to be in the JIT code at all.
Additionally with the minimum x86_64 features required to run wasm code
it should be expected that no libcalls are required either for
Wasmtime-based JIT code.
* Skip memfd creation with precompiled modules
This commit updates the memfd support internally to not actually use a
memfd if a compiled module originally came from disk via the
`wasmtime::Module::deserialize_file` API. In this situation we already
have a file descriptor open and there's no need to copy a module's heap
image to a new file descriptor.
To facilitate a new source of `mmap` the currently-memfd-specific-logic
of creating a heap image is generalized to a new form of
`MemoryInitialization` which is attempted for all modules at
module-compile-time. This means that the serialized artifact to disk
will have the memory image in its entirety waiting for us. Furthermore
the memory image is ensured to be padded and aligned carefully to the
target system's page size, notably meaning that the data section in the
final object file is page-aligned and the size of the data section is
also page aligned.
This means that when a precompiled module is mapped from disk we can
reuse the underlying `File` to mmap all initial memory images. This
means that the offset-within-the-memory-mapped-file can differ for
memfd-vs-not, but that's just another piece of state to track in the
memfd implementation.
In the limit this waters down the term "memfd" for this technique of
quickly initializing memory because we no longer use memfd
unconditionally (only when the backing file isn't available).
This does however open up an avenue in the future to porting this
support to other OSes because while `memfd_create` is Linux-specific
both macOS and Windows support mapping a file with copy-on-write. This
porting isn't done in this PR and is left for a future refactoring.
Closes#3758
* Enable "memfd" support on all unix systems
Cordon off the Linux-specific bits and enable the memfd support to
compile and run on platforms like macOS which have a Linux-like `mmap`.
This only works if a module is mapped from a precompiled module file on
disk, but that's better than not supporting it at all!
* Fix linux compile
* Use `Arc<File>` instead of `MmapVecFileBacking`
* Use a named struct instead of mysterious tuples
* Comment about unsafety in `Module::deserialize_file`
* Fix tests
* Fix uffd compile
* Always align data segments
No need to have conditional alignment since their sizes are all aligned
anyway
* Update comment in build.rs
* Use rustix, not `region`
* Fix some confusing logic/names around memory indexes
These functions all work with memory indexes, not specifically defined
memory indexes.
`ptr::cast` has the advantage of being unable to silently cast
`*const T` to `*mut T`. This turned up several places that were
performing such casts, which this PR also fixes.
This pulls in a fix for Android, where Android's seccomp policy on older
versions is to make `openat2` irrecoverably crash the process, so we have
to do a version check up front rather than relying on `ENOSYS` to
determine if `openat2` is supported.
And it pulls in the fix for the link errors when multiple versions of
rsix/rustix are linked in.
And it has updates for two crate renamings: rsix has been renamed to
rustix, and unsafe-io has been renamed to io-extras.
Some platforms such as AArch64 Linux support different memory page
sizes, so we need to be conservative when choosing the code section
alignment (which is equal to the page size) by using the maximum.
Copyright (c) 2021, Arm Limited.
* Add a `Module::deserialize_file` method
This commit adds a new method to the `wasmtime::Module` type,
`deserialize_file`. This is intended to be the same as the `deserialize`
method except for the serialized module is present as an on-disk file.
This enables Wasmtime to internally use `mmap` to avoid copying bytes
around and generally makes loading a module much faster.
A C API is added in this commit as well for various bindings to use this
accelerated path now as well. Another option perhaps for a Rust-based
API is to have an API taking a `File` itself to allow for a custom file
descriptor in one way or another, but for now that's left for a possible
future refactoring if we find a use case.
* Fix compat with main - handle readdonly mmap
* wip
* Try to fix Windows support
* Don't copy executable code into a `CodeMemory`
This commit moves a copy from compiled artifacts into a `CodeMemory`. In
general this commit drastically changes the meaning of a `CodeMemory`.
Previously it was an iteratively-pushed-on structure that would
accumulate executable code over time. Afterwards, however, it's a
manager for an `MmapVec` which updates the permissions on text section
to ensure that the pages are executable.
By taking ownership of an `MmapVec` within a `CodeMemory` there's no
need to copy any data around, which means that the `.text` section in
the ELF image produced by Wasmtime is usable as-is after placement in
memory and relocations have been resolved. This moves Wasmtime one step
closer to being able to directly use a module after it's `mmap`'d into
memory, optimizing when a module is loaded.
* Fix windows section alignment
* Review comments
* Remove some allocations in `CodeMemory`
This commit removes the `FinishedFunctions` type as well as allocations
associated with trampolines when allocating inside of a `CodeMemory`.
The main goal of this commit is to improve the time spent in
`CodeMemory` where currently today a good portion of time is spent
simply parsing symbol names and trying to extract function indices from
them. Instead this commit implements a new strategy (different from #3236)
where compilation records offset/length information for all
functions/trampolines so this doesn't need to be re-learned from the
object file later.
A consequence of this commit is that this offset information will be
decoded/encoded through `bincode` unconditionally, but we can also
optimize that later if necessary as well.
Internally this involved quite a bit of refactoring since the previous
map for `FinishedFunctions` was relatively heavily relied upon.
* comments
* Move wasm data/debuginfo into the ELF compilation image
This commit moves existing allocations of `Box<[u8]>` stored separately
from compilation's final ELF image into the ELF image itself. The goal
of this commit is to reduce the amount of data which `bincode` will need
to process in the future. DWARF debugging information and wasm data
segments can be quite large, and they're relatively rarely read, so
there's typically no need to copy them around. Instead by moving them
into the ELF image this opens up the opportunity in the future to
eliminate copies and use data directly as-found in the image itself.
For information accessed possibly-multiple times, such as the wasm data
ranges, the indexes of the data within the ELF image are computed when
a `CompiledModule` is created. These indexes are then used to directly
index into the image without having to root around in the ELF file each
time they're accessed.
One other change located here is that the symbolication context
previously cloned the debug information into it to adhere to the
`'static` lifetime safely, but this isn't actually ever used in
`wasmtime` right now so the unsafety around this has been removed and
instead borrowed data is returned (no more clones, yay!).
* Fix lightbeam
* Move `CompiledFunction` into wasmtime-cranelift
This commit moves the `wasmtime_environ::CompiledFunction` type into the
`wasmtime-cranelift` crate. This type has lots of Cranelift-specific
pieces of compilation and doesn't need to be generated by all Wasmtime
compilers. This replaces the usage in the `Compiler` trait with a
`Box<Any>` type that each compiler can select. Each compiler must still
produce a `FunctionInfo`, however, which is shared information we'll
deserialize for each module.
The `wasmtime-debug` crate is also folded into the `wasmtime-cranelift`
crate as a result of this commit. One possibility was to move the
`CompiledFunction` commit into its own crate and have `wasmtime-debug`
depend on that, but since `wasmtime-debug` is Cranelift-specific at this
time it didn't seem like it was too too necessary to keep it separate.
If `wasmtime-debug` supports other backends in the future we can
recreate a new crate, perhaps with it refactored to not depend on
Cranelift.
* Move wasmtime_environ::reference_type
This now belongs in wasmtime-cranelift and nowhere else
* Remove `Type` reexport in wasmtime-environ
One less dependency on `cranelift-codegen`!
* Remove `types` reexport from `wasmtime-environ`
Less cranelift!
* Remove `SourceLoc` from wasmtime-environ
Change the `srcloc`, `start_srcloc`, and `end_srcloc` fields to a custom
`FilePos` type instead of `ir::SourceLoc`. These are only used in a few
places so there's not much to lose from an extra abstraction for these
leaf use cases outside of cranelift.
* Remove wasmtime-environ's dep on cranelift's `StackMap`
This commit "clones" the `StackMap` data structure in to
`wasmtime-environ` to have an independent representation that that
chosen by Cranelift. This allows Wasmtime to decouple this runtime
dependency of stack map information and let the two evolve
independently, if necessary.
An alternative would be to refactor cranelift's implementation into a
separate crate and have wasmtime depend on that but it seemed a bit like
overkill to do so and easier to clone just a few lines for this.
* Define code offsets in wasmtime-environ with `u32`
Don't use Cranelift's `binemit::CodeOffset` alias to define this field
type since the `wasmtime-environ` crate will be losing the
`cranelift-codegen` dependency soon.
* Commit to using `cranelift-entity` in Wasmtime
This commit removes the reexport of `cranelift-entity` from the
`wasmtime-environ` crate and instead directly depends on the
`cranelift-entity` crate in all referencing crates. The original reason
for the reexport was to make cranelift version bumps easier since it's
less versions to change, but nowadays we have a script to do that.
Otherwise this encourages crates to use whatever they want from
`cranelift-entity` since we'll always depend on the whole crate.
It's expected that the `cranelift-entity` crate will continue to be a
lean crate in dependencies and suitable for use at both runtime and
compile time. Consequently there's no need to avoid its usage in
Wasmtime at runtime, since "remove Cranelift at compile time" is
primarily about the `cranelift-codegen` crate.
* Remove most uses of `cranelift-codegen` in `wasmtime-environ`
There's only one final use remaining, which is the reexport of
`TrapCode`, which will get handled later.
* Limit the glob-reexport of `cranelift_wasm`
This commit removes the glob reexport of `cranelift-wasm` from the
`wasmtime-environ` crate. This is intended to explicitly define what
we're reexporting and is a transitionary step to curtail the amount of
dependencies taken on `cranelift-wasm` throughout the codebase. For
example some functions used by debuginfo mapping are better imported
directly from the crate since they're Cranelift-specific. Note that
this is intended to be a temporary state affairs, soon this reexport
will be gone entirely.
Additionally this commit reduces imports from `cranelift_wasm` and also
primarily imports from `crate::wasm` within `wasmtime-environ` to get a
better sense of what's imported from where and what will need to be
shared.
* Extract types from cranelift-wasm to cranelift-wasm-types
This commit creates a new crate called `cranelift-wasm-types` and
extracts type definitions from the `cranelift-wasm` crate into this new
crate. The purpose of this crate is to be a shared definition of wasm
types that can be shared both by compilers (like Cranelift) as well as
wasm runtimes (e.g. Wasmtime). This new `cranelift-wasm-types` crate
doesn't depend on `cranelift-codegen` and is the final step in severing
the unconditional dependency from Wasmtime to `cranelift-codegen`.
The final refactoring in this commit is to then reexport this crate from
`wasmtime-environ`, delete the `cranelift-codegen` dependency, and then
update all `use` paths to point to these new types.
The main change of substance here is that the `TrapCode` enum is
mirrored from Cranelift into this `cranelift-wasm-types` crate. While
this unfortunately results in three definitions (one more which is
non-exhaustive in Wasmtime itself) it's hopefully not too onerous and
ideally something we can patch up in the future.
* Get lightbeam compiling
* Remove unnecessary dependency
* Fix compile with uffd
* Update publish script
* Fix more uffd tests
* Rename cranelift-wasm-types to wasmtime-types
This reflects the purpose a bit more where it's types specifically
intended for Wasmtime and its support.
* Fix publish script
* Reimplement how unwind information is stored
This commit is a major refactoring of how unwind information is stored
after compilation of a function has finished. Previously we would store
the raw `UnwindInfo` as a result of compilation and this would get
serialized/deserialized alongside the rest of the ELF object that
compilation creates. Whenever functions were registered with
`CodeMemory` this would also result in registering unwinding information
dynamically at runtime, which in the case of Unix, for example, would
dynamically created FDE/CIE entries on-the-fly.
Eventually I'd like to support compiling Wasmtime without Cranelift, but
this means that `UnwindInfo` wouldn't be easily available to decode into
and create unwinding information from. To solve this I've changed the
ELF object created to have the unwinding information encoded into it
ahead-of-time so loading code into memory no longer needs to create
unwinding tables. This change has two different implementations for
Windows/Unix:
* On Windows the implementation was much easier. The unwinding
information on Windows is already stored after the function itself in
the text section. This was actually slightly duplicated in object
building and in code memory allocation. Now the object building
continues to do the same, recording unwinding information after
functions, and code memory no longer manually tracks this.
Additionally Wasmtime will emit a special custom section in the object
file with unwinding information which is the list of
`RUNTIME_FUNCTION` structures that `RtlAddFunctionTable` expects. This
means that the object file has all the information precompiled into it
and registration at runtime is simply passing a few pointers around to
the runtime.
* Unix was a little bit more difficult than Windows. Today a `.eh_frame`
section is created on-the-fly with offsets in FDEs specified as the
absolute address that functions are loaded at. This absolute
address hindered the ability to precompile the FDE into the object
file itself. I've switched how addresses are encoded, though, to using
`DW_EH_PE_pcrel` which means that FDE addresses are now specified
relative to the FDE itself. This means that we can maintain a fixed
offset between the `.eh_frame` loaded in memory and the beginning of
code memory. When doing so this enables precompiling the `.eh_frame`
section into the object file and at runtime when loading an object no
further construction of unwinding information is needed.
The overall result of this commit is that unwinding information is no
longer stored in its cranelift-data-structure form on disk. This means
that this unwinding information format is only present during
compilation, which will make it that much easier to compile out
cranelift in the future.
This commit also significantly refactors `CodeMemory` since the way
unwinding information is handled is not much different from before.
Previously `CodeMemory` was suitable for incrementally adding more and
more functions to it, but nowadays a `CodeMemory` either lives per
module (in which case all functions are known up front) or it's created
once-per-`Func::new` with two trampolines. In both cases we know all
functions up front so the functionality of incrementally adding more and
more segments is no longer needed. This commit removes the ability to
add a function-at-a-time in `CodeMemory` and instead it can now only
load objects in their entirety. A small helper function is added to
build a small object file for trampolines in `Func::new` to handle
allocation there.
Finally, this commit also folds the `wasmtime-obj` crate directly into
the `wasmtime-cranelift` crate and its builder structure to be more
amenable to this strategy of managing unwinding tables.
It is not intentional to have any real functional change as a result of
this commit. This might accelerate loading a module from cache slightly
since less work is needed to manage the unwinding information, but
that's just a side benefit from the main goal of this commit which is to
remove the dependence on cranelift unwinding information being available
at runtime.
* Remove isa reexport from wasmtime-environ
* Trim down reexports of `cranelift-codegen`
Remove everything non-essential so that only the bits which will need to
be refactored out of cranelift remain.
* Fix debug tests
* Review comments
This commit started off by deleting the `cranelift_codegen::settings`
reexport in the `wasmtime-environ` crate and then basically played
whack-a-mole until everything compiled again. The main result of this is
that the `wasmtime-*` family of crates have generally less of a
dependency on the `TargetIsa` trait and type from Cranelift. While the
dependency isn't entirely severed yet this is at least a significant
start.
This commit is intended to be largely refactorings, no functional
changes are intended here. The refactorings are:
* A `CompilerBuilder` trait has been added to `wasmtime_environ` which
server as an abstraction used to create compilers and configure them
in a uniform fashion. The `wasmtime::Config` type now uses this
instead of cranelift-specific settings. The `wasmtime-jit` crate
exports the ability to create a compiler builder from a
`CompilationStrategy`, which only works for Cranelift right now. In a
cranelift-less build of Wasmtime this is expected to return a trait
object that fails all requests to compile.
* The `Compiler` trait in the `wasmtime_environ` crate has been souped
up with a number of methods that Wasmtime and other crates needed.
* The `wasmtime-debug` crate is now moved entirely behind the
`wasmtime-cranelift` crate.
* The `wasmtime-cranelift` crate is now only depended on by the
`wasmtime-jit` crate.
* Wasm types in `cranelift-wasm` no longer contain their IR type,
instead they only contain the `WasmType`. This is required to get
everything to align correctly but will also be required in a future
refactoring where the types used by `cranelift-wasm` will be extracted
to a separate crate.
* I moved around a fair bit of code in `wasmtime-cranelift`.
* Some gdb-specific jit-specific code has moved from `wasmtime-debug` to
`wasmtime-jit`.
* Move all trampoline compilation to `wasmtime-cranelift`
This commit moves compilation of all the trampolines used in wasmtime
behind the `Compiler` trait object to live in `wasmtime-cranelift`. The
long-term goal of this is to enable depending on cranelift *only* from
the `wasmtime-cranelift` crate, so by moving these dependencies we
should make that a little more flexible.
* Fix windows build
* Remove `once-cell` dependency.
* Remove function address `BTreeMap` from `CompiledModule` in favor of binary
searching finished functions directly.
* Use `with_capacity` when populating `CompiledModule` finished functions and
trampolines.
This commit adds a `compile` command to the Wasmtime CLI.
The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules.
With the `all-arch` feature enabled, AOT compilation can be performed for
non-native architectures (i.e. cross-compilation).
The `Module::compile` method has been added to perform AOT compilation.
A few of the CLI flags relating to "on by default" Wasm features have been
changed to be "--disable-XYZ" flags.
A simple example of using the `wasmtime compile` command:
```text
$ wasmtime compile input.wasm
$ wasmtime input.cwasm
```
* More use of `anyhow`.
* Change `make_accessible` into `protect_linear_memory` to better demonstrate
what it is used for; this will make the uffd implementation make a little
more sense.
* Remove `create_memory_map` in favor of just creating the `Mmap` instances in
the pooling allocator. This also removes the need for `MAP_NORESERVE` in the
uffd implementation.
* Moar comments.
* Remove `BasePointerIterator` in favor of `impl Iterator`.
* The uffd implementation now only monitors linear memory pages and will only
receive faults on pages that could potentially be accessible and never on a
statically known guard page.
* Stop allocating memory or table pools if the maximum limit of the memory or
table is 0.
I don't think this has happened in awhile but I've run a `cargo update`
as well as trimming some of the duplicate/older dependencies in
`Cargo.lock` by updating some of our immediate dependencies as well.
* Refactor where results of compilation are stored
This commit refactors the internals of compilation in Wasmtime to change
where results of individual function compilation are stored. Previously
compilation resulted in many maps being returned, and compilation
results generally held all these maps together. This commit instead
switches this to have all metadata stored in a `CompiledFunction`
instead of having a separate map for each item that can be stored.
The motivation for this is primarily to help out with future
module-linking-related PRs. What exactly "module level" is depends on
how we interpret modules and how many modules are in play, so it's a bit
easier for operations in wasmtime to work at the function level where
possible. This means that we don't have to pass around multiple
different maps and a function index, but instead just one map or just
one entry representing a compiled function.
Additionally this change updates where the parallelism of compilation
happens, pushing it into `wasmtime-jit` instead of `wasmtime-environ`.
This is another goal where `wasmtime-jit` will have more knowledge about
module-level pieces with module linking in play. User-facing-wise this
should be the same in terms of parallel compilation, though.
The ultimate goal of this refactoring is to make it easier for the
results of compilation to actually be a set of wasm modules. This means
we won't be able to have a map-per-metadata where the primary key is the
function index, because there will be many modules within one "object
file".
* Don't clear out fields, just don't store them
Persist a smaller set of fields in `CompilationArtifacts` instead of
trying to clear fields out and dynamically not accessing them.
- Create the ELF image from Compilation
- Create CodeMemory from the ELF image
- Link using ELF image
- Remove creation of GDB JIT images from crates/debug
- Move make_trampoline from compiler.rs
* Refactor how relocs are stored and handled
* refactor CompiledModule::instantiate and link_module
* Refactor DWARF creation: split generation and serialization
* Separate DWARF data transform from instantiation
* rm LinkContext
* Moves CodeMemory, VMInterrupts and SignatureRegistry from Compiler
* CompiledModule holds CodeMemory and GdbJitImageRegistration
* Store keeps track of its JIT code
* Makes "jit_int.rs" stuff Send+Sync
* Adds the threads example.
This commit fixes an issue in Wasmtime where Wasmtime would accidentally
"handle" non-wasm segfaults while executing host imports of wasm
modules. If a host import segfaulted then Wasmtime would recognize that
wasm code is on the stack, so it'd longjmp out of the wasm code. This
papers over real bugs though in host code and erroneously classified
segfaults as wasm traps.
The fix here was to add a check to our wasm signal handler for if the
faulting address falls in JIT code itself. Actually threading through
all the right information for that check to happen is a bit tricky,
though, so this involved some refactoring:
* A closure parameter to `catch_traps` was added. This closure is
responsible for classifying addresses as whether or not they fall in
JIT code. Anything returning `false` means that the trap won't get
handled and we'll forward to the next signal handler.
* To avoid passing tons of context all over the place, the start
function is now no longer automatically invoked by `InstanceHandle`.
This avoids the need for passing all sorts of trap-handling contextual
information like the maximum stack size and "is this a jit address"
closure. Instead creators of `InstanceHandle` (like wasmtime) are now
responsible for invoking the start function.
* To avoid excessive use of `transmute` with lifetimes since the
traphandler state now has a lifetime the per-instance custom signal
handler is now replaced with a per-store custom signal handler. I'm
not entirely certain the purpose of the custom signal handler, though,
so I'd look for feedback on this part.
A new test has been added which ensures that if a host function
segfaults we don't accidentally try to handle it, and instead we
correctly report the segfault.
This commit makes the following changes to unwind information generation in
Cranelift:
* Remove frame layout change implementation in favor of processing the prologue
and epilogue instructions when unwind information is requested. This also
means this work is no longer performed for Windows, which didn't utilize it.
It also helps simplify the prologue and epilogue generation code.
* Remove the unwind sink implementation that required each unwind information
to be represented in final form. For FDEs, this meant writing a
complete frame table per function, which wastes 20 bytes or so for each
function with duplicate CIEs. This also enables Cranelift users to collect the
unwind information and write it as a single frame table.
* For System V calling convention, the unwind information is no longer stored
in code memory (it's only a requirement for Windows ABI to do so). This allows
for more compact code memory for modules with a lot of functions.
* Deletes some duplicate code relating to frame table generation. Users can
now simply use gimli to create a frame table from each function's unwind
information.
Fixes#1181.
* Enable jitdump profiling support by default
This the result of some of the investigation I was doing for #1017. I've
done a number of refactorings here which culminated in a number of
changes that all amount to what I think should result in jitdump support being
enabled by default:
* Pass in a list of finished functions instead of just a range to
ensure that we're emitting jit dump data for a specific module rather
than a whole `CodeMemory` which may have other modules.
* Define `ProfilingStrategy` in the `wasmtime` crate to have everything
locally-defined
* Add support to the C API to enable profiling
* Documentation added for profiling with jitdump to the book
* Split out supported/unsupported files in `jitdump.rs` to avoid having
lots of `#[cfg]`.
* Make dependencies optional that are only used for `jitdump`.
* Move initialization up-front to `JitDumpAgent::new()` instead of
deferring it to the first module.
* Pass around `Arc<dyn ProfilingAgent>` instead of
`Option<Arc<Mutex<Box<dyn ProfilingAgent>>>>`
The `jitdump` Cargo feature is now enabled by default which means that
our published binaries, C API artifacts, and crates will support
profiling at runtime by default. The support I don't think is fully
fleshed out and working but I think it's probably in a good enough spot
we can get users playing around with it!
* Refactor wasmtime_runtime::Export
Instead of an enumeration with variants that have data fields have an
enumeration where each variant has a struct, and each struct has the
data fields. This allows us to store the structs in the `wasmtime` API
and avoid lots of `panic!` calls and various extraneous matches.
* Pre-generate trampoline functions
The `wasmtime` crate supports calling arbitrary function signatures in
wasm code, and to do this it generates "trampoline functions" which have
a known ABI that then internally convert to a particular signature's ABI
and call it. These trampoline functions are currently generated
on-the-fly and are cached in the global `Store` structure. This,
however, is suboptimal for a few reasons:
* Due to how code memory is managed each trampoline resides in its own
64kb allocation of memory. This means if you have N trampolines you're
using N * 64kb of memory, which is quite a lot of overhead!
* Trampolines are never free'd, even if the referencing module goes
away. This is similar to #925.
* Trampolines are a source of shared state which prevents `Store` from
being easily thread safe.
This commit refactors how trampolines are managed inside of the
`wasmtime` crate and jit/runtime internals. All trampolines are now
allocated in the same pass of `CodeMemory` that the main module is
allocated into. A trampoline is generated per-signature in a module as
well, instead of per-function. This cache of trampolines is stored
directly inside of an `Instance`. Trampolines are stored based on
`VMSharedSignatureIndex` so they can be looked up from the internals of
the `ExportFunction` value.
The `Func` API has been updated with various bits and pieces to ensure
the right trampolines are registered in the right places. Overall this
should ensure that all trampolines necessary are generated up-front
rather than lazily. This allows us to remove the trampoline cache from
the `Compiler` type, and move one step closer to making `Compiler`
threadsafe for usage across multiple threads.
Note that as one small caveat the `Func::wrap*` family of functions
don't need to generate a trampoline at runtime, they actually generate
the trampoline at compile time which gets passed in.
Also in addition to shuffling a lot of code around this fixes one minor
bug found in `code_memory.rs`, where `self.position` was loaded before
allocation, but the allocation may push a new chunk which would cause
`self.position` to be zero instead.
* Pass the `SignatureRegistry` as an argument to where it's needed.
This avoids the need for storing it in an `Arc`.
* Ignore tramoplines for functions with lots of arguments
Co-authored-by: Dan Gohman <sunfish@mozilla.com>
Patch adds support for the perf jitdump file specification.
With this patch it should be possible to see profile data for code
generated and maped at runtime. Specifically the patch adds support
for the JIT_CODE_LOAD and the JIT_DEBUG_INFO record as described in
the specification. Dumping jitfiles is enabled with the --jitdump
flag. When the -g flag is also used there is an attempt to dump file
and line number information where this option would be most useful
when the WASM file already includes DWARF debug information.
The generation of the jitdump files has been tested on only a few wasm
files. This patch is expected to be useful/serviceable where currently
there is no means for jit profiling, but future patches may benefit
line mapping and add support for additional jitdump record types.
Usage Example:
Record
sudo perf record -k 1 -e instructions:u target/debug/wasmtime -g
--jitdump test.wasm
Combine
sudo perf inject -v -j -i perf.data -o perf.jit.data
Report
sudo perf report -i perf.jit.data -F+period,srcline
* Update `CodeMemory` to be `Send + Sync`
This commit updates the `CodeMemory` type in wasmtime to be both `Send`
and `Sync` by updating the implementation of `Mmap` to not store raw
pointers. This avoids the need for an `unsafe impl` and leaves the
unsafety as it is currently.
* Run rustfmt
* Rename `offset` to `ptr`
* Remove the need for `HostRef<Module>`
This commit continues previous work and also #708 by removing the need
to use `HostRef<Module>` in the API of the `wasmtime` crate. The API
changes performed here are:
* The `Module` type is now itself internally reference counted.
* The `Module::store` function now returns the `Store` that was used to
create a `Module`
* Documentation for `Module` and its methods have been expanded.
* Fix compliation of test programs harness
* Fix the python extension
* Update `CodeMemory` to be `Send + Sync`
This commit updates the `CodeMemory` type in wasmtime to be both `Send`
and `Sync` by updating the implementation of `Mmap` to not store raw
pointers. This avoids the need for an `unsafe impl` and leaves the
unsafety as it is currently.
* Fix a typo
* Migrate back to `std::` stylistically
This commit moves away from idioms such as `alloc::` and `core::` as
imports of standard data structures and types. Instead it migrates all
crates to uniformly use `std::` for importing standard data structures
and types. This also removes the `std` and `core` features from all
crates to and removes any conditional checking for `feature = "std"`
All of this support was previously added in #407 in an effort to make
wasmtime/cranelift "`no_std` compatible". Unfortunately though this
change comes at a cost:
* The usage of `alloc` and `core` isn't idiomatic. Especially trying to
dual between types like `HashMap` from `std` as well as from
`hashbrown` causes imports to be surprising in some cases.
* Unfortunately there was no CI check that crates were `no_std`, so none
of them actually were. Many crates still imported from `std` or
depended on crates that used `std`.
It's important to note, however, that **this does not mean that wasmtime
will not run in embedded environments**. The style of the code today and
idioms aren't ready in Rust to support this degree of multiplexing and
makes it somewhat difficult to keep up with the style of `wasmtime`.
Instead it's intended that embedded runtime support will be added as
necessary. Currently only `std` is necessary to build `wasmtime`, and
platforms that natively need to execute `wasmtime` will need to use a
Rust target that supports `std`. Note though that not all of `std` needs
to be supported, but instead much of it could be configured off to
return errors, and `wasmtime` would be configured to gracefully handle
errors.
The goal of this PR is to move `wasmtime` back to idiomatic usage of
features/`std`/imports/etc and help development in the short-term.
Long-term when platform concerns arise (if any) they can be addressed by
moving back to `no_std` crates (but fixing the issues mentioned above)
or ensuring that the target in Rust has `std` available.
* Start filling out platform support doc