* Wasmtime+Cranelift: strip out some dead x86-32 code.
I was recently pointed to fastly/Viceroy#200 where it seems some folks
are trying to compile Wasmtime (via Viceroy) for Windows x86-32 and the
failures may not be loud enough. I've tried to reproduce this
cross-compiling to i686-pc-windows-gnu from Linux and hit build failures
(as expected) in several places. Nevertheless, while trying to discern
what others may be attempting, I noticed some dead x86-32-specific code
in our repo, and figured it would be a good idea to clean this up.
Otherwise, it (i) sends some mixed messages -- "hey look, this codebase
does support x86-32" -- and (ii) keeps untested code around, which is
generally not great.
This PR removes x86-32-specific cases in traphandlers and unwind code,
and Cranelift's native feature detection. It adds helpful compile-error
messages in a few cases. If we ever support x86-32 (contributors
welcome! The big missing piece is Cranelift support; see #1980), these
compile errors and git history should be enough to recover any knowledge
we are now encoding in the source.
I left the x86-32 support in `wasmtime-fiber` alone because that seems
like a bit of a special case -- foundation library, separate from the
rest of Wasmtime, with specific care to provide a (presumably working)
full 32-bit version.
* Remove some extraneous compile_error!s, already covered by others.
* Pull `Module` out of `ModuleTextBuilder`
This commit is the first in what will likely be a number towards
preparing for serializing a compiled component to bytes, a precompiled
artifact. To that end my rough plan is to merge all of the compiled
artifacts for a component into one large object file instead of having
lots of separate object files and lots of separate mmaps to manage. To
that end I plan on eventually using `ModuleTextBuilder` to build one
large text section for all core wasm modules and trampolines, meaning
that `ModuleTextBuilder` is no longer specific to one module. I've
extracted out functionality such as function name calculation as well as
relocation resolving (now a closure passed in) in preparation for this.
For now this just keeps tests passing, and the trajectory for this
should become more clear over the following commits.
* Remove component-specific object emission
This commit removes the `ComponentCompiler::emit_obj` function in favor
of `Compiler::emit_obj`, now renamed `append_code`. This involved
significantly refactoring code emission to take a flat list of functions
into `append_code` and the caller is responsible for weaving together
various "families" of functions and un-weaving them afterwards.
* Consolidate ELF parsing in `CodeMemory`
This commit moves the ELF file parsing and section iteration from
`CompiledModule` into `CodeMemory` so one location keeps track of
section ranges and such. This is in preparation for sharing much of this
code with components which needs all the same sections to get tracked
but won't be using `CompiledModule`. A small side benefit from this is
that the section parsing done in `CodeMemory` and `CompiledModule` is no
longer duplicated.
* Remove separately tracked traps in components
Previously components would generate an "always trapping" function
and the metadata around which pc was allowed to trap was handled
manually for components. With recent refactorings the Wasmtime-standard
trap section in object files is now being generated for components as
well which means that can be reused instead of custom-tracking this
metadata. This commit removes the manual tracking for the `always_trap`
functions and plumbs the necessary bits around to make components look
more like modules.
* Remove a now-unnecessary `Arc` in `Module`
Not expected to have any measurable impact on performance, but
complexity-wise this should make it a bit easier to understand the
internals since there's no longer any need to store this somewhere else
than its owner's location.
* Merge compilation artifacts of components
This commit is a large refactoring of the component compilation process
to produce a single artifact instead of multiple binary artifacts. The
core wasm compilation process is refactored as well to share as much
code as necessary with the component compilation process.
This method of representing a compiled component necessitated a few
medium-sized changes internally within Wasmtime:
* A new data structure was created, `CodeObject`, which represents
metadata about a single compiled artifact. This is then stored as an
`Arc` within a component and a module. For `Module` this is always
uniquely owned and represents a shuffling around of data from one
owner to another. For a `Component`, however, this is shared amongst
all loaded modules and the top-level component.
* The "module registry" which is used for symbolicating backtraces and
for trap information has been updated to account for a single region
of loaded code holding possibly multiple modules. This involved adding
a second-level `BTreeMap` for now. This will likely slow down
instantiation slightly but if it poses an issue in the future this
should be able to be represented with a more clever data structure.
This commit additionally solves a number of longstanding issues with
components such as compiling only one host-to-wasm trampoline per
signature instead of possibly once-per-module. Additionally the
`SignatureCollection` registration now happens once-per-component
instead of once-per-module-within-a-component.
* Fix compile errors from prior commits
* Support AOT-compiling components
This commit adds support for AOT-compiled components in the same manner
as `Module`, specifically adding:
* `Engine::precompile_component`
* `Component::serialize`
* `Component::deserialize`
* `Component::deserialize_file`
Internally the support for components looks quite similar to `Module`.
All the prior commits to this made adding the support here
(unsurprisingly) easy. Components are represented as a single object
file as are modules, and the functions for each module are all piled
into the same object file next to each other (as are areas such as data
sections). Support was also added here to quickly differentiate compiled
components vs compiled modules via the `e_flags` field in the ELF
header.
* Prevent serializing exported modules on components
The current representation of a module within a component means that the
implementation of `Module::serialize` will not work if the module is
exported from a component. The reason for this is that `serialize`
doesn't actually do anything and simply returns the underlying mmap as a
list of bytes. The mmap, however, has `.wasmtime.info` describing
component metadata as opposed to this module's metadata. While rewriting
this section could be implemented it's not so easy to do so and is
otherwise seen as not super important of a feature right now anyway.
* Fix windows build
* Fix an unused function warning
* Update crates/environ/src/compilation.rs
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Refactor metadata storage in AOT artifacts
This commit is a reorganization of how metadata is stored in Wasmtime's
compiled artifacts. Currently Wasmtime's ELF artifacts have data
appended after them to contain metadata about the `Engine` as well as
type information for the module itself. This extra data at the end of
the file is ignored by ELF-related utilities generally and is assembled
during the module serialization process.
In working on AOT-compiling components, though, I've discovered a number
of issues with this:
* Primarily it's possible to mistakenly change an artifact if it's
deserialized and then serialized again. This issue is probably
theoretical but the deserialized artifact records the `Engine`
configuration at time of creation but when re-serializing that it
serializes the current `Engine` state, not the original `Engine`
state.
* Additionally the serialization strategy here is tightly coupled to
`Module` and its serialization format. While this makes sense it is
not conducive for future refactorings to use a similar serialization
format for components. The engine metadata, for example, does not
necessarily need to be tied up with type information.
* The storage for this extra metadata is a bit wonky by shoving it at
the end of the ELF file. The original reason for this was to have a
compiled artifact be multiple objects concatenated with each other to
support serializing module-linking-using modules. Module linking is no
longer a thing and I have since decided that for the component model
all compilation artifacts will go into one object file to assist
debugability. This means that the extra stick-it-at-the-end is no
longer necessary.
To solve these issues this commit splits up the
`module/serialization.rs` file in two, mostly moving the logic to
`engine/serialization.rs`. The engine serialization logic now handles
everything related to `Engine` compatibility such as targets, compiler
flags, wasm features, etc. The module serialization logic is now
exclusively interested in type information.
The engine metadata and serialized type information additionally live in
sections of the final file now instead of at the end. This means that
there are three primary `bincode`-encoded sections that are parsed on
deserializing a file:
1. The `Engine`-specific metadata. This will be the same for both
modules and components.
2. The `CompiledModuleInfo` structure. For core wasm there's just one of
these but for the component model there will be multiple, one per
core wasm module.
3. The type information. For core wasm this is a `ModuleTypes` but for a
component this will be a `ComponentTypes`.
No true functional change is expected from this commit. Binary artifacts
might get inflated by a small handful of bytes due to using ELF sections
to represent this now.
A related change I made during this commit as well was the plumbing of
the `is_branch_protection_enabled` flag. This is technically
`Engine`-level metadata but I didn't want to plumb it all over the place
as was done now, so instead a new section was added to the final binary
just for this bti information. This means that it no longer needs to be
a parameter to `CodeMemory::publish` and additionally is more amenable
to a `Component`-is-just-one-object world where no single module owns
this piece of metadata.
* Exclude some functions in a cranelift-less build
* Reduce calls to `section_by_name` loading artifacts
Data is stored in binary artifacts as an ELF object and when loading an
artifact lots of calls are made to the `object` crate's
`section_by_name` method which ends up doing a linear search through the
list of sections for a particular name. To avoid doing this linear
search every time I've replaced this with one loop over the sections of
an object at the beginning when an object is loaded, or at least most of
the calls with this loop.
This isn't really a pressing issue today but some upcoming work I hope
to do for AOT-compiled components will be adding more sections to the
artifact so it seems best to keep the number of linear searches small
and avoided if possible.
* Fix an off-by-one
* cranelift: Add FlushInstructionCache for AArch64 on Windows
This was previously done on #3426 for linux.
* wasmtime: Add FlushInstructionCache for AArch64 on Windows
This was previously done on #3426 for linux.
* cranelift: Add MemoryUse flag to JIT Memory Manager
This allows us to keep the icache flushing code self-contained and not leak implementation details.
This also changes the windows icache flushing code to only flush pages that were previously unflushed.
* Add jit-icache-coherence crate
* cranelift: Use `jit-icache-coherence`
* wasmtime: Use `jit-icache-coherence`
* jit-icache-coherence: Make rustix feature additive
Mutually exclusive features cause issues.
* wasmtime: Remove rustix from wasmtime-jit
We now use it via jit-icache-coherence
* Rename wasmtime-jit-icache-coherency crate
* Use cfg-if in wasmtime-jit-icache-coherency crate
* Use inline instead of inline(always)
* Add unsafe marker to clear_cache
* Conditionally compile all rustix operations
membarrier does not exist on MacOS
* Publish `wasmtime-jit-icache-coherence`
* Remove explicit windows check
This is implied by the target_os = "windows" above
* cranelift: Remove len != 0 check
This is redundant as it is done in non_protected_allocations_iter
* Comment cleanups
Thanks @akirilov-arm!
* Make clear_cache safe
* Rename pipeline_flush to pipeline_flush_mt
* Revert "Make clear_cache safe"
This reverts commit 21165d81c9030ed9b291a1021a367214d2942c90.
* More docs!
* Fix pipeline_flush reference on clear_cache
* Update more docs!
* Move pipeline flush after `mprotect` calls
Technically the `clear_cache` operation is a lie in AArch64, so move the pipeline flush after the `mprotect` calls so that it benefits from the implicit cache cleaning done by it.
* wasmtime: Remove rustix backend from icache crate
* wasmtime: Use libc for macos
* wasmtime: Flush icache on all arch's for windows
* wasmtime: Add flags to membarrier call
Minor thing I noticed from #4990 but I stylistically prefer to keep the
`mod foo;` definitions canonicalized to one location to emphasize how
multiple targets can use the same definition.
* Make wasmtime build for windows-aarch64
* Add check for win arm64 build.
* Fix checks for winarm64 key in workflows.
* Add target in windows arm64 build.
* Add tracking issue for Windows ARM64 trap handling
* Leverage Cargo's workspace inheritance feature
This commit is an attempt to reduce the complexity of the Cargo
manifests in this repository with Cargo's workspace-inheritance feature
becoming stable in Rust 1.64.0. This feature allows specifying fields in
the root workspace `Cargo.toml` which are then reused throughout the
workspace. For example this PR shares definitions such as:
* All of the Wasmtime-family of crates now use `version.workspace =
true` to have a single location which defines the version number.
* All crates use `edition.workspace = true` to have one default edition
for the entire workspace.
* Common dependencies are listed in `[workspace.dependencies]` to avoid
typing the same version number in a lot of different places (e.g. the
`wasmparser = "0.89.0"` is now in just one spot.
Currently the workspace-inheritance feature doesn't allow having two
different versions to inherit, so all of the Cranelift-family of crates
still manually specify their version. The inter-crate dependencies,
however, are shared amongst the root workspace.
This feature can be seen as a method of "preprocessing" of sorts for
Cargo manifests. This will help us develop Wasmtime but shouldn't have
any actual impact on the published artifacts -- everything's dependency
lists are still the same.
* Fix wasi-crypto tests
* Update to cap-std 0.26.
This is primarily to pull in bytecodealliance/cap-std#271, the fix for #4936,
compilation on Rust nightly on Windows.
It also updates to rustix 0.35.10, to pull in bytecodealliance/rustix#403,
the fix for bytecodealliance/rustix#402, compilation on newer versions of
the libc crate, which changed a public function from `unsafe` to safe.
Fixes#4936.
* Update the system-interface audit for 0.23.
* Update the libc supply-chain config version.
* Initial forward-edge CFI implementation
Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.
Copyright (c) 2022, Arm Limited.
* Refactor `from_artifacts` to avoid second `make_executable` (#1)
This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.
* Address the code review feedback
Copyright (c) 2022, Arm Limited.
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
This commit replaces #4869 and represents the actual version bump that
should have happened had I remembered to bump the in-tree version of
Wasmtime to 1.0.0 prior to the branch-cut date. Alas!
As described in #4523, the `ittapi` dependency necessary for Wasmtime's
VTune support does not compile on `aarch64-linux-android`. There are
several incompatible parts here: though `ittapi` supports all OS
combinations that Wasmtime does and builds on all CPU targets, VTune is
not primarily intended for aarch64 profiling and `linux-android` is not
a high priority platform for the library. We could conditionally depend
on `ittapi` for Wasmtime's supported OS combinations, but I think a
better answer is to limit it to x86_64 targets, since this more clearly
shows why the `ittapi`/VTune support is present and also fixes#4523.
This commit removes Wasmtime's dependency on the `region` crate. The
motivation for this came about when I was updating dependencies and saw
that `region` had a new major version at 3.0.0 as opposed to our
currently used 2.3 track. In reviewing the use cases of `region` within
Wasmtime I found two trends in particular which motivated this commit:
* Some unix-specific areas of `wasmtime_runtime` use
`rustix::mm::mprotect` instead of `region::protect` already. This
means that the usage of `region::protect` for changing virtual memory
protections was already inconsistent.
* Many uses of `region::protect` were already in unix-specific regions
which could make use of `rustix`.
Overall I opted to remove the dependency on the `region` crate to avoid
chasing its versions over time. Unix-specific changes of protections
were easily changed to `rustix::mm::mprotect`. There were two locations
where a windows/unix split is now required and I subjectively ruled
"that seems ok". Finally removing `region` also meant that the "what is
the current page size" query needed to be inlined into
`wasmtime_runtime`, which I have also subjectively ruled "that seems
fine".
Finally one final refactoring here was that the `unix.rs` and `linux.rs`
split for the pooling allocator was merged. These two files already only
differed in one function so I slapped a `cfg_if!` in there to help
reduce the duplication.
* Migrate from `winapi` to `windows-sys`
I believe that Microsoft itself is supporting the development of
`windows-sys` and it's also used by `cap-std` now so this switches
Wasmtime's dependencies on Windows APIs from the `winapi` crate to the
`windows-sys` crate. We still have `winapi` in our dependency graph but
that may get phased out over time.
* Make windows-sys a target-specific dependency
* Change how unwind information is stored on Windows
Unwind information on Windows is stored in two separate locations. The
first location is the unwind information itself which corresponds to
`UNWIND_INFO`. The second location is a list of `RUNTIME_INFO`
structures which point to function bodes and `UNWIND_INFO` structures.
Currently in Wasmtime the `UNWIND_INFO` structures are stored just after
functions themselves with a somewhat cryptic comment indicating that
Windows prefers this (I'm unsure as to the provenance of this comment).
The `RUNTIME_INFO` data is then stored in a separate section which has
the custom name of `_wasmtime_winx64_unwind`.
After my recent foray into trying to debug windows-2022 bad unwind
information again I realized though that Windows actually has official
sections for these two unwind information items. The `.xdata` section is
used to store the `UNWIND_INFO` structures and the `.pdata` section
stores the `RUNTIME_INFO` list. To try to be somewhat idiomatic and
perhaps one day even hook into standard Windows debugging tools I went
ahead and refactored how our unwind information is stored to match this.
Perhaps the main benefit of this is that it reduces the size of the
read/execute section of the binary. Previously the unwind information
was executable since it was stored in the `.text` section, but
unnecessarily so. Now it's in a read-only section which is in theory a
small amount of hardening.
Otherwise though I don't think this will really help all that much to
hook up in to standard debugging tools like `objdump` because it's all
still stored in an ELF file rather than a COFF file.
* Review comments
This updates to rustix 0.35.6, and updates wasi-common to use cap-std 0.25 and
windows-sys (instead of winapi).
Changes include:
- Better error code mappings on Windows.
- Fixes undefined references to `utimensat` on Darwin.
- Fixes undefined references to `preadv64` and `pwritev64` on Android.
- Updates to io-lifetimes 0.7, which matches the io_safety API in Rust.
- y2038 bug fixes for 32-bit platforms
* Remove the `Paged` memory initialization variant
This commit simplifies the `MemoryInitialization` enum by removing the
`Paged` variant. The `Paged` variant was originally added for uffd, but
that support has now been removed in #4040. This is no longer necessary
but is still used as an intermediate step of becoming a `Static` variant
of initialized memory (which copy-on-write uses). As a result this
commit largely modifies the static initialization of memory steps and
folds the two methods together.
* Apply suggestions from code review
Co-authored-by: Peter Huene <peter@huene.dev>
Co-authored-by: Peter Huene <peter@huene.dev>
* Run a `cargo update` over our dependencies
This'll notably fix a `cargo audit` error where we have a pinned version
of the `regex` crate which has a CVE assigned to it.
* Update to `object` and `hashbrown` crates
Prune some duplicate versions showing up from the previous `cargo update`
* Reduce contention on the global module rwlock
This commit intendes to close#4025 by reducing contention on the global
rwlock Wasmtime has for module information during instantiation and
dropping a store. Currently registration of a module into this global
map happens during instantiation, but this can be a hot path as
embeddings may want to, in parallel, instantiate modules.
Instead this switches to a strategy of inserting into the global module
map when a `Module` is created and then removing it from the map when
the `Module` is dropped. Registration in a `Store` now preserves the
entire `Module` within the store as opposed to trying to only save it
piecemeal. In reality the only piece that wasn't saved within a store
was the `TypeTables` which was pretty inconsequential for core wasm
modules anyway.
This means that instantiation should now clone a singluar `Arc` into a
`Store` per `Module` (previously it cloned two) with zero managemnt on
the global rwlock as that happened at `Module` creation time.
Additionally dropping a `Store` again involves zero rwlock management
and only a single `Arc` drop per-instantiated module (previously it was
two).
In the process of doing this I also went ahead and removed the
`Module::new_with_name` API. This has been difficult to support
historically with various variations on the internals of `ModuleInner`
because it involves mutating a `Module` after it's been created. My hope
is that this API is pretty rarely used and/or isn't super important, so
it's ok to remove.
Finally this change removes some internal `Arc` layerings that are no
longer necessary, attempting to use either `T` or `&T` where possible
without dealing with the overhead of an `Arc`.
Closes#4025
* Move back to a `BTreeMap` in `ModuleRegistry`
Relevant to Wasmtime, this fixes undefined references to `utimensat` and
`futimens` on macOS 10.12 and earlier. See bytecodealliance/rustix#157
for details.
It also contains a fix for s390x which isn't currently needed by Wasmtime
itself, but which is needed to make rustix's own testsuite pass on s390x,
which helps people packaging rustix for use in Wasmtime. See
bytecodealliance/rustix#277 for details.
* Upgrade all crates to the Rust 2021 edition
I've personally started using the new format strings for things like
`panic!("some message {foo}")` or similar and have been upgrading crates
on a case-by-case basis, but I think it probably makes more sense to go
ahead and blanket upgrade everything so 2021 features are always
available.
* Fix compile of the C API
* Fix a warning
* Fix another warning
* Bump to 0.36.0
* Add a two-week delay to Wasmtime's release process
This commit is a proposal to update Wasmtime's release process with a
two-week delay from branching a release until it's actually officially
released. We've had two issues lately that came up which led to this proposal:
* In #3915 it was realized that changes just before the 0.35.0 release
weren't enough for an embedding use case, but the PR didn't meet the
expectations for a full patch release.
* At Fastly we were about to start rolling out a new version of Wasmtime
when over the weekend the fuzz bug #3951 was found. This led to the
desire internally to have a "must have been fuzzed for this long"
period of time for Wasmtime changes which we felt were better
reflected in the release process itself rather than something about
Fastly's own integration with Wasmtime.
This commit updates the automation for releases to unconditionally
create a `release-X.Y.Z` branch on the 5th of every month. The actual
release from this branch is then performed on the 20th of every month,
roughly two weeks later. This should provide a period of time to ensure
that all changes in a release are fuzzed for at least two weeks and
avoid any further surprises. This should also help with any last-minute
changes made just before a release if they need tweaking since
backporting to a not-yet-released branch is much easier.
Overall there are some new properties about Wasmtime with this proposal
as well:
* The `main` branch will always have a section in `RELEASES.md` which is
listed as "Unreleased" for us to fill out.
* The `main` branch will always be a version ahead of the latest
release. For example it will be bump pre-emptively as part of the
release process on the 5th where if `release-2.0.0` was created then
the `main` branch will have 3.0.0 Wasmtime.
* Dates for major versions are automatically updated in the
`RELEASES.md` notes.
The associated documentation for our release process is updated and the
various scripts should all be updated now as well with this commit.
* Add notes on a security patch
* Clarify security fixes shouldn't be previewed early on CI
* Remove duplicate `TypeTables` type
This was once needed historically but it is no longer needed.
* Make the internals of `TypeTables` private
Instead of reaching internally for the `wasm_signatures` map an `Index`
implementation now exists to indirect accesses through the type of the
index being accessed. For the component model this table of types will
grow a number of other tables and this'll assist in consuming sites not
having to worry so much about which map they're reaching into.
* Update to rustix 0.33.5, to fix a link error on Android
This updates to rustix 0.33.5, which includes bytecodealliance/rustix#258,
which fixes bytecodealliance/rustix#256, a link error on Android.
Fixes#3965.
* Bump the rustix versions in the Cargo.toml files too.
* Remove the module linking implementation in Wasmtime
This commit removes the experimental implementation of the module
linking WebAssembly proposal from Wasmtime. The module linking is no
longer intended for core WebAssembly but is instead incorporated into
the component model now at this point. This means that very large parts
of Wasmtime's implementation of module linking are no longer applicable
and would change greatly with an implementation of the component model.
The main purpose of this is to remove Wasmtime's reliance on the support
for module-linking in `wasmparser` and tooling crates. With this
reliance removed we can move over to the `component-model` branch of
`wasmparser` and use the updated support for the component model.
Additionally given the trajectory of the component model proposal the
embedding API of Wasmtime will not look like what it looks like today
for WebAssembly. For example the core wasm `Instance` will not change
and instead a `Component` is likely to be added instead.
Some more rationale for this is in #3941, but the basic idea is that I
feel that it's not going to be viable to develop support for the
component model on a non-`main` branch of Wasmtime. Additionaly I don't
think it's viable, for the same reasons as `wasm-tools`, to support the
old module linking proposal and the new component model at the same
time.
This commit takes a moment to not only delete the existing module
linking implementation but some abstractions are also simplified. For
example module serialization is a bit simpler that there's only one
module. Additionally instantiation is much simpler since the only
initializer we have to deal with are imports and nothing else.
Closes#3941
* Fix doc link
* Update comments
This commit exposes some various details and config options for having
finer-grain control over mlock-ing the memory of modules. This amounts
to three different changes being present in this commit:
* A new `Module::image_range` API is added to expose the range in host
memory of where the compiled image resides. This enables embedders to
make mlock-ing decisions independently of Wasmtime. Otherwise though
there's not too much useful that can be done with this range
information at this time.
* A new `Config::force_memory_init_memfd` option has been added. This
option is used to force the usage of `memfd_create` on Linux even when
the original module comes from a file on disk. With mlock-ing the main
purpose for Wasmtime is likely to be avoiding major page faults that
go back to disk, so this is another major source of avoiding page
faults by ensuring that the initialization contents of memory are
always in RAM.
* The `memory_images` field of a `Module` has gone back to being lazily
created on the first instantiation, effectively reverting #3914. This
enables embedders to defer the creation of the image to as late as
possible to allow modules to be created from precompiled images
without actually loading all the contents of the data segments from
disk immediately.
These changes are all somewhat low-level controls which aren't intended
to be generally used by embedders. If fine-grained control is desired
though it's hoped that these knobs provide what's necessary to be
achieved.
* fuzz: Fuzz padding between compiled functions
This commit hooks up the custom
`wasmtime_linkopt_padding_between_functions` configuration option to the
cranelift compiler into the fuzz configuration, enabling us to ensure
that randomly inserting a moderate amount of padding between functions
shouldn't tamper with any results.
* fuzz: Fuzz the `Config::generate_address_map` option
This commit adds fuzz configuration where `generate_address_map` is
either enabled or disabled, unlike how it's always enabled for fuzzing
today.
* Remove unnecessary handling of relocations
This commit removes a number of bits and pieces all related to handling
relocations in JIT code generated by Wasmtime. None of this is necessary
nowadays that the "old backend" has been removed (quite some time ago)
and relocations are no longer expected to be in the JIT code at all.
Additionally with the minimum x86_64 features required to run wasm code
it should be expected that no libcalls are required either for
Wasmtime-based JIT code.
* x64: enable VTune support by default
After significant work in the `ittapi-rs` crate, this dependency should
build without issue on Wasmtime's supported operating systems: Windows,
Linux, and macOS. The difference in the release binary is <20KB, so this
change makes `vtune` a default build feature. This change upgrades
`ittapi-rs` to v0.2.0 and updates the documentation.
* review: add configuration for defaults in more places
* review: remove OS conditional compilation, add architecture
* review: do not default vtune feature in wasmtime-jit
* Skip memfd creation with precompiled modules
This commit updates the memfd support internally to not actually use a
memfd if a compiled module originally came from disk via the
`wasmtime::Module::deserialize_file` API. In this situation we already
have a file descriptor open and there's no need to copy a module's heap
image to a new file descriptor.
To facilitate a new source of `mmap` the currently-memfd-specific-logic
of creating a heap image is generalized to a new form of
`MemoryInitialization` which is attempted for all modules at
module-compile-time. This means that the serialized artifact to disk
will have the memory image in its entirety waiting for us. Furthermore
the memory image is ensured to be padded and aligned carefully to the
target system's page size, notably meaning that the data section in the
final object file is page-aligned and the size of the data section is
also page aligned.
This means that when a precompiled module is mapped from disk we can
reuse the underlying `File` to mmap all initial memory images. This
means that the offset-within-the-memory-mapped-file can differ for
memfd-vs-not, but that's just another piece of state to track in the
memfd implementation.
In the limit this waters down the term "memfd" for this technique of
quickly initializing memory because we no longer use memfd
unconditionally (only when the backing file isn't available).
This does however open up an avenue in the future to porting this
support to other OSes because while `memfd_create` is Linux-specific
both macOS and Windows support mapping a file with copy-on-write. This
porting isn't done in this PR and is left for a future refactoring.
Closes#3758
* Enable "memfd" support on all unix systems
Cordon off the Linux-specific bits and enable the memfd support to
compile and run on platforms like macOS which have a Linux-like `mmap`.
This only works if a module is mapped from a precompiled module file on
disk, but that's better than not supporting it at all!
* Fix linux compile
* Use `Arc<File>` instead of `MmapVecFileBacking`
* Use a named struct instead of mysterious tuples
* Comment about unsafety in `Module::deserialize_file`
* Fix tests
* Fix uffd compile
* Always align data segments
No need to have conditional alignment since their sizes are all aligned
anyway
* Update comment in build.rs
* Use rustix, not `region`
* Fix some confusing logic/names around memory indexes
These functions all work with memory indexes, not specifically defined
memory indexes.
* Move function names out of `Module`
This commit moves function names in a module out of the
`wasmtime_environ::Module` type and into separate sections stored in the
final compiled artifact. Spurred on by #3787 to look at module load
times I noticed that a huge amount of time was spent in deserializing
this map. The `spidermonkey.wasm` file, for example, has a 3MB name
section which is a lot of unnecessary data to deserialize at module load
time.
The names of functions are now split out into their own dedicated
section of the compiled artifact and metadata about them is stored in a
more compact format at runtime by avoiding a `BTreeMap` and instead
using a sorted array. Overall this improves deserialize times by up to
80% for modules with large name sections since the name section is no
longer deserialized at load time and it's lazily paged in as names are
actually referenced.
* Fix a typo
* Fix compiled module determinism
Need to not only sort afterwards but also first to ensure the data of
the name section is consistent.
Following up on #3696, use the new is-terminal crate to test for a tty
rather than having platform-specific logic in Wasmtime. The is-terminal
crate has a platform-independent API which takes a handle.
This also updates the tree to cap-std 0.24 etc., to avoid depending on
multiple versions of io-lifetimes at once, as enforced by the cargo deny
check.
As first suggested by Jan on the Zulip here [1], a cheap and effective
way to obtain copy-on-write semantics of a "backing image" for a Wasm
memory is to mmap a file with `MAP_PRIVATE`. The `memfd` mechanism
provided by the Linux kernel allows us to create anonymous,
in-memory-only files that we can use for this mapping, so we can
construct the image contents on-the-fly then effectively create a CoW
overlay. Furthermore, and importantly, `madvise(MADV_DONTNEED, ...)`
will discard the CoW overlay, returning the mapping to its original
state.
By itself this is almost enough for a very fast
instantiation-termination loop of the same image over and over,
without changing the address space mapping at all (which is
expensive). The only missing bit is how to implement
heap *growth*. But here memfds can help us again: if we create another
anonymous file and map it where the extended parts of the heap would
go, we can take advantage of the fact that a `mmap()` mapping can
be *larger than the file itself*, with accesses beyond the end
generating a `SIGBUS`, and the fact that we can cheaply resize the
file with `ftruncate`, even after a mapping exists. So we can map the
"heap extension" file once with the maximum memory-slot size and grow
the memfd itself as `memory.grow` operations occur.
The above CoW technique and heap-growth technique together allow us a
fastpath of `madvise()` and `ftruncate()` only when we re-instantiate
the same module over and over, as long as we can reuse the same
slot. This fastpath avoids all whole-process address-space locks in
the Linux kernel, which should mean it is highly scalable. It also
avoids the cost of copying data on read, as the `uffd` heap backend
does when servicing pagefaults; the kernel's own optimized CoW
logic (same as used by all file mmaps) is used instead.
[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772
`ptr::cast` has the advantage of being unable to silently cast
`*const T` to `*mut T`. This turned up several places that were
performing such casts, which this PR also fixes.
* Provide helpers for demangling function names
* Profile trampolines in vtune too
* get rid of mapping
* avoid code duplication with jitdump_linux
* maintain previous default display name for wasm functions
* no dash, grrr
* Remove unused profiling error type
Before this PR, each profiler (perf/vtune, at the moment) had to have a
demangler for each of the programming languages that could have been
compiled to wasm and fed into wasmtime. With this, wasmtime now
demangles names before even forwarding them to the underlying profiler,
which makes for a unified representation in profilers, and avoids
incorrect demangling in profilers.