Under some test case layouts the call relocation
panicking with an underflow. Use `wrapping_sub` to
signal that this is expected.
The fuzzer took a while to generate such a test case.
And I can't introduce it as a regression test because
when running via the regular clif-util run tests the
layout is different and the test case passes!
I think this is because in the fuzzer we only add
one trampoline, while in clif-util we build trampolines
for each funcion in the file.
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
* Fix some warnings on nightly Rust
Cargo is warning about the usage of workspace dependencies where the
workspace declaration does not mention `default-features` but the
dependency mentions `default-features`, so this explicitly turns off
default features for `cranelift-codegen` at the workspace level and
removes the explicit `default-features = false` at the manifest levels.
* Explicitly enable default feature in wasmtime
* Enable another feature
* add clif-util compile option to output object file
* switch from a box to a borrow
* update objectmodule tests to use borrowed isa
* put targetisa into an arc
The sample program in cranelift/filetests/src/function_runner.rs
would abort with an mprotect failure under certain circumstances,
see https://github.com/bytecodealliance/wasmtime/pull/4453#issuecomment-1303803222
Root cause was that enabling PROT_EXEC on the main process heap
may be prohibited, depending on Linux distro and version.
This only shows up in the doc test sample program because the main
clif-util is multi-threaded and therefore allocations will happen
on glibc's per-thread heap, which is allocated via mmap, and not
the main process heap.
Work around the problem by enabling the "selinux-fix" feature of
the cranelift-jit crate dependency in the filetests. Note that
this didn't compile out of the box, so a separate fix is also
required and provided as part of this PR.
Going forward, it would be preferable to always use mmap to allocate
the backing memory for JITted code.
* cranelift: improve syscall error/oom handling in JIT module
The JIT module has several places where it `expect`s or `panic`s
on syscall or allocator errors. For example, `mmap` and `mprotect`
can fail if Linux `vm.max_map_count` is not high enough, and some
users may wish to handle this error rather than immediately
crashing.
This commit plumbs these errors upward as new `ModuleError`
types, so that callers of jit module functions like
`finalize_definitions` and `define_function` can handle them
(or just `unwrap()`, as desired).
* cranelift: Remove ModuleError::Syscall variant
Syscall errors can just be folded into the generic Backend error,
which is an anyhow::Error
* cranelift-jit: return io::ErrorKind::OutOfMemory for alloc failure
Just using `io::Error::last_os_error()` is not correct as global
allocator impls are not required to set errno
* cranelift: Add FlushInstructionCache for AArch64 on Windows
This was previously done on #3426 for linux.
* wasmtime: Add FlushInstructionCache for AArch64 on Windows
This was previously done on #3426 for linux.
* cranelift: Add MemoryUse flag to JIT Memory Manager
This allows us to keep the icache flushing code self-contained and not leak implementation details.
This also changes the windows icache flushing code to only flush pages that were previously unflushed.
* Add jit-icache-coherence crate
* cranelift: Use `jit-icache-coherence`
* wasmtime: Use `jit-icache-coherence`
* jit-icache-coherence: Make rustix feature additive
Mutually exclusive features cause issues.
* wasmtime: Remove rustix from wasmtime-jit
We now use it via jit-icache-coherence
* Rename wasmtime-jit-icache-coherency crate
* Use cfg-if in wasmtime-jit-icache-coherency crate
* Use inline instead of inline(always)
* Add unsafe marker to clear_cache
* Conditionally compile all rustix operations
membarrier does not exist on MacOS
* Publish `wasmtime-jit-icache-coherence`
* Remove explicit windows check
This is implied by the target_os = "windows" above
* cranelift: Remove len != 0 check
This is redundant as it is done in non_protected_allocations_iter
* Comment cleanups
Thanks @akirilov-arm!
* Make clear_cache safe
* Rename pipeline_flush to pipeline_flush_mt
* Revert "Make clear_cache safe"
This reverts commit 21165d81c9030ed9b291a1021a367214d2942c90.
* More docs!
* Fix pipeline_flush reference on clear_cache
* Update more docs!
* Move pipeline flush after `mprotect` calls
Technically the `clear_cache` operation is a lie in AArch64, so move the pipeline flush after the `mprotect` calls so that it benefits from the implicit cache cleaning done by it.
* wasmtime: Remove rustix backend from icache crate
* wasmtime: Use libc for macos
* wasmtime: Flush icache on all arch's for windows
* wasmtime: Add flags to membarrier call
* Leverage Cargo's workspace inheritance feature
This commit is an attempt to reduce the complexity of the Cargo
manifests in this repository with Cargo's workspace-inheritance feature
becoming stable in Rust 1.64.0. This feature allows specifying fields in
the root workspace `Cargo.toml` which are then reused throughout the
workspace. For example this PR shares definitions such as:
* All of the Wasmtime-family of crates now use `version.workspace =
true` to have a single location which defines the version number.
* All crates use `edition.workspace = true` to have one default edition
for the entire workspace.
* Common dependencies are listed in `[workspace.dependencies]` to avoid
typing the same version number in a lot of different places (e.g. the
`wasmparser = "0.89.0"` is now in just one spot.
Currently the workspace-inheritance feature doesn't allow having two
different versions to inherit, so all of the Cranelift-family of crates
still manually specify their version. The inter-crate dependencies,
however, are shared amongst the root workspace.
This feature can be seen as a method of "preprocessing" of sorts for
Cargo manifests. This will help us develop Wasmtime but shouldn't have
any actual impact on the published artifacts -- everything's dependency
lists are still the same.
* Fix wasi-crypto tests
* Initial forward-edge CFI implementation
Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.
Copyright (c) 2022, Arm Limited.
* Refactor `from_artifacts` to avoid second `make_executable` (#1)
This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.
* Address the code review feedback
Copyright (c) 2022, Arm Limited.
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
This commit replaces #4869 and represents the actual version bump that
should have happened had I remembered to bump the in-tree version of
Wasmtime to 1.0.0 prior to the branch-cut date. Alas!
Add a function_alignment function to the TargetIsa trait, and use it to align functions when generating objects. Additionally, collect the maximum alignment required for pc-relative constants in functions and pass that value out. Use the max of these two values when padding functions for alignment.
This fixes a bug on x86_64 where rip-relative loads to sse registers could cause a segfault, as functions weren't always guaranteed to be aligned to 16-byte addresses.
Fixes#4812
This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime.
After the suggestion of Chris, `Function` has been split into mostly two parts:
- on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`.
- on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on.
Most changes are here to accomodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache:
- most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set.
- user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- some refactorings have been made for function names:
- `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name.
- The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with.
The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions.
A basic fuzz target has been introduced that tries to do the bare minimum:
- check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function.
- check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache.
- This last check is less efficient and less likely to happen, so probably should be rethought a bit.
Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip.
Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement.
Fixes#4155.
This enables the object backend for s390x, in particular the
processing of all required relocations.
This uncovered a bug: we need to use PLT relocations for the
target of calls, which we currently do not. Fixed by adding
a new S390xPLTRel32Dbl reloc type and using it where needed.
* cranelift: Use JIT in runtests
Using `cranelift-jit` in run tests allows us to preform relocations and
libcalls. This is important since some instruction lowerings fallback
to libcall's when an extension is missing, or when it's too complicated
to implement manually.
This is also a first step to being able to test `call`'s between functions
in the runtest suite. It should also make it easier to eventually test
TLS relocations, symbol resolution and ABI's.
Another benefit of this is that we also get to test the JIT more, since
it now runs the runtests, and gets some fuzzing via `fuzzgen` (which
uses the `SingleFunctionCompiler`).
This change causes regressions in terms of runtime for the filetests.
I haven't done any serious benchmarking but what I've been seeing is
that it now takes about ~3 seconds to run the testsuite while it
previously took around 2 seconds.
* Add FMA tests for X86
* Move `emit_to_memory` to `MachCompileResult`
This small refactoring makes it clearer to me that emitting to memory
doesn't require anything else from the compilation `Context`. While it's
a trivial change, it's a small public API change that shouldn't cause
too much trouble, and doesn't seem RFC-worthy. Happy to hear different
opinions about this, though!
* hide the MachCompileResult behind a method
* Add a `CompileError` wrapper type that references a `Function`
* Rename MachCompileResult to CompiledCode
* Additionally remove the last unsafe API in cranelift-codegen
* Remove unused srcloc in MachReloc
* Remove unused srcloc in MachTrap
* Use `into_iter` on array in bench code to suppress a warning
* Remove unused srcloc in MachCallSite
* Upgrade all crates to the Rust 2021 edition
I've personally started using the new format strings for things like
`panic!("some message {foo}")` or similar and have been upgrading crates
on a case-by-case basis, but I think it probably makes more sense to go
ahead and blanket upgrade everything so 2021 features are always
available.
* Fix compile of the C API
* Fix a warning
* Fix another warning
* Bump to 0.36.0
* Add a two-week delay to Wasmtime's release process
This commit is a proposal to update Wasmtime's release process with a
two-week delay from branching a release until it's actually officially
released. We've had two issues lately that came up which led to this proposal:
* In #3915 it was realized that changes just before the 0.35.0 release
weren't enough for an embedding use case, but the PR didn't meet the
expectations for a full patch release.
* At Fastly we were about to start rolling out a new version of Wasmtime
when over the weekend the fuzz bug #3951 was found. This led to the
desire internally to have a "must have been fuzzed for this long"
period of time for Wasmtime changes which we felt were better
reflected in the release process itself rather than something about
Fastly's own integration with Wasmtime.
This commit updates the automation for releases to unconditionally
create a `release-X.Y.Z` branch on the 5th of every month. The actual
release from this branch is then performed on the 20th of every month,
roughly two weeks later. This should provide a period of time to ensure
that all changes in a release are fuzzed for at least two weeks and
avoid any further surprises. This should also help with any last-minute
changes made just before a release if they need tweaking since
backporting to a not-yet-released branch is much easier.
Overall there are some new properties about Wasmtime with this proposal
as well:
* The `main` branch will always have a section in `RELEASES.md` which is
listed as "Unreleased" for us to fill out.
* The `main` branch will always be a version ahead of the latest
release. For example it will be bump pre-emptively as part of the
release process on the 5th where if `release-2.0.0` was created then
the `main` branch will have 3.0.0 Wasmtime.
* Dates for major versions are automatically updated in the
`RELEASES.md` notes.
The associated documentation for our release process is updated and the
various scripts should all be updated now as well with this commit.
* Add notes on a security patch
* Clarify security fixes shouldn't be previewed early on CI
Addresses #3809: when we are asked to create a Cranelift backend with
shared flags that indicate support for SIMD, we should check that the
ISA level needed for our SIMD lowerings is present.
`ptr::cast` has the advantage of being unable to silently cast
`*const T` to `*mut T`. This turned up several places that were
performing such casts, which this PR also fixes.