This patch makes spillslot allocation, spilling and reloading all based
on register class only. Hence when we have a 32- or 64-bit value in a
128-bit XMM register on x86-64 or vector register on aarch64, this
results in larger spillslots and spills/restores.
Why make this change, if it results in less efficient stack-frame usage?
Simply put, it is safer: there is always a risk when allocating
spillslots or spilling/reloading that we get the wrong type and make the
spillslot or the store/load too small. This was one contributing factor
to CVE-2021-32629, and is now the source of a fuzzbug in SIMD code that
puns an arbitrary user-controlled vector constant over another
stackslot. (If this were a pointer, that could result in RCE. SIMD is
not yet on by default in a release, fortunately.
In particular, we have not been particularly careful about using moves
between values of different types, for example with `raw_bitcast` or
with certain SIMD operations, and such moves indicate to regalloc.rs
that vregs are in equivalence classes and some arbitrary vreg in the
class is provided when allocating the spillslot or spilling/reloading.
Since regalloc.rs does not track actual type, and since we haven't been
careful about moves, we can't really trust this "arbitrary vreg in
equivalence class" to provide accurate type information.
In the fix to CVE-2021-32629 we fixed this for integer registers by
always spilling/reloading 64 bits; this fix can be seen as the analogous
change for FP/vector regs.
This updates our `push-tag.yml` workflow to check all commits to
`release-*` branches in addition to the `main` branch for cmomits which
should be tagged.
Actually poking around the `github` context it looks like for
workflow-dispatch-triggered-events `base_ref` is blank but `ref_name`
has the branch name we're interested in.
This was accidentally ommitted from our CI configuration which meant
that release branches didn't get PR CI. They still won't get on-merge CI
but that shouldn't be an issue because the PR CI is the full CI.
This update is no real change in functionality but brings in several of
the latest changes to the `ittapi-rs` library: minor fixes to the C
library, a new license expression for the Cargo crate, better
documentation, updated Rust bindings, and the removal of `cmake` as a
dependency (uses `cc` directly instead).
While using VTune, it seemed a good idea to check that the VTune
documentation for Wasmtime was still correct. It is and VTune support
still works (improvements: click-through to x86 assembly is not
available). These changes simply re-organize the documentation and add a
section for running VTune from a GUI.
* aarch64: Use smaller instruction helpers in ISLE
This commit moves the aarch64 backend's ISLE to be more similar to the
x64 backend's ISLE where one-liner instruction builders are used for
various forms of instructions instead of always using the
constructor-per-variant-of-`Inst`. Overall I think this change worked
out quite well and sets up some naming idioms as well for various forms
of instructions.
* rebase conflict
Fixes#3609. It turns out that `sha2` is a nontrivial dependency for
Cranelift in many contexts, partly because it pulls in a number of other
crates as well.
One option is to remove the hash check under certain circumstances, as
implemented in #3616. However, this is undesirable for other reasons:
having different dependency options in Wasmtime in particular for
crates.io vs. local builds is not really possible, and so either we
still have the higher build cost in Wasmtime, or we turn off the checks
by default, which goes against the original intent of ensuring developer
safety (no mysterious stale-source bugs).
This PR uses `SipHash` instead, which is built into the standard
library. `SipHash` is deprecated, but it's fixed and deterministic
(across runs and across Rust versions), which is what we need, unlike
the suggested replacement `std::collections::hash_map::DefaultHasher`.
The result is only 64 bits, and is not cryptographically secure, but we
never needed that; we just need a simple check to indicate when we
forget a `rebuild-isle`.
* Update to cap-std 0.22.0.
The main change relevant to Wasmtime here is that this includes the
rustix fix for compilation errors on Rust nightly with the `asm!` macro.
* Add itoa to deny.toml.
* Update the doc and fuzz builds to the latest Rust nightly.
* Update to libc 0.2.112 to pick up the `POLLRDHUP` fix.
* Update to cargo-fuzz 0.11, for compatibility with Rust nightly.
This appears to be the fix for rust-fuzz/cargo-fuzz#277.
This commit translates the `rotl` and `rotr` lowerings already existing
to ISLE. The port was relatively straightforward with the biggest
changing being the instructions generated around i128 rotl/rotr
primarily due to register changes.
* aarch64: Migrate ishl/ushr/sshr to ISLE
This commit migrates the `ishl`, `ushr`, and `sshr` instructions to
ISLE. These involve special cases for almost all types of integers
(including vectors) and helper functions for the i128 lowerings since
the i128 lowerings look to be used for other instructions as well. This
doesn't delete the i128 lowerings in the Rust code just yet because
they're still used by Rust lowerings, but they should be deletable in
due time once those lowerings are translated to ISLE.
* Use more descriptive names for i128 lowerings
* Use a with_flags-lookalike for csel
* Use existing `with_flags_*`
* Coment backwards order
* Update generated code
Uncovered by @bjorn3 (thanks!): 8- and 16-bit rotates were not working
properly in recent versions of Cranelift with part of the lowering
migrated to ISLE.
This PR fixes a few issues:
- 8- and 16-bit rotate-left needs to mask a constant amount, if any,
because we use a 32-bit rotate instruction and so don't get the
appropriate shift-amount masking for free from x86 semantics.
- `operand_size_from_type` was incorrect: it only handled 32- and 64-bit
types and silently returned `OperandSize::Size32` for everything else.
Now uses the `OperandSize::from_ty(ty)` helper as the pre-ISLE code
did.
Our test coverage for narrow value types is not great; this PR adds some
runtests for rotl/rotr but more would always be better!
Cranelift shuffles require indices to be in-bounds, which the
avx512-using backend also requires via a debug assert, so this commit
fixes a test with simd shuffles to only use in-bounds indices.
This is motivated by another failure on CI where the machine we were
running on presumably had avx512 things enabled. This should fix those
failures.
Closes#3581