This is useful to have to allow resumable_trap to happen in loop
headers, for instance. This is the correct way to implement interrupt
checks in Spidermonkey, which are effectively resumable traps. Previous
implementation was using traps, which is wrong, since traps semantically
can't be resumed after.
This converts an `i32x4` into an `f32x4` with some rounding either by using an AVX512VL/F instruction--VCVTUDQ2PS--or a long sequence of SSE4.1 compatible instructions.
The `convert_i64x2_imul` custom legalization checks the ISA flags for AVX512DQ or AVX512VL support and legalizes `imul.i64x2` to an `x86_pmullq` in this case; if not, it uses a lengthy SSE2-compatible instruction sequence.
Without this special instruction, legalizing to the AVX512 instruction AND the SSE instruction sequence is impossible. This extra instruction would be rendered unnecessary by the x64 backend.
The InsertLane format has an ordering (`value().imm().value()`) and immediate name (`"lane"`) that make it awkward to use for other instructions. This changes the ordering (`value().value().imm()`) and uses the default name (`"imm"`) throughout the codebase.
* Encode vselect using BLEND instructions on x86
* Legalize vselect to bitselect
* Optimize bitselect to vselect for some operands
* Add run tests for bitselect-vselect optimization
* Address review feedback
Certain operations (e.g. x86_packss) will have operands with types like `NxM` but will return results with types like `(N/2)x(M*2)` (halve the lane width, double the number of lanes; maintain the same number of vector bits). This is equivalent to applying two `DerivedFunction`s to the type: `DerivedFunction::HalfWidth` then `DerivedFunction::DoubleVector`. Since there is no easy way to apply multiple `DerivedFunction`s (e.g. most of the logic is one-level deep, 1d5a678124/cranelift/codegen/meta/src/gen_inst.rs (L618-L621)), I added `DerivedFunction::SplitLanes` to do the necessary type conversion.
This involves some large mask tables that may hurt code size but reduce the number of instructions. See https://github.com/WebAssembly/simd/issues/117 for a more in-depth discussion on this.
This involves some large mask tables that may hurt code size but reduce the number of instructions. See https://github.com/WebAssembly/simd/issues/117 for a more in-depth discussion on this.
- Add a `simple_legalize()` function that invokes a predetermined set of
legalizations, without depending on the details of the current
backend design. This will be used by the new backend pipeline.
- Separate out `has_side_effect()` from the DCE pass. This will be used
by the new backends' lowering code.
- Add documentation for the `Arm64Call` relocation type.
* Wasmtime 0.15.0 and Cranelift 0.62.0. (#1398)
* Bump more ad-hoc versions.
* Add build.rs to wasi-common's Cargo.toml.
* Update the env var name in more places.
* Remove a redundant echo.
In cranelift x86 encodings, it seemed unintuitive to specialize Templates with both `infer_rex()`` and `w()`: if `w()` is specified, the REX.W bit must be set so a REX prefix is alway required--no need to infer it. This change forces us to write `rex().w()``--it's more explicit and shows more clearly what cranelift will emit. This change also modifies the tests that expected DynRex recipes.