* Implement `Swizzle` and `Splat` for interpreter
Implemented for the Cranelift interpreter:
- `Swizzle` to shuffle an `i8x16` SIMD vector based
on the indices specified in another vector of the same size.
- `Splat` to create a SIMD vector with all lanes having the same value.
Copyright (c) 2021, Arm Limited
* Fix old x86 backend failing test
Copyright (c) 2021, Arm Limited
* Represent i16x8 and above as hex
Copyright (c) 2021, Arm Limited
* cranelift: Implement ZeroExtend for a bunch of types in interpreter
* cranelift: Implement VConst on interpreter
* cranelift: Implement VallTrue on interpreter
* cranelift: Implement VanyTrue on interpreter
* cranelift: Mark `v{all,any}_true` tests as machinst only
* cranelift: Disable `vany_true` tests on aarch64
The `b64x2` case produces an illegal instruction. See #3305
Implemented `SaddSat` and `SsubSat` to add and subtract signed vector
values, saturating at the type boundaries rather than overflowing.
Changed the parser to allow signed `i8` immediates in vectors as part of
this work; fixes#3276.
Copyright (c) 2021, Arm Limited.
- Fixed CI tests for AArch64 and old x86.
- Rename `simd-umulhi.clif` to `umulhi.clif`.
- Rename `simd-umulhi-aarch64.clif` to `simd-umulhi.clif`.
Copyright (c) 2021, Arm Limited.
Implemented `Umulhi` for the Cranelift interpreter, performing unsigned
integer multiplication and producing the high half of a double-length
result.
Fixed `ExtractUpper` conversion behaviour as part of this change, which
was extracting from a 128-bit value regardless of the size of the
original value.
Copyright (c) 2021, Arm Limited.
Implemented `Insertlane` to insert a value in the lane specified by the
immediate value, overwriting the existing value in that lane.
Added `TernaryImm8` support for the `imm_value` function.
Copyright (c) 2021, Arm Limited.
Almost all the tests in the interpreter are already in the runtests
folder so that we can reuse them for the backends. The distinction
between interpreter tests and runtests is no longer very clear, since
they should both support the same clif code, and produce the same results.
We only have two test files:
* `add.clif` tests the add and jump instruction, both of which are
already covered in other test files, so we remove that file.
* `fibonacci.clif` does a recursive call which is currently not supported
in the filetest environment, so we keep this test interpreter only for now.
Implemented `IaddPairwise` for the Cranelift interpreter, to add pairs
of adjacent values in two SIMD vectors, concatenating them at the end
(preserving both lane size and number of lanes).
Copyright (c) 2021, Arm Limited
* Implement `IaddCin`, `IaddCout`, and `IaddCarry` for Cranelift interpreter
Implemented the following Opcodes for the Cranelift interpreter:
- `IaddCin` to add two scalar integers with an input carry flag.
- `IaddCout` to add two scalar integers and report overflow with the carry flag.
- `IaddCarry` to add two scalar integers with an input carry flag, reporting overflow with the output carry flag.
Copyright (c) 2021, Arm Limited
* Simplify carry check + add i64 `IaddCarry` tests
Copyright (c) 2021, Arm Limited
* Move tests to `runtests`
Copyright (c) 2021, Arm Limited
* Cranelift AArch64: Simplify leaf functions that do not use the stack
Leaf functions that do not use the stack (e.g. do not clobber any
callee-saved registers) do not need a frame record.
Copyright (c) 2021, Arm Limited.
* Implement `Extractlane`, `UaddSat`, and `UsubSat` for Cranelift interpreter
Implemented the `Extractlane`, `UaddSat`, and `UsubSat` opcodes for the interpreter,
and added helper functions for working with SIMD vectors (`extractlanes`, `vectorizelanes`,
and `binary_arith`).
Copyright (c) 2021, Arm Limited
* Re-use tests + constrict Vector assert
- Re-use interpreter tests as runtests where supported.
- Constrict Vector assertion.
- Code style adjustments following feedback.
Copyright (c) 2021, Arm Limited
* Runtest `i32x4` vectors on AArch64; add `i64x2` tests
Copyright (c) 2021, Arm Limited
* Add `simd-` prefix to test filenames
Copyright (c) 2021, Arm Limited
* Return aliased `SmallVec` from `extractlanes`
Using a `SmallVec<[i128; 4]>` allows larger-width 128-bit vectors
(`i32x4`, `i64x2`, ...) to not cause heap allocations.
Copyright (c) 2021, Arm Limited
* Accept slice to `vectorizelanes` rather than `Vec`
Copyright (c) 2021, Arm Limited
* cranelift: Add stack support to the interpreter
We also change the approach for heap loads and stores.
Previously we would use the offset as the address to the heap. However,
this approach does not allow using the load/store instructions to
read/write from both the heap and the stack.
This commit changes the addressing mechanism of the interpreter. We now
return the real addresses from the addressing instructions
(stack_addr/heap_addr), and instead check if the address passed into
the load/store instructions points to an area in the heap or the stack.
* cranelift: Add virtual addresses to cranelift interpreter
Adds a Virtual Addressing scheme that was discussed as a better
alternative to returning the real addresses.
The virtual addresses are split into 4 regions (stack, heap, tables and
global values), and the address itself is composed of an `entry` field
and an `offset` field. In general the `entry` field corresponds to the
instance of the resource (e.g. table5 is entry 5) and the `offset` field
is a byte offset inside that entry.
There is one exception to this which is the stack, where due to only
having one stack, the whole address is an offset field.
The number of bits in entry vs offset fields is variable with respect to
the `region` and the address size (32bits vs 64bits). This is done
because with 32 bit addresses we would have to compromise on heap size,
or have a small number of global values / tables. With 64 bit addresses
we do not have to compromise on this, but we need to support 32 bit
addresses.
* cranelift: Remove interpreter trap codes
* cranelift: Calculate frame_offset when entering or exiting a frame
* cranelift: Add safe read/write interface to DataValue
* cranelift: DataValue write full 128bit slot for booleans
* cranelift: Use DataValue accessors for trampoline.
* cranelift: Add heap support to filetest infrastructure
* cranelift: Explicit heap pointer placement in filetest annotations
* cranelift: Add documentation about the Heap directive
* cranelift: Clarify that heap filetests pointers must be laid out sequentially
* cranelift: Use wrapping add when computing bound pointer
* cranelift: Better error messages when invalid signatures are found for heap file tests.
Implemented the following Opcodes for the Cranelift interpreter:
- `IsubBin` to subtract two scalar integers with an input borrow flag.
- `IsubBout` to subtract two scalar integers with an output borrow flag.
- `IsubBorrow` to subtract two scalar integers with an input and output borrow flag.
Copyright (c) 2021, Arm Limited
The tests for the SIMD floating-point maximum and minimum operations
require particular care because the handling of the NaN values is
non-deterministic and may vary between platforms. There is no way to
match several NaN values in a test, so the solution is to extract the
non-deterministic test cases into a separate file that is subsequently
replicated for every backend under test, with adjustments made to the
expected results.
Copyright (c) 2021, Arm Limited.
The AArch64 support was a bit broken and was using Armv7 style
barriers, which aren't required with Armv8 acquire-release
load/stores.
The fallback CAS loops and RMW, for AArch64, have also been updated
to use acquire-release, exclusive, instructions which, again, remove
the need for barriers. The CAS loop has also been further optimised
by using the extending form of the cmp instruction.
Copyright (c) 2021, Arm Limited.
* Change VMMemoryDefinition::current_length to `usize`
This commit changes the definition of
`VMMemoryDefinition::current_length` to `usize` from its previous
definition of `u32`. This is a pretty impactful change because it also
changes the cranelift semantics of "dynamic" heaps where the bound
global value specifier must now match the pointer type for the platform
rather than the index type for the heap.
The motivation for this change is that the `current_length` field (or
bound for the heap) is intended to reflect the current size of the heap.
This is bound by `usize` on the host platform rather than `u32` or`
u64`. The previous choice of `u32` couldn't represent a 4GB memory
because we couldn't put a number representing 4GB into the
`current_length` field. By using `usize`, which reflects the host's
memory allocation, this should better reflect the size of the heap and
allows Wasmtime to support a full 4GB heap for a wasm program (instead
of 4GB minus one page).
This commit also updates the legalization of the `heap_addr` clif
instruction to appropriately cast the address to the platform's pointer
type, handling bounds checks along the way. The practical impact for
today's targets is that a `uextend` is happening sooner than it happened
before, but otherwise there is no intended impact of this change. In the
future when 64-bit memories are supported there will likely need to be
fancier logic which handles offsets a bit differently (especially in the
case of a 64-bit memory on a 32-bit host).
The clif `filetest` changes should show the differences in codegen, and
the Wasmtime changes are largely removing casts here and there.
Closes#3022
* Add tests for memory.size at maximum memory size
* Add a dfg helper method
Cranelift crates have historically been much more verbose with debug-level
logging than most other crates in the Rust ecosystem. We log things like how
many parameters a basic block has, the color of virtual registers during
regalloc, etc. Even for Cranelift hackers, these things are largely only useful
when hacking specifically on Cranelift and looking at a particular test case,
not even when using some Cranelift embedding (such as Wasmtime).
Most of the time, when people want logging for their Rust programs, they do
something like:
RUST_LOG=debug cargo run
This means that they get all that mostly not useful debug logging out of
Cranelift. So they might want to disable logging for Cranelift, or change it to
a higher log level:
RUST_LOG=debug,cranelift=info cargo run
The problem is that this is already more annoying to type that `RUST_LOG=debug`,
and that Cranelift isn't one single crate, so you actually have to play
whack-a-mole with naming all the Cranelift crates off the top of your head,
something more like this:
RUST_LOG=debug,cranelift=info,cranelift_codegen=info,cranelift_wasm=info,...
Therefore, we're changing most of the `debug!` logs into `trace!` logs: anything
that is very Cranelift-internal, unlikely to be useful/meaningful to the
"average" Cranelift embedder, or prints a message for each instruction visited
during a pass. On the other hand, things that just report a one line statistic
for a whole pass, for example, are left as `debug!`. The more verbose the log
messages are, the higher the bar they must clear to be `debug!` rather than
`trace!`.
This commit addresses two issues:
* A panic when shifting any non i128 type by i128 amounts (#3064)
* Wrong results when lowering shifts with small types (i8, i16)
In these types when shifting for amounts larger than the size of the
type, we would not get the wrapping behaviour that we see on i32 and i64.
This is because in these larger types, the wrapping behaviour is automatically
implemented by using the appropriate instruction, however we do not
have i8 and i16 specific instructions, so we have to manually wrap
the shift amount with an AND instruction.
This issue is also found on x86_64 and s390x, and a separate issue will
be filed for those.
Closes#3064
When encoding constants as immediates into an RSE Imm12 instruction we need to take special care to check if the value that we are trying to input does not overflow its type when viewed as a signed value. (i.e. iconst.i8 200)
We cannot both put an immediate and sign extend it, so we need to lower it into a separate reg, and emit the sign extend into the instruction.
For more details see the [cg_clif bug report](https://github.com/bjorn3/rustc_codegen_cranelift/issues/1184#issuecomment-873214796).
* cranelift: Initial fuzzer implementation
* cranelift: Generate multiple test cases in fuzzer
* cranelift: Separate function generator in fuzzer
* cranelift: Insert random instructions in fuzzer
* cranelift: Rename gen_testcase
* cranelift: Implement div for unsigned values in interpreter
* cranelift: Run all test cases in fuzzer
* cranelift: Comment options in function_runner
* cranelift: Improve fuzzgen README.md
* cranelift: Fuzzgen remove unused variable
* cranelift: Fuzzer code style fixes
Thanks! @bjorn3
* cranelift: Fix nits in CLIF fuzzer
Thanks @cfallin!
* cranelift: Implement Arbitrary for TestCase
* cranelift: Remove gen_testcase
* cranelift: Move fuzzers to wasmtime fuzz directory
* cranelift: CLIF-Fuzzer ignore tests that produce traps
* cranelift: CLIF-Fuzzer create new fuzz target to validate generated testcases
* cranelift: Store clif-fuzzer config in a separate struct
* cranelift: Generate variables upfront per function
* cranelift: Prevent publishing of fuzzgen crate