This avoids doing multiple unpacking of the InstructionData for a single
legalization, improving readability and reducing size of the generated
code. For instance, icmp had to unpack the format once per IntCC
condition code.
There are two reasons for this change:
1. it reduces confusion; using the `POR` encoding will match the future encodings of `band` and `bxor` and the `ORPS` encoding may be confusing as it is intended for floating-point operations
2. `POR` has slightly more throughput: it only has to wait 0.33 cycles to execute again on all Intel architectures above Core whereas `ORPS` must wait 1 cycle on architectures older than Skylake (Intel Optimization Reference Manual, C.3)
`POR` does add one additional byte to the encoding and requires SSE2 so the `ORPS` opcode is left in for future use.
This change should make the code more clear (and less code) when adding encodings for instructions with specific immediates; e.g., a constant with a 0 immediate could be encoded as an XOR with something like `const.bind(...)` without explicitly creating the necessary predicates. It has several parts:
* Introduce Bindable trait to instructions
* Convert all instruction bindings to use Bindable::bind()
* Add ability to bind immediates to BoundInstruction
This is an attempt to reduce some of the issues in #955.
Only i16x8 and i32x4 are encoded in this commit mainly because i8x16 and i64x2 do not have simple encodings in x86. i64x2 is not required by the SIMD spec and there is discussion (https://github.com/WebAssembly/simd/pull/98#issuecomment-530092217) about removing i8x16.
The x86 ISA has (at least) two encodings for PEXTRW:
1. in the SSE2 opcode (66 0f c5) the XMM operand uses r/m and the GPR operand uses reg
2. in the SSE4.1 opcode (66 0f 3a 15) the XMM operand uses reg and the GPR operand uses r/m
This changes the 16-bit x86_pextr encoding from 1 to 2 to match the other PEXTR* implementations (all #2 style).
Instead of using MOVUPS to expensively load bits from memory, this change uses a predicate to optimize vconst without a memory access:
- when the 128-bit immediate is all zeroes in all bits, use PXOR to zero out an XMM register
- when the 128-bit immediate is all ones in all bits, use PCMPEQB to set an XMM register to all ones
This leaves the constant data in the constant pool, which may increase code size (TODO)
- `fill` attempted to use a GPR recipe, `fillSib32`, instead of its FPR equivalent, `ffillSib32` (code compiled without error but incorrect instructions were allowed, e.g. `v1 = regmove v0, %rdi -> %xmm0`
- `regmove` could be encoded with a GPR recipe, `rmov`, which hid the above incorrectness; now only FPR-to-FPR regmoves are allowed using the `frmov` recipe
This commit is based on the assumption that floats are already stored in XMM registers in x86. When extracting a lane, cranelift was moving the float to a regular register and back to an XMM register; this change avoids this by shuffling the float value to the lowest bits of the XMM register. It also assumes that the upper bits can be left as is (instead of zeroing them out).
raw_bitcast matches the intent of this legalization more clearly (to simply change the CLIF type without changing any bits) and the additional null encodings added are necessary for later instructions
* [codegen] add new recipe "rout"
Add a new recipe "rout" intended to be used by arithematic operations
that output flags, currently being used for `iadd_cout` and `isub_bout`.
Fixes: https://github.com/CraneStation/cranelift/issues/1009
* [codegen] add encodings for iadd carry variants
Add encodings for iadd carry variants (iadd_cout, iadd_cin, iadd_carry)
for x86_32, enabling the legalization for iadd.i64 to work.
* [codegen] remove support for iadd carry variants on riscv
Previously, the carry variants of iadd (iadd_cin, iadd_cout and
iadd_carry) were being legalized for isa/riscv since RISC architectures
lack a flags register.
This forced us to return and accept booleans for these operations, which
proved to be problematic and inconvenient, especially for x86.
This commit removes support for said statements and all dependent
statements for isa/riscv so that we can work on a better legalization
strategy in the future.
* [codegen] change operand type from bool to iflag for iadd carry variants
The type of the carry operands for the carry variants of the iadd
instruction (iadd_cin, iadd_cout, iadd_carry) was bool for compatibility
reasons for isa/riscv. Since support for these instructions on RISC
architectures has been temporarily suspended, we can safely change the
type to iflags.
Also, as a ridealong fix, removes R32 encodings for x86_64 in `enc_r32_r64`,
since the type `rXX` by definition only exists for targets with word size `XX`
bits.
* Fix segfault due to b64 encoding
Prior to this patch, bconst.b64 encoded its instruction with a 32-bit immediate that caused improper decoding of the MOV instruction; instead, use a REX prefix and rely on zero-extension of the immediate. Fixes#911.