In order to implement SIMD's all_true (https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#all-lanes-true), we must legalize some instruction (I chose `vall_true`) to a comparison against 0 and a similar reduction as vany_true using `PTEST` and `SETNZ`. Since `icmp` only allows integers but `vall_true` could allow more vector types, `raw_bitcast` is used to convert the lane types into integers, e.g. b32x4 to i32x4. To do so without runtime type-checking, the `raw_bitcast` instruction (which emits no instruction) can now bitcast from any vector type to the same type, e.g. i32x4 to i32x4.
Only the shifts with applicable SSE2 instructions are implemented here: PSRL* (for ushr) only has 16-64 bit instructions and PSRA* (for sshr) only has 16-32 bit instructions.
This avoids doing multiple unpacking of the InstructionData for a single
legalization, improving readability and reducing size of the generated
code. For instance, icmp had to unpack the format once per IntCC
condition code.
This change should make the code more clear (and less code) when adding encodings for instructions with specific immediates; e.g., a constant with a 0 immediate could be encoded as an XOR with something like `const.bind(...)` without explicitly creating the necessary predicates. It has several parts:
* Introduce Bindable trait to instructions
* Convert all instruction bindings to use Bindable::bind()
* Add ability to bind immediates to BoundInstruction
This is an attempt to reduce some of the issues in #955.
This commit is based on the assumption that floats are already stored in XMM registers in x86. When extracting a lane, cranelift was moving the float to a regular register and back to an XMM register; this change avoids this by shuffling the float value to the lowest bits of the XMM register. It also assumes that the upper bits can be left as is (instead of zeroing them out).
raw_bitcast matches the intent of this legalization more clearly (to simply change the CLIF type without changing any bits) and the additional null encodings added are necessary for later instructions