CL/aarch64 back end: implement the wasm SIMD bitmask instructions

The `bitmask.{8x16,16x8,32x4}` instructions do not map neatly to any single
AArch64 SIMD instruction, and instead need a sequence of around ten
instructions.  Because of this, this patch is somewhat longer and more complex
than it would be for (e.g.) x64.
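
For orientation, here is a minimal scalar sketch of what the three bitmask variants compute, assuming the usual wasm SIMD semantics (bit `i` of the i32 result is the top bit of lane `i`, and all remaining bits are zero). The function names are hypothetical and are not part of this patch.

```rust
/// Scalar reference model of the 8x16 variant: bit i of the result is the
/// top bit of byte lane i. The upper 16 bits of the result stay zero.
fn bitmask_8x16(lanes: [u8; 16]) -> u32 {
    let mut result = 0u32;
    for (i, &lane) in lanes.iter().enumerate() {
        result |= u32::from(lane >> 7) << i;
    }
    result
}

/// Same idea for eight 16-bit lanes: only the low 8 result bits can be set.
fn bitmask_16x8(lanes: [u16; 8]) -> u32 {
    let mut result = 0u32;
    for (i, &lane) in lanes.iter().enumerate() {
        result |= u32::from(lane >> 15) << i;
    }
    result
}

/// Same idea for four 32-bit lanes: only the low 4 result bits can be set.
fn bitmask_32x4(lanes: [u32; 4]) -> u32 {
    let mut result = 0u32;
    for (i, &lane) in lanes.iter().enumerate() {
        result |= (lane >> 31) << i;
    }
    result
}
```

The difficulty on AArch64 is that this "gather one bit per lane into a scalar" step has no single-instruction equivalent, hence the multi-instruction sequences.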

Main changes are:

* The relevant testsuite test (`simd_boolean.wast`) has been enabled on aarch64.

* At the CLIF level, add a new instruction `vhigh_bits`, into which these wasm
  instructions are to be translated.

* In the wasm->CLIF translation (code_translator.rs), translate into
  `vhigh_bits`.  This is straightforward.

* In the CLIF->AArch64 translation (lower_inst.rs), translate `vhigh_bits`
  into equivalent sequences of AArch64 instructions.  There is a different
  sequence for each of the `{8x16, 16x8, 32x4}` variants.

All other changes are AArch64-specific, and add instruction definitions needed
by the previous step:

* Add two new families of AArch64 instructions: `VecShiftImm` (vector shift by
  immediate) and `VecExtract` (effectively a double-length vector shift).

* To the existing AArch64 family `VecRRR`, add a `zip1` variant.  To the
  `VecLanesOp` family add an `addv` variant.

* Add supporting code for the above changes to AArch64 instructions:
  - getting the register uses (`aarch64_get_regs`)
  - mapping the registers (`aarch64_map_regs`)
  - printing instructions
  - emitting instructions (`impl MachInstEmit for Inst`).  The handling of
    `VecShiftImm` is a bit complex.
  - emission tests for new instructions and variants.
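
To illustrate why a shift-by-immediate, an extract, a `zip1` and an `addv` are useful here, the following is a scalar Rust model of one plausible way to compute the 8x16 case with those kinds of operations. It is a sketch under assumptions: the actual sequences in lower_inst.rs are not shown in this message, and the helper names, the per-lane bit constant and the ordering of steps are all hypothetical.

```rust
/// Models a signed shift right by 7 on each byte lane: the sign bit is
/// replicated, giving 0xFF for lanes with the top bit set and 0x00 otherwise.
fn sshr7(v: [u8; 16]) -> [u8; 16] {
    v.map(|b| ((b as i8) >> 7) as u8)
}

/// Models an AND with a constant vector holding 1, 2, 4, ..., 128 in each
/// 8-byte half, so lane i keeps only bit (i % 8). The constant is assumed.
fn and_lane_bits(v: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = v[i] & (1u8 << (i % 8));
    }
    out
}

/// Models an extract at byte offset 8 with both sources equal, which
/// rotates the vector so the upper 8 bytes land in the lower half.
fn ext8(v: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = v[(i + 8) % 16];
    }
    out
}

/// Models zip1 on byte lanes: interleaves the low 8 bytes of `a` and `b`.
fn zip1(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..8 {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
    out
}

/// Models addv over eight little-endian 16-bit lanes (wrapping sum).
fn addv_8h(v: [u8; 16]) -> u16 {
    (0..8).fold(0u16, |acc, i| {
        acc.wrapping_add(u16::from_le_bytes([v[2 * i], v[2 * i + 1]]))
    })
}

/// Chains the steps: after zip1, 16-bit lane i holds bit i (from byte lane i)
/// and bit i + 8 (from byte lane i + 8), so the sum equals the bitwise OR.
fn bitmask_8x16_via_simd_steps(src: [u8; 16]) -> u32 {
    let masked = and_lane_bits(sshr7(src));
    let zipped = zip1(masked, ext8(masked));
    u32::from(addv_8h(zipped))
}
```

Because every 16-bit lane contributes a disjoint pair of bits, the horizontal add cannot carry, which is what lets an addition stand in for the missing "gather bits" operation.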
Julian Seward
2020-10-22 16:02:46 +02:00
committed by julian-seward1
parent b10e027fef
commit 2702942050
8 changed files with 570 additions and 5 deletions

@@ -1600,6 +1600,10 @@ pub fn translate_operator<FE: FuncEnvironment + ?Sized>(
             let bool_result = builder.ins().vall_true(a);
             state.push1(builder.ins().bint(I32, bool_result))
         }
+        Operator::I8x16Bitmask | Operator::I16x8Bitmask | Operator::I32x4Bitmask => {
+            let a = pop1_with_bitcast(state, type_of(op), builder);
+            state.push1(builder.ins().vhigh_bits(I32, a));
+        }
         Operator::I8x16Eq | Operator::I16x8Eq | Operator::I32x4Eq => {
             translate_vector_icmp(IntCC::Equal, type_of(op), builder, state)
         }
@@ -1763,10 +1767,7 @@ pub fn translate_operator<FE: FuncEnvironment + ?Sized>(
         | Operator::F64x2Trunc
         | Operator::F64x2PMin
         | Operator::F64x2PMax
-        | Operator::F64x2Nearest
-        | Operator::I8x16Bitmask
-        | Operator::I16x8Bitmask
-        | Operator::I32x4Bitmask => {
+        | Operator::F64x2Nearest => {
             return Err(wasm_unsupported!("proposed SIMD operator {:?}", op));
         }