CL/aarch64: implement the wasm SIMD v128.load{32,64}_zero instructions.

This patch implements, for aarch64, the following wasm SIMD extensions.

  v128.load32_zero and v128.load64_zero instructions
  https://github.com/WebAssembly/simd/pull/237

The changes are straightforward:

* no new CLIF instructions.  They are translated into an existing CLIF scalar
  load followed by a CLIF `scalar_to_vector`.

* the comment/specification for CLIF `scalar_to_vector` has been changed to
  match the actual intended semantics, per consulation with Andrew Brown.

* translation from `scalar_to_vector` to aarch64 `fmov` instruction.  This
  has been generalised slightly so as to allow both 32- and 64-bit transfers.

* special-case zero in `lower_constant_f128` in order to avoid a
  potentially slow call to `Inst::load_fp_constant128`.

* Once "Allow loads to merge into other operations during instruction
  selection in MachInst backends"
  (https://github.com/bytecodealliance/wasmtime/issues/2340) lands,
  we can use that functionality to pattern match the two-CLIF pair and
  emit a single AArch64 instruction.

* A simple filetest has been added.

There is no comprehensive testcase in this commit, because that is a separate
repo.  The implementation has been tested, nevertheless.
This commit is contained in:
Julian Seward
2020-11-03 17:34:07 +01:00
committed by julian-seward1
parent 285edeec3e
commit dd9bfcefaa
9 changed files with 144 additions and 35 deletions

View File

@@ -1426,6 +1426,18 @@ pub fn translate_operator<FE: FuncEnvironment + ?Sized>(
let (load, dfg) = builder.ins().Load(opcode, result_ty, flags, offset, base);
state.push1(dfg.first_result(load))
}
Operator::V128Load32Zero { memarg } | Operator::V128Load64Zero { memarg } => {
translate_load(
memarg,
ir::Opcode::Load,
type_of(op).lane_type(),
builder,
state,
environ,
)?;
let as_vector = builder.ins().scalar_to_vector(type_of(op), state.pop1());
state.push1(as_vector)
}
Operator::I8x16ExtractLaneS { lane } | Operator::I16x8ExtractLaneS { lane } => {
let vector = pop1_with_bitcast(state, type_of(op), builder);
let extracted = builder.ins().extractlane(vector, lane.clone());
@@ -1790,10 +1802,6 @@ pub fn translate_operator<FE: FuncEnvironment + ?Sized>(
Operator::ReturnCall { .. } | Operator::ReturnCallIndirect { .. } => {
return Err(wasm_unsupported!("proposed tail-call operator {:?}", op));
}
Operator::V128Load32Zero { .. } | Operator::V128Load64Zero { .. } => {
return Err(wasm_unsupported!("proposed SIMD operator {:?}", op));
}
};
Ok(())
}
@@ -2516,7 +2524,8 @@ fn type_of(operator: &Operator) -> Type {
| Operator::I32x4MaxU
| Operator::F32x4ConvertI32x4S
| Operator::F32x4ConvertI32x4U
| Operator::I32x4Bitmask => I32X4,
| Operator::I32x4Bitmask
| Operator::V128Load32Zero { .. } => I32X4,
Operator::I64x2Splat
| Operator::V128Load64Splat { .. }
@@ -2528,7 +2537,8 @@ fn type_of(operator: &Operator) -> Type {
| Operator::I64x2ShrU
| Operator::I64x2Add
| Operator::I64x2Sub
| Operator::I64x2Mul => I64X2,
| Operator::I64x2Mul
| Operator::V128Load64Zero { .. } => I64X2,
Operator::F32x4Splat
| Operator::F32x4ExtractLane { .. }