Commit Graph

17 Commits

Author SHA1 Message Date
Anton Kirilov
f59b274d22 Cranelift AArch64: Further vector constant improvements
Introduce support for MOVI/MVNI with 16-, 32-, and 64-bit elements,
and the vector variant of FMOV.

Copyright (c) 2020, Arm Limited.
2020-12-03 15:30:24 +00:00
Chris Fallin
d413b907b4 Merge pull request #2414 from jgouly/extend-refactor
arm64: Refactor Inst::Extend handling
2020-11-25 17:22:07 -08:00
Chris Fallin
712ff22492 AArch64 SIMD: pattern-match load+splat into LD1R instruction. 2020-11-16 15:59:28 -08:00
Joey Gouly
70cbc4ca7c arm64: Refactor Inst::Extend handling
This refactors the handling of Inst::Extend and simplifies the lowering
of Bextend and Bmask, which allows the use of SBFX instructions for
extensions from 1-bit booleans. Other extensions use aliases of BFM,
and the code was changed to reflect that, rather than hard coding bit
patterns. Also ImmLogic is now implemented, so another hard coded
instruction can be removed.

As part of looking at boolean handling, `normalize_boolean_result` was
changed to `materialize_boolean_result`, such that it can use either
CSET or CSETM. Using CSETM saves an instruction (previously CSET + SUB)
for booleans bigger than 1-bit.

Copyright (c) 2020, Arm Limited.
2020-11-13 16:17:25 +00:00
Chris Fallin
113d061129 Merge pull request #2369 from akirilov-arm/move_fix
Cranelift AArch64: Various small fixes
2020-11-12 14:59:10 -08:00
Chris Fallin
89dbc4590d Merge pull request #2363 from cfallin/extend-only-if-abi
Do value-extensions at ABI boundaries only when ABI requires it.
2020-11-12 12:26:20 -08:00
Anton Kirilov
edaada3f57 Cranelift AArch64: Various small fixes
* Use FMOV to move 64-bit FP registers and SIMD vectors.
* Add support for additional vector load types.
* Fix the printing of Inst::LoadAddr.

Copyright (c) 2020, Arm Limited.
2020-11-12 13:54:05 +00:00
Chris Fallin
997b654235 Merge pull request #2393 from jgouly/constant-addend
arm64: Fold some constants into load instructions
2020-11-11 11:23:21 -08:00
Joey Gouly
a5011e8212 arm64: Fold some constants into load instructions
This changes the following:
  mov x0, #4
  ldr x0, [x1, #4]

Into:
  ldr x0, [x1]

I noticed this pattern (but with #0), in a benchmark.

Copyright (c) 2020, Arm Limited.
2020-11-11 18:47:43 +00:00
Julian Seward
41e87a2f99 Support wasm select instruction with V128-typed operands on AArch64.
* this requires upgrading to wasmparser 0.67.0.

* There are no CLIF side changes because the CLIF `select` instruction is
  polymorphic enough.

* on aarch64, there is unfortunately no conditional-move (csel) instruction on
  vectors.  This patch adds a synthetic instruction `VecCSel` which *does*
  behave like that.  At emit time, this is emitted as an if-then-else diamond
  (4 insns).

* aarch64 implementation is otherwise straightforwards.
2020-11-11 18:45:24 +01:00
Chris Fallin
a2bbb198de Do value-extensions at ABI boundaries only when ABI requires it.
There has been some confusion over the meaning of the "sign-extend"
(`sext`) and "zero-extend" (`uext`) attributes on parameters and return
values in signatures. According to the three implemented backends, these
attributes indicate that a value narrower than a full register should
always be extended in the way specified. However, they are much more
useful if they mean "extend in this way if the ABI requires extending":
only the ABI backend knows whether or not a particular ABI (e.g., x64
SysV vs. x64 Baldrdash) requires extensions, while only the frontend
(CLIF generator) knows whether or not a value is signed, so the two have
to work in concert.

This is the result of some very helpful discussion in #2354 (thanks to
@uweigand for raising the issue and @bjorn3 for helping to reason about
it).

This change respects the extension attributes in the above way, rather
than unconditionally extending, to avoid potential performance
degradation as we introduce more extension attributes on signatures.
2020-11-05 11:54:35 -08:00
Julian Seward
dd9bfcefaa CL/aarch64: implement the wasm SIMD v128.load{32,64}_zero instructions.
This patch implements, for aarch64, the following wasm SIMD extensions.

  v128.load32_zero and v128.load64_zero instructions
  https://github.com/WebAssembly/simd/pull/237

The changes are straightforward:

* no new CLIF instructions.  They are translated into an existing CLIF scalar
  load followed by a CLIF `scalar_to_vector`.

* the comment/specification for CLIF `scalar_to_vector` has been changed to
  match the actual intended semantics, per consulation with Andrew Brown.

* translation from `scalar_to_vector` to aarch64 `fmov` instruction.  This
  has been generalised slightly so as to allow both 32- and 64-bit transfers.

* special-case zero in `lower_constant_f128` in order to avoid a
  potentially slow call to `Inst::load_fp_constant128`.

* Once "Allow loads to merge into other operations during instruction
  selection in MachInst backends"
  (https://github.com/bytecodealliance/wasmtime/issues/2340) lands,
  we can use that functionality to pattern match the two-CLIF pair and
  emit a single AArch64 instruction.

* A simple filetest has been added.

There is no comprehensive testcase in this commit, because that is a separate
repo.  The implementation has been tested, nevertheless.
2020-11-04 20:00:04 +01:00
Anton Kirilov
207779fe1d Cranelift AArch64: Improve code generation for vector constants
In particular, introduce initial support for the MOVI and MVNI
instructions, with 8-bit elements. Also, treat vector constants
as 32- or 64-bit floating-point numbers, if their value allows
it, by relying on the architectural zero extension. Finally,
stop generating literal loads for 32-bit constants.

Copyright (c) 2020, Arm Limited.
2020-10-30 13:16:12 +00:00
Chris Fallin
71768bb6cf Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs.
This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Fixes #2254.
2020-10-06 14:44:02 -07:00
Joey Gouly
eec60c9b06 arm64: Use SignedOffset rather than PreIndexed addressing mode for callee-saved registers
This also passes `fixed_frame_storage_size` (previously `total_sp_adjust`)
into `gen_clobber_save` so that it can be combined with other stack
adjustments.

Copyright (c) 2020, Arm Limited.
2020-10-02 16:22:55 +01:00
Anton Kirilov
d18de69e5a AArch64: Add test cases for callee-saved SIMD & FP registers
Copyright (c) 2020, Arm Limited.
2020-09-30 14:19:02 +01:00
Andrew Brown
b43f4a464a refactor: move all 'filetests/vcode' tests to 'filetests/isa' 2020-09-29 09:27:39 -07:00