Commit Graph

79 Commits

Author SHA1 Message Date
Trevor Elliott
ab4be2bdd1 ISLE: Resolve overlaps in the aarch64 backend (#4988) 2022-09-30 12:57:50 -07:00
Damian Heaton
3a2b32bf4d Port branches to ISLE (AArch64) (#4943)
* Port branches to ISLE (AArch64)

Ported the existing implementations of the following opcodes for AArch64
to ISLE:
- `Brz`
- `Brnz`
- `Brif`
- `Brff`
- `BrIcmp`
- `Jump`
- `BrTable`

Copyright (c) 2022 Arm Limited

* Remove dead code

Copyright (c) 2022 Arm Limited
2022-09-26 09:45:32 +01:00
Damian Heaton
3f8cccfb59 Port flag-based ops to ISLE (AArch64) (#4942)
Ported the existing implementations of the following opcodes for AArch64
to ISLE:
- `Trueif`
- `Trueff`
- `Trapif`
- `Trapff`
- `Select`
- `Selectif`
- `SelectifSpectreGuard`

Copyright (c) 2022 Arm Limited
2022-09-22 15:44:32 -07:00
Damian Heaton
352c7595c6 Improve fcvt_to_{u,s}int_sat lowering (AArch64) (#4913)
Improved the instruction lowering for the following opcodes on AArch64,
and introduced support for converting to integers less than 32-bits wide
as per the docs:
- `FcvtToSintSat`
- `FcvtToUintSat`

Copyright (c) 2022 Arm Limited
2022-09-21 10:16:09 -07:00
Damian Heaton
e786bda002 Vector bitcast support (AArch64 & Interpreter) (#4820)
* Vector bitcast support (AArch64 & Interpreter)

Implemented support for `bitcast` on vector values for AArch64 and the
interpreter.

Also corrected the verifier to ensure that the size, in bits, of the input and
output types match for a `bitcast`, per the docs.

Copyright (c) 2022 Arm Limited

* `I128` same-type bitcast support

Copyright (c) 2022 Arm Limited

* Directly return input for 64-bit GPR<=>GPR bitcast

Copyright (c) 2022 Arm Limited
2022-09-21 09:20:28 -07:00
Damian Heaton
e9b08b856d Port icmp to ISLE (AArch64) (#4898)
* Port `icmp` to ISLE (AArch64)

Ported the existing implementation of `icmp` (and, by extension, the
`lower_icmp` function) to ISLE for AArch64.

Copyright (c) 2022 Arm Limited

* Allow 'producer chains', eliminating `Nop0`s

Copyright (c) 2022 Arm Limited
2022-09-13 08:56:50 -07:00
Chris Fallin
2986f6b0ff ABI: implement register arguments with constraints. (#4858)
* ABI: implement register arguments with constraints.

Currently, Cranelift's ABI code emits a sequence of moves from physical
registers into vregs at the top of the function body, one for every
register-carried argument.

For a number of reasons, we want to move to operand constraints instead,
and remove the use of explicitly-named "pinned vregs"; this allows for
better regalloc in theory, as it removes the need to "reverse-engineer"
the sequence of moves.

This PR alters the ABI code so that it generates a single "args"
pseudo-instruction as the first instruction in the function body. This
pseudo-inst defs all register arguments, and constrains them to the
appropriate registers at the def-point. Subsequently the regalloc can
move them wherever it needs to.

Some care was taken not to have this pseudo-inst show up in
post-regalloc disassemblies, but the change did cause a general regalloc
"shift" in many tests, so the precise-output updates are a bit noisy.
Sorry about that!

A subsequent PR will handle the other half of the ABI code, namely, the
callsite case, with a similar preg-to-constraint conversion.

* Update based on review feedback.

* Review feedback.
2022-09-08 18:03:14 -07:00
Anton Kirilov
d8b290898c Initial forward-edge CFI implementation (#3693)
* Initial forward-edge CFI implementation

Give the user the option to start all basic blocks that are targets
of indirect branches with the BTI instruction introduced by the
Branch Target Identification extension to the Arm instruction set
architecture.

Copyright (c) 2022, Arm Limited.

* Refactor `from_artifacts` to avoid second `make_executable` (#1)

This involves "parsing" twice but this is parsing just the header of an
ELF file so it's not a very intensive operation and should be ok to do
twice.

* Address the code review feedback

Copyright (c) 2022, Arm Limited.

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-09-08 09:35:58 -05:00
Anton Kirilov
dd07e354b4 Cranelift AArch64: Fix the get_return_address lowering (#4851)
The previous implementation assumed that nothing had clobbered the
LR register since the current function had started executing, so
it would be incorrect for a non-leaf function, for example, that
contains the `get_return_address` operation right after a call.
The operation is valid only if the `preserve_frame_pointers` flag
is enabled, which implies that the presence of a frame record on
the stack is guaranteed.

Copyright (c) 2022, Arm Limited.
2022-09-07 11:09:22 -07:00
Anton Kirilov
48bf078c83 Cranelift AArch64: Fix the atomic memory operations (#4831)
Previously the implementations of the various atomic memory IR operations
ignored the memory operation flags that were passed.

Copyright (c) 2022, Arm Limited.

Co-authored-by: Chris Fallin <chris@cfallin.org>
2022-09-02 09:35:21 -07:00
Anton Kirilov
d2e19b8d74 Cranelift AArch64: Migrate AMode to ISLE (#4832)
Copyright (c) 2022, Arm Limited.

Co-authored-by: Chris Fallin <chris@cfallin.org>
2022-09-02 00:24:46 +00:00
Chris Fallin
ae5fe8a728 aarch64: fix up regalloc2 semantics. (#4830)
This PR removes all uses of modify-operands in the aarch64 backend,
replacing them with reused-input operands instead. This has the nice
effect of removing a bunch of move instructions and more clearly
representing inputs and outputs.

This PR also removes the explicit use of pinned vregs in the aarch64
backend, instead using fixed-register constraints on the operands when
insts or pseudo-inst sequences require certain registers.

This is the second PR in the regalloc-semantics cleanup series; after
the remaining backend (s390x) and the ABI code are cleaned up as well,
we'll be able to simplify the regalloc2 frontend.
2022-09-01 21:25:20 +00:00
Chris Fallin
1a59b3e6c6 AArch64: port tls_value to ISLE. (#4821) 2022-08-30 16:51:15 +00:00
Damian Heaton
3d9d759380 Port fcmp to ISLE (AArch64) (#4819)
Ported the existing implementation of `fcmp` for AArch64 to ISLE.

This also ports the `lower_vector_comparison` method to ISLE.

Copyright (c) 2022 Arm Limited
2022-08-30 09:06:15 -07:00
Chris Fallin
955d4e4ba1 AArch64: port load and store operations to ISLE. (#4785)
This retains `lower_amode` in the handwritten code (@akirilov-arm
reports that there is an upcoming patch to port this), but tweaks it
slightly to take a `Value` rather than an `Inst`.
2022-08-29 17:45:55 -07:00
Chris Fallin
a6eb24bd4f AArch64: port misc ops to ISLE. (#4796)
* Add some precise-output compile tests for aarch64.

* AArch64: port misc ops to ISLE.

- get_pinned_reg / set_pinned_reg
- bitcast
- stack_addr
- extractlane
- insertlane
- vhigh_bits
- iadd_ifcout
- fcvt_low_from_sint
2022-08-29 12:56:39 -07:00
Chris Fallin
8e8dfdf5f9 AArch64: Migrate calls and returns to ISLE. (#4788) 2022-08-26 16:26:39 -07:00
Damian Heaton
94bcbe8446 Port Fcopysign..FcvtToSintSat to ISLE (AArch64) (#4753)
* Port `Fcopysign`..``FcvtToSintSat` to ISLE (AArch64)

Ported the existing implementations of the following opcodes to ISLE on
AArch64:
- `Fcopysign`
  - Also introduced missing support for `fcopysign` on vector values, as
    per the docs.
  - This introduces the vector encoding for the `SLI` machine
    instruction.
- `FcvtToUint`
- `FcvtToSint`
- `FcvtFromUint`
- `FcvtFromSint`
- `FcvtToUintSat`
- `FcvtToSintSat`

Copyright (c) 2022 Arm Limited

* Document helpers and abstract conversion checks
2022-08-24 10:37:14 -07:00
Damian Heaton
3b68d76905 Port widening ops to ISLE (AArch64) (#4751)
Ported the existing implementations of the following opcodes for AArch64
to ISLE, and implemented support for 64-bit vectors (per the docs):
- `SwidenLow`
- `SwidenHigh`
- `UwidenLow`
- `UwidenHigh`

Also ported `WideningPairwiseDotProductS` as-is.

Copyright (c) 2022 Arm Limited
2022-08-23 09:42:11 -07:00
Damian Heaton
da1fb305a3 Port vconst to ISLE (AArch64) (#4750)
* Port `vconst` to ISLE (AArch64)

Ported the existing implementation of `vconst` to ISLE for AArch64, and
added support for 64-bit vector constants.

Also introduced 64-bit `vconst` support to the interpreter.

Copyright (c) 2022 Arm Limited

* Replace if-chains with match statements

Copyright (c) 2022 Arm Limited
2022-08-23 09:40:11 -07:00
Damian Heaton
418dbc15bd Port FuncAddr & SymbolValue to ISLE (AArch64) (#4748)
Ported the existing implementations of the following opcodes for AArch64
to ISLE:
- `FuncAddr`
- `SymbolValue`

Copyright (c) 2022 Arm Limited
2022-08-22 14:06:31 -07:00
Anton Kirilov
1481721c9d Enable back-edge CFI by default on macOS (#4720)
Also, adjust the tests that are executed on that platform. Finally,
fix a bug with obtaining backtraces when back-edge CFI is enabled.

Copyright (c) 2022, Arm Limited.
2022-08-17 15:06:20 -05:00
Afonso Bordado
863cbc345c cranelift: Fix icmp.i128 eq for aarch64 (#4706)
* cranelift: Fix `icmp.i128 eq` for aarch64

* cranelift: Use ccmp in `icmp.i128 eq` for aarch64
2022-08-15 11:11:22 -07:00
Damian Heaton
e463890f26 Port AvgRound & SqmulRoundSat to ISLE (AArch64) (#4639)
Ported the existing implementations of the following opcodes on AArch64
to ISLE:
- `AvgRound`
  - Also introduced support for `i64x2` vectors, as per the docs.
- `SqmulRoundSat`

Copyright (c) 2022 Arm Limited
2022-08-08 11:35:43 -07:00
Damian Heaton
47a67d752b Split Fmla and Bsl out into new VecRRRMod op (#4638)
Separates the following opcodes for AArch64 into a separate `VecALUModOp` enum,
which is emitted via the `VecRRRMod` instruction. This separates vector ALU
instructions which modify a register from instructions which write to a new register:
- `Bsl`
- `Fmla`

Addresses [a discussion](https://github.com/bytecodealliance/wasmtime/pull/4608#discussion_r937975581) in #4608.

Copyright (c) 2022 Arm Limited
2022-08-08 11:33:13 -07:00
Damian Heaton
eb332b8369 Convert fma, valltrue & vanytrue to ISLE (AArch64) (#4608)
* Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64)

Ported the existing implementations of the following opcodes to ISLE on
AArch64:
- `fma`
  - Introduced missing support for `fma` on vector values, as per the
    docs.
- `valltrue`
- `vanytrue`

Also fixed `fcmp` on scalar values in the interpreter, and enabled
interpreter tests in `simd-fma.clif`.

This introduces the `FMLA` machine instruction.

Copyright (c) 2022 Arm Limited

* Add comments for `Fmla` and `Bsl`

Copyright (c) 2022 Arm Limited
2022-08-05 09:47:56 -07:00
Damian Heaton
12a9705fbc Port Shuffle to ISLE (AArch64) (#4596)
* Port `Shuffle` to ISLE (AArch64)

Ported the existing implementation of `Shuffle` for AArch64 to ISLE.

Copyright (c) 2022 Arm Limited

* Cleanup by shadowing `rn`, `rn2`, and `_`

Copyright (c) 2022 Arm Limited
2022-08-04 08:43:23 -07:00
Anton Kirilov
a897742593 Initial back-edge CFI implementation (#3606)
Give the user the option to sign and to authenticate function
return addresses with the operations introduced by the Pointer
Authentication extension to the Arm instruction set architecture.

Copyright (c) 2021, Arm Limited.
2022-08-03 11:08:29 -07:00
Nick Fitzgerald
42bba452a6 Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573)
* Cranelift: Add instructions for getting the current stack/frame pointers and return address

This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535

* x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead

We just special case getting operands from `Amode`s now.

* Fix s390x `get_return_address`; require `preserve_frame_pointers=true`

* Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp

* Handle non-allocatable registers in Amode::with_allocs

* Use "stack" instead of "r15" on s390x

* r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`
2022-08-02 14:37:17 -07:00
Chris Fallin
43f1765272 Cranellift: remove Baldrdash support and related features. (#4571)
* Cranellift: remove Baldrdash support and related features.

As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team
has recently determined that their current form of integration with
Cranelift is too hard to maintain, and they have chosen to remove it
from their codebase. If and when they decide to build updated support
for Cranelift, they will adopt different approaches to several details
of the integration.

In the meantime, after discussion with the SpiderMonkey folks, they
agree that it makes sense to remove the bits of Cranelift that exist
to support the integration ("Baldrdash"), as they will not need
them. Many of these bits are difficult-to-maintain special cases that
are not actually tested in Cranelift proper: for example, the
Baldrdash integration required Cranelift to emit function bodies
without prologues/epilogues, and instead communicate very precise
information about the expected frame size and layout, then stitched
together something post-facto. This was brittle and caused a lot of
incidental complexity ("fallthrough returns", the resulting special
logic in block-ordering); this is just one example. As another
example, one particular Baldrdash ABI variant processed stack args in
reverse order, so our ABI code had to support both traversal
orders. We had a number of other Baldrdash-specific settings as well
that did various special things.

This PR removes Baldrdash ABI support, the `fallthrough_return`
instruction, and pulls some threads to remove now-unused bits as a
result of those two, with the  understanding that the SpiderMonkey folks
will build new functionality as needed in the future and we can perhaps
find cleaner abstractions to make it all work.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425

* Review feedback.

* Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.

The debugger tests invoke `wasmtime` from within each test case under
the control of a debugger (gdb or lldb). Some of these tests started to
inexplicably fail in CI with unrelated changes, and the failures were
only inconsistently reproducible locally. It seems to be cache related:
if we disable cached compilation on the nested `wasmtime` invocations,
the tests consistently pass.

* Review feedback.
2022-08-02 19:37:56 +00:00
Nick Fitzgerald
c77bec4dcb Cranelift: don't emit inside lowering rules for aarch64 (#4572)
* Cranelift: Don't `emit` inside lowering rules in aarch64

The lowering rules should be "pure" and side-effect free, using helpers defined
in `inst.isle` to perform actual side effects like emitting instructions.

* Cranelift: use 80 width for section separators in aarch64 lowering rules
2022-08-01 16:43:42 -07:00
Trevor Elliott
586ec95c11 ISLE: Allow shadowing in let expressions (#4562)
* Support shadowing in isle

* Re-run the isle build.rs if the examples change

* Print error messages when isle tests fail

* Move run tests

* Refactor `let` uses that don't need to introduce unique names
2022-08-01 21:10:28 +00:00
Anton Kirilov
a47a82d2e5 Cranelift AArch64: Harden the Spectre mitigations (#4555)
Use the `CSDB` instruction following Arm's recommendation.

Copyright (c) 2022, Arm Limited.
2022-08-01 10:20:48 -07:00
Damian Heaton
5e3bb588a8 Port Fence, IsNull/IsInvalid & Debugtrap to ISLE (AArch64) (#4548)
Ported the existing implementation of the following Opcodes for AArch64
to ISLE:
- `Fence`
- `IsNull`
- `IsInvalid`
- `Debugtrap`

Copyright (c) 2022 Arm Limited
2022-07-28 15:36:13 -07:00
Trevor Elliott
7ac6134894 x64: Shrink Inst from 72 to 48 bytes (#4514)
https://github.com/bytecodealliance/wasmtime/pull/4514
2022-07-27 10:39:22 -07:00
Anton Kirilov
ead6edb0c5 Cranelift AArch64: Migrate Splat to ISLE (#4521)
Copyright (c) 2022, Arm Limited.
2022-07-26 17:57:15 +00:00
Sam Parker
c5ddb4b803 [AArch64] Port SIMD narrowing to ISLE (#4478)
* [AArch64] Port SIMD narrowing to ISLE

Fvdemote, snarrow, unarrow and uunarrow.

Also refactor the aarch64 instructions descriptions to parameterize
on ScalarSize instead of using different opcodes.

The zero_value pure constructor has been introduced and used by the
integer narrow operations and it replaces, and extends, the compare
zero patterns.

Copright (c) 2022, Arm Limited.

* use short 'if' patterns
2022-07-25 12:40:36 -07:00
Damian Heaton
f1a0c40a53 Convert sqrt..nearest to ISLE (AArch64) (#4508)
Converted the existing implementations for the following opcodes to ISLE
on AArch64:
- `sqrt`
- `fneg`
- `fabs`
- `fpromote`
- `fdemote`
- `ceil`
- `floor`
- `trunc`
- `nearest`

Copyright (c) 2022 Arm Limited
2022-07-22 14:48:07 -07:00
Anton Kirilov
2ba4bce5cc Merge pull request from GHSA-7f6x-jwh5-m9r4
Copyright (c) 2022, Arm Limited.
2022-07-20 11:53:56 -05:00
Damian Heaton
00ac18c866 Convert fadd..fmax_pseudo to ISLE (AArch64) (#4452)
Converted the existing implementations for the following Opcodes to ISLE on AArch64:
- `fadd`
- `fsub`
- `fmul`
- `fdiv`
- `fmin`
- `fmax`
- `fmin_pseudo`
- `fmax_pseudo`

Copyright (c) 2022 Arm Limited
2022-07-19 12:03:05 -07:00
Damian Heaton
d792646677 Implement iabs in ISLE (AArch64) (#4399)
* Implement `iabs` in ISLE (AArch64)

Converts the existing implementation of `iabs` for AArch64 into ISLE,
and fixes support for `iabs` on scalar values.

Copyright (c) 2022 Arm Limited.

* Improve scalar `iabs` implementation.

Also introduces `CSNeg` instruction.

Copyright (c) 2022 Arm Limited
2022-07-18 11:12:34 -07:00
Damian Heaton
db7f9ccd2b Convert scalar_to_vector to ISLE (AArch64) (#4401)
* Convert `scalar_to_vector` to ISLE (AArch64)

Converted the exisiting implementation of `scalar_to_vector` for AArch64 to
ISLE.

Copyright (c) 2022 Arm Limited

* Add support for floats and fix FpuExtend

- Added rules to cover `f32 -> f32x4` and `f64 -> f64x2` for
`scalar_to_vector`
- Added tests for `scalar_to_vector` on floats.
- Corrected an invalid instruction emitted by `FpuExtend` on 64-bit
values.

Copyright (c) 2022 Arm Limited
2022-07-18 11:11:54 -07:00
Sam Parker
9c43749dfe [RFC] Dynamic Vector Support (#4200)
Introduce a new concept in the IR that allows a producer to create
dynamic vector types. An IR function can now contain global value(s)
that represent a dynamic scaling factor, for a given fixed-width
vector type. A dynamic type is then created by 'multiplying' the
corresponding global value with a fixed-width type. These new types
can be used just like the existing types and the type system has a
set of hard-coded dynamic types, such as I32X4XN, which the user
defined types map onto. The dynamic types are also used explicitly
to create dynamic stack slots, which have no set size like their
existing counterparts. New IR instructions are added to access these
new stack entities.

Currently, during codegen, the dynamic scaling factor has to be
lowered to a constant so the dynamic slots do eventually have a
compile-time known size, as do spill slots.

The current lowering for aarch64 just targets Neon, using a dynamic
scale of 1.

Copyright (c) 2022, Arm Limited.
2022-07-07 12:54:39 -07:00
Damian Heaton
6a5fe20956 Convert swizzle to ISLE (AArch64) (#4400)
Converted the implementation of `swizzle` for AArch64 to ISLE.

Copyright (c) 2022 Arm Limited
2022-07-07 10:29:33 -07:00
Sam Parker
d9e0e6a6a9 [AArch64] Port min/max to ISLE (#4374)
Copyright (c) 2022, Arm Limited.
2022-07-05 09:16:45 -07:00
Sam Parker
fb61774df2 [AArch64] Port AtomicLoad and AtomicStore to ISLE (#4301)
Copyright (c) 2022, Arm Limited.
2022-06-29 12:12:48 -07:00
Anton Kirilov
25a588c35f Cranelift AArch64: Use an allocated encoding for Udf (#4281)
Preserve the current behaviour when code is generated for SpiderMonkey.

Copyright (c) 2022, Arm Limited.
2022-06-22 15:03:28 +01:00
Sam Parker
acfeda4d80 [AArch64] Port IaddPairwise to ISLE (#4201)
Copyright (c) 2022, Arm Limited.
2022-06-06 15:37:13 +01:00
Sam Parker
010e028d67 [AArch64] Port AtomicCAS to isle (#4140)
Copyright (c) 2022, Arm Limited.
2022-05-25 09:19:24 +01:00
Anton Kirilov
edf07a8da6 Cranelift AArch64: Migrate Bitselect and Vselect to ISLE (#4139)
Copyright (c) 2022, Arm Limited.
2022-05-16 09:39:28 -07:00