Commit Graph

282 Commits

Author SHA1 Message Date
Afonso Bordado
3a38400447 aarch64: Refactor lower_icmp to use a single materialize_bool_result 2021-07-19 09:31:14 -07:00
Afonso Bordado
14d1c7ee9f aarch64: Refactor lower_icmp to allow returning a different flag 2021-07-19 09:31:14 -07:00
Afonso Bordado
e628fb376f aarch64: Fix incorrect code generation for overflow icmp in i16 values 2021-07-19 09:31:14 -07:00
Afonso Bordado
db5566dadb aarch64: Fix lowering amounts for shifts
This commit addresses two issues:
* A panic when shifting any non i128 type by i128 amounts (#3064)
* Wrong results when lowering shifts with small types (i8, i16)

In these types when shifting for amounts larger than the size of the
type, we would not get the wrapping behaviour that we see on i32 and i64.
This is because in these larger types, the wrapping behaviour is automatically
implemented by using the appropriate instruction, however we do not
have i8 and i16 specific instructions, so we have to manually wrap
the shift amount with an AND instruction.

This issue is also found on x86_64 and s390x, and a separate issue will
be filed for those.

Closes #3064
2021-07-16 22:08:02 +01:00
Anton Kirilov
6c3d7092b9 Enable the simd_conversions test for AArch64
Copyright (c) 2021, Arm Limited.
2021-07-16 22:04:45 +01:00
Johnnie Birch
d8e813204e Fold fcvt_low_from_uinit into previously existing clif instructions 2021-07-09 10:39:05 -07:00
Johnnie Birch
2d676d838f Implements f64x2.convert_low_i32x4_u for x64 2021-07-09 10:39:05 -07:00
Afonso Bordado
eebae8d4c8 aarch64: Fix incorrect encoding of large const values in icmp.
When encoding constants as immediates into an RSE Imm12 instruction we need to take special care to check if the value that we are trying to input does not overflow its type when viewed as a signed value. (i.e. iconst.i8 200)

We cannot both put an immediate and sign extend it, so we need to lower it into a separate reg, and emit the sign extend into the instruction.

For more details see the [cg_clif bug report](https://github.com/bjorn3/rustc_codegen_cranelift/issues/1184#issuecomment-873214796).
2021-07-03 22:42:15 +01:00
Anton Kirilov
330f02aa09 Enable the simd_i32x4_trunc_sat_f64x2 test for AArch64
Also, reorganize the AArch64-specific VCode instructions for unary
narrowing and widening vector operations, so that they are more
straightforward to use.

Copyright (c) 2021, Arm Limited.
2021-06-30 12:17:53 +01:00
Anton Kirilov
98f1ac789e Enable the simd_i16x8_q15mulr_sat_s test on AArch64
Copyright (c) 2021, Arm Limited.
2021-06-28 12:24:31 +01:00
Afonso Bordado
e85eb77c45 aarch64: Implement missing atomic rmw ops 2021-06-25 07:51:46 +01:00
Chris Fallin
652f21e3e0 Merge pull request #3026 from afonso360/aarch64-elf-tls
aarch64: Implement TLS ELF GD Relocations
2021-06-24 11:54:34 -07:00
Afonso Bordado
7a5948f729 aarch64: Implement lowering i128 select 2021-06-24 16:19:25 +01:00
Afonso Bordado
b8ad99e435 aarch64: Implement TLS ELF GD Relocations
Implement the `TlsValue` opcode in the aarch64 backend for ELF_GD.

This is a little bit unusual as the default TLS mechanism for aarch64 is TLS Descriptors in other compilers.
However currently we only recognize elf_gd so lets start with that as a TLS implementation.
2021-06-24 12:21:44 +01:00
Chris Fallin
fa1a04d002 Merge pull request #3005 from afonso360/aarch64-i128-extend
aarch64: Implement uextend/sextend for  i128 values
2021-06-22 10:24:30 -07:00
Afonso Bordado
f25f5b2732 aarch64: Implement lowering uextend/sextend for i128 values 2021-06-22 12:24:07 +01:00
Chris Fallin
18cd2f681c Merge pull request #3002 from afonso360/aarch64-i128-br
aarch64 implement brz,brnz,br_icmp for i128 values
2021-06-21 10:52:50 -07:00
Chris Fallin
444d9f9726 Merge pull request #3008 from afonso360/aarch64-i128-ireduce
aarch64: Implement ireduce for i128 values
2021-06-21 09:54:43 -07:00
Chris Fallin
4246e69c1c Merge pull request #3004 from afonso360/aarch64-i128-rotates
aarch64: Implement lowering rotl/rotr for i128 values
2021-06-21 09:53:57 -07:00
Afonso Bordado
151ad2f338 aarch64: Implement ireduce for i128 values 2021-06-20 19:04:45 +01:00
Afonso Bordado
f7f52445c8 aarch64: Implement lowering rotl/rotr for i128 values 2021-06-20 15:53:56 +01:00
Afonso Bordado
45faace329 aarch64: Implement i128 br_icmp
The previous commit deduplicated the icmp impl, so we reuse that
but make modifications where we don't need to set the results.
2021-06-19 22:01:33 +01:00
Afonso Bordado
b5708b4386 aarch64: Deduplicate lowering icmp
Lowering icmp was duplicated across callers that only cared about
flags, and callers that only cared about the bool result.

Merge both callers into `lower_icmp` which does the correct thing
depending on a new IcmpOutput parameter.
2021-06-19 08:32:37 +01:00
Anton Kirilov
b09b123a9e Enable the simd_i8x16_arith2 test for AArch64
Copyright (c) 2021, Arm Limited.
2021-06-18 14:09:52 +01:00
Afonso Bordado
a26be628bc aarch64: Implement lowering brz,brnz for i128 values 2021-06-18 00:21:54 +01:00
Chris Fallin
de1edd4976 Merge pull request #2985 from afonso360/aarch64-i128-load-store
aarch64: Implement I128 Loads and Stores
2021-06-17 08:23:15 -07:00
Afonso Bordado
c82764605f aarch64: Add i128 load & store tests and refactor address calculation
The previous address calculation code had a bug where we tried to
add offsets into a temporary register before defining it, causing
the regalloc to complain.
2021-06-17 15:50:08 +01:00
Afonso Bordado
9fc89d2316 aarch64: Add bitrev,clz,cls,ctz for i128 values 2021-06-16 10:44:10 +01:00
Afonso Bordado
09fec151eb aarch64: Add popcnt for i128 values 2021-06-16 10:44:10 +01:00
Afonso Bordado
1c05e06bd5 aarch64: Implement I128 Loads and Stores 2021-06-14 21:56:53 +01:00
Chris Fallin
3d56728b86 Merge pull request #2975 from afonso360/aarch64-icmp
aarch64: Implement lowering i128 icmp instructions
2021-06-09 15:38:41 -07:00
Afonso Bordado
2643d2654c aarch64: Implement lowering i128 icmp instructions
We have 3 different aproaches depending on the type of comparision requested:
* For eq/ne we compare the high bits and low bits and check
  if they are equal
* For overflow checks, we perform a i128 add and check the
  resulting overflow flag
* For the remaining comparisions (gt/lt/sgt/etc...)
  We compare both the low bits and high bits, and if the high bits are
  equal we return the result of the unsigned comparision on the low bits

As with other i128 ops, we are still missing immlogic support.
2021-06-09 23:02:55 +01:00
Afonso Bordado
4d085d8fbf aarch64: Add sbcs instruction encodings 2021-06-09 22:56:39 +01:00
Afonso Bordado
61f07d79a7 aarch64: Add adcs instruction encodings 2021-06-09 22:56:39 +01:00
Afonso Bordado
b1475f32a6 aarch64: Add ishl,ushr,sshr for i128 values 2021-06-09 22:48:14 +01:00
Afonso Bordado
2c4d1c0003 aarch64: Add ands instruction encoding 2021-06-09 22:38:01 +01:00
Afonso Bordado
c38a5e8b62 aarch64: Add basic i128 bit ops to the AArch64 backend
Currently we just basically use a two instruction version of the same i64 ops.
IMMLogic doesn't really support multiple register inputs, so its left as a TODO for future optimizations.
2021-06-09 22:37:55 +01:00
Chris Fallin
ffb92d9109 Merge pull request #2966 from akirilov-arm/simd_int_to_int_extend
Enable the simd_int_to_int_extend test for AArch64
2021-06-06 23:34:52 -07:00
Johnnie Birch
1770880e19 x64: add support for packed promote and demote (#2783)
* Add support for x64 packed promote low

* Add support for x64 packed floating point demote

* Update vector promote low and demote by adding constraints

Also does some renaming and minor refactoring
2021-06-04 15:59:20 -07:00
Anton Kirilov
5e8a8fe5a0 Enable the simd_int_to_int_extend test for AArch64
Copyright (c) 2021, Arm Limited.
2021-06-04 16:10:02 +01:00
Benjamin Bouvier
51edea9e57 cranelift: introduce a new WasmtimeAppleAarch64 calling convention
The previous choice to use the WasmtimeSystemV calling convention for
apple-aarch64 devices was incorrect: padding of arguments was
incorrectly computed. So we have to use some flavor of the apple-aarch64
ABI there.

Since we want to support the wasmtime custom convention for multiple
returns on apple-aarch64 too, a new custom Wasmtime calling convention
was introduced to support this.
2021-06-01 17:29:12 +02:00
Chris Fallin
f2fe0c669e Merge pull request #2929 from cfallin/bb-offsets
Provide BB layout info externally in terms of code offsets.
2021-05-24 14:27:53 -07:00
Afonso Bordado
4ddbfe50ba aarch64: Implement imul for i128 operands 2021-05-24 18:23:30 +01:00
Chris Fallin
11a2ef01e7 Provide BB layout info externally in terms of code offsets.
This is sometimes useful when performing analyses on the generated
machine code: for example, some kinds of code verifiers will want to do
a control-flow analysis, and it is much easier to do this if one does
not have to recover the CFG from the machine code (doing so requires
heavyweight analysis when indirect branches are involved). If one trusts
the control-flow lowering and only needs to verify other properties of
the code, this can be very useful.
2021-05-24 09:18:06 -07:00
Afonso Bordado
a2e74b2c45 aarch64: Implement isub for i128 operands 2021-05-22 21:51:41 +01:00
Afonso Bordado
d3b525fa29 aarch64: Implement iadd for i128 operands 2021-05-22 21:21:44 +01:00
Chris Fallin
65e0e20210 Merge pull request #2892 from afonso360/aarch64-multireg-args
Handle i128 arguments in the aarch64 ABI
2021-05-21 16:57:42 -07:00
Afonso Bordado
fbcfffdeab Handle spilling i128 arguments into the stack in aarch64 2021-05-21 17:05:41 +01:00
Andrew Brown
84b6f05971 cranelift: remove unreachable scalar lowerings of saturating arithmetic
Since `uadd_sat`, `sadd_sat`, `usub_sat`, and `ssub_sat` are now only
available to vector types, this removes the lowering code for the
scalar versions of these instructions in the arm32 and aarch64 backends.
2021-05-17 06:54:45 -07:00
Afonso Bordado
ac624da8d9 Handle i128 arguments in the aarch64 ABI
When dealing with params that need to be split, we follow the
arch64 ABI and split the value in two, and make sure that start that
argument in an even numbered xN register.

The apple ABI does not require this, so on those platforms, we start
params anywhere.
2021-05-12 13:06:13 +01:00