Commit Graph

1368 Commits

Author SHA1 Message Date
Afonso Bordado
e85eb77c45 aarch64: Implement missing atomic rmw ops 2021-06-25 07:51:46 +01:00
Chris Fallin
652f21e3e0 Merge pull request #3026 from afonso360/aarch64-elf-tls
aarch64: Implement TLS ELF GD Relocations
2021-06-24 11:54:34 -07:00
Chris Fallin
7d47ba12c5 Merge pull request #3028 from cfallin/x86-legacy
cranelift-codegen: move old x86 and RISC-V backends to isa/legacy/.
2021-06-24 11:38:08 -07:00
Chris Fallin
4b2723abb0 cranelift-codegen: move old x86 and RISC-V backends to isa/legacy/.
These backends will be removed in the future (see
bytecodealliance/rfcs#12 and the pending #3009 in this repo).

In the meantime, to more clearly communicate that they are using
"legacy" APIs and will eventually be removed, this PR places them in an
`isa/legacy/` subdirectory. No functional changes otherwise.
2021-06-24 11:03:47 -07:00
Afonso Bordado
7a5948f729 aarch64: Implement lowering i128 select 2021-06-24 16:19:25 +01:00
Afonso Bordado
b8ad99e435 aarch64: Implement TLS ELF GD Relocations
Implement the `TlsValue` opcode in the aarch64 backend for ELF_GD.

This is a little bit unusual as the default TLS mechanism for aarch64 is TLS Descriptors in other compilers.
However currently we only recognize elf_gd so lets start with that as a TLS implementation.
2021-06-24 12:21:44 +01:00
Chris Fallin
b8c0ac72f1 Merge pull request #3012 from uweigand/s390x-addcarry
s390x: Basic support for IaddIfcout
2021-06-22 12:19:24 -07:00
Ulrich Weigand
3c678a7900 s390x: Basic support for IaddIfcout
This adds enough support for the IaddIfcout opcode to make the
code emitted by dynamic_addr work on s390x.

Note: On s390x, the condition code mask that has to be used to
implement unsigned_add_overflow_condition does not match any of
the masks for the "normal" condition codes, so this design is
not really a good match for s390x ...
2021-06-22 20:36:06 +02:00
Chris Fallin
4a6594c514 Merge pull request #3011 from cfallin/bint-x64
Fix `bint` on x64, and make `bextend` consistent with bool representation.
2021-06-22 11:26:20 -07:00
Chris Fallin
efe3930215 Fix bint on x64, and make bextend consistent with bool representation.
There has been occasional confusion with the representation that we use
for bool-typed values in registers, at least when these are wider than
one bit. Does a `b8` store `true` as 1, or as all-ones (`0xff`)?

We've settled on the latter because of some use-cases where the wide
bool becomes a mask -- see #2058 for more on this.

This is fine, and transparent, to most operations within CLIF, because
the bool-typed value still has only two semantically-visible states,
namely `true` and `false`.

However, we have to be careful with bool-to-int conversions. `bint` on
aarch64 correctly masked the all-ones value down to 0 or 1, as required
by the instruction specification, but on x64 it did not. This PR fixes
that bug and makes x64 consistent with aarch64.

While staring at this code I realized that `bextend` was also not
consistent with the all-ones invariant: it should do a sign-extend, not
a zero-extend as it previously did. This is also rectified and tested.
(Aarch64 also already had this case implemented correctly.)

Fixes #3003.
2021-06-22 10:56:56 -07:00
Chris Fallin
4b25e3e10a Merge pull request #3014 from uweigand/fix-srcloc
Fix updating srclocs in truncate_last_branch
2021-06-22 10:26:11 -07:00
Chris Fallin
fa1a04d002 Merge pull request #3005 from afonso360/aarch64-i128-extend
aarch64: Implement uextend/sextend for  i128 values
2021-06-22 10:24:30 -07:00
Ulrich Weigand
a90ab8a0cf Fix updating srclocs in truncate_last_branch
The truncate_last_branch removes an instruction that had already
been added to the buffer, and must update various bookkeeping.

However, updating the "srclocs" field is incorrect: if there is
a srclocs entry that spans both the removed branch *and some
previous instruction*, that whole srclocs entry is removed,
which makes those previous instructions now uncovered by any
srclocs record.  This can cause subsequent problems e.g. if
one of those instructions traps.

Fixed by just truncating instead of fully removing the srclocs
record in this case.
2021-06-22 13:53:47 +02:00
Afonso Bordado
f25f5b2732 aarch64: Implement lowering uextend/sextend for i128 values 2021-06-22 12:24:07 +01:00
Chris Fallin
18cd2f681c Merge pull request #3002 from afonso360/aarch64-i128-br
aarch64 implement brz,brnz,br_icmp for i128 values
2021-06-21 10:52:50 -07:00
Chris Fallin
443eb7a843 Merge pull request #3007 from bjorn3/hand_written_legalization
Use hand written legalizations in simple_legalize
2021-06-21 09:59:35 -07:00
Chris Fallin
444d9f9726 Merge pull request #3008 from afonso360/aarch64-i128-ireduce
aarch64: Implement ireduce for i128 values
2021-06-21 09:54:43 -07:00
Chris Fallin
4246e69c1c Merge pull request #3004 from afonso360/aarch64-i128-rotates
aarch64: Implement lowering rotl/rotr for i128 values
2021-06-21 09:53:57 -07:00
Afonso Bordado
151ad2f338 aarch64: Implement ireduce for i128 values 2021-06-20 19:04:45 +01:00
bjorn3
fa6e52848c Fix warnings 2021-06-20 18:52:48 +02:00
bjorn3
1415dd824a Remove all dsl legalizations for arm32, arm64 and s390x 2021-06-20 18:43:45 +02:00
bjorn3
50e53a0a73 Use hand written legalizations in simple_legalize
This allows removal of the legalization dsl once the old backends are
removed.
2021-06-20 18:43:38 +02:00
Afonso Bordado
f7f52445c8 aarch64: Implement lowering rotl/rotr for i128 values 2021-06-20 15:53:56 +01:00
Afonso Bordado
45faace329 aarch64: Implement i128 br_icmp
The previous commit deduplicated the icmp impl, so we reuse that
but make modifications where we don't need to set the results.
2021-06-19 22:01:33 +01:00
Afonso Bordado
b5708b4386 aarch64: Deduplicate lowering icmp
Lowering icmp was duplicated across callers that only cared about
flags, and callers that only cared about the bool result.

Merge both callers into `lower_icmp` which does the correct thing
depending on a new IcmpOutput parameter.
2021-06-19 08:32:37 +01:00
Anton Kirilov
b09b123a9e Enable the simd_i8x16_arith2 test for AArch64
Copyright (c) 2021, Arm Limited.
2021-06-18 14:09:52 +01:00
Afonso Bordado
a26be628bc aarch64: Implement lowering brz,brnz for i128 values 2021-06-18 00:21:54 +01:00
Chris Fallin
5ddf562309 Merge pull request #2991 from uweigand/s390x-z14
s390x: Add z14 support
2021-06-17 08:24:23 -07:00
Chris Fallin
de1edd4976 Merge pull request #2985 from afonso360/aarch64-i128-load-store
aarch64: Implement I128 Loads and Stores
2021-06-17 08:23:15 -07:00
Afonso Bordado
c82764605f aarch64: Add i128 load & store tests and refactor address calculation
The previous address calculation code had a bug where we tried to
add offsets into a temporary register before defining it, causing
the regalloc to complain.
2021-06-17 15:50:08 +01:00
Ulrich Weigand
def54fb1fa s390x: Add z14 support
* Add support for processor features (including auto-detection).

* Move base architecture set requirement back to z14.

* Add z15 feature sets and re-enable z15-specific code generation
  when required features are available.
2021-06-17 10:23:15 +02:00
Afonso Bordado
9fc89d2316 aarch64: Add bitrev,clz,cls,ctz for i128 values 2021-06-16 10:44:10 +01:00
Afonso Bordado
09fec151eb aarch64: Add popcnt for i128 values 2021-06-16 10:44:10 +01:00
Alex Crichton
5140fd251a Update wasm-tools crates (#2989)
* Update wasm-tools crates

This brings in recent updates, notably including more improvements to
wasm-smith which will hopefully help exercise non-trapping wasm more.

* Fix some wat
2021-06-15 22:56:10 -05:00
Ulrich Weigand
46b73431ca s390x: Add support for atomic operations (part 1)
This adds full back-end support for the Fence, AtomicLoad
and AtomicStore operations, and partial support for the
AtomicCas and AtomicRmw operations.

The missing pieces include sub-word operations, operations
on little-endian memory requiring byte-swapping, and some
of the subtypes of AtomicRmw -- everything that cannot be
implemented without a compare-and-swap loop.  This will be
done in a follow-up patch.

This patch already suffices to make the test suite green
again after a recent change that now requires atomic
operations when accessing the heap.
2021-06-15 17:12:11 +02:00
Afonso Bordado
1c05e06bd5 aarch64: Implement I128 Loads and Stores 2021-06-14 21:56:53 +01:00
Chris Fallin
3d56728b86 Merge pull request #2975 from afonso360/aarch64-icmp
aarch64: Implement lowering i128 icmp instructions
2021-06-09 15:38:41 -07:00
Afonso Bordado
2643d2654c aarch64: Implement lowering i128 icmp instructions
We have 3 different aproaches depending on the type of comparision requested:
* For eq/ne we compare the high bits and low bits and check
  if they are equal
* For overflow checks, we perform a i128 add and check the
  resulting overflow flag
* For the remaining comparisions (gt/lt/sgt/etc...)
  We compare both the low bits and high bits, and if the high bits are
  equal we return the result of the unsigned comparision on the low bits

As with other i128 ops, we are still missing immlogic support.
2021-06-09 23:02:55 +01:00
Afonso Bordado
4d085d8fbf aarch64: Add sbcs instruction encodings 2021-06-09 22:56:39 +01:00
Afonso Bordado
61f07d79a7 aarch64: Add adcs instruction encodings 2021-06-09 22:56:39 +01:00
Afonso Bordado
b1475f32a6 aarch64: Add ishl,ushr,sshr for i128 values 2021-06-09 22:48:14 +01:00
Afonso Bordado
2c4d1c0003 aarch64: Add ands instruction encoding 2021-06-09 22:38:01 +01:00
Afonso Bordado
c38a5e8b62 aarch64: Add basic i128 bit ops to the AArch64 backend
Currently we just basically use a two instruction version of the same i64 ops.
IMMLogic doesn't really support multiple register inputs, so its left as a TODO for future optimizations.
2021-06-09 22:37:55 +01:00
Alex Crichton
e8b8947956 Bump to 0.28.0 (#2972) 2021-06-09 14:00:13 -05:00
Chris Fallin
ffb92d9109 Merge pull request #2966 from akirilov-arm/simd_int_to_int_extend
Enable the simd_int_to_int_extend test for AArch64
2021-06-06 23:34:52 -07:00
Johnnie Birch
1770880e19 x64: add support for packed promote and demote (#2783)
* Add support for x64 packed promote low

* Add support for x64 packed floating point demote

* Update vector promote low and demote by adding constraints

Also does some renaming and minor refactoring
2021-06-04 15:59:20 -07:00
Anton Kirilov
5e8a8fe5a0 Enable the simd_int_to_int_extend test for AArch64
Copyright (c) 2021, Arm Limited.
2021-06-04 16:10:02 +01:00
Andrew Brown
8dc4cc9fe3 x64: fix AVX512 flag checks
Previously, the multiple flags for certain AVX512 instructions were
checked using `OR`: e.g., if the CPU has AVX512VL `OR` AVX512DQ,
emit `VPMULLQ`. This is incorrect--the logic should be `AND`. The Intel
Software Developer Manual, vol. 1, sec. 15.4, has more information on
this (notable there is the suggestion to check with `XGETBV` that the OS
is allowing the use of the XMM registers--but that is a separate issue).
This change switches to `AND` logic in the new backend.
2021-06-01 11:41:16 -07:00
Andrew Brown
2a9f458ea3 x64: lower i8x16.shuffle to VPERMI2B when possible
When shuffling values from two different registers, the x64 lowering for
`i8x16.shuffle` must first shuffle each register separately and then OR
the results with SSE instructions. With `VPERMI2B`, available in
AVX512VL + AVX512VBMI, this can be done in a single instruction after
the shuffle mask has been moved into the destination register. This
change uses `VPERMI2B` for that case when the CPU supports it.
2021-06-01 11:40:53 -07:00
Benjamin Bouvier
51edea9e57 cranelift: introduce a new WasmtimeAppleAarch64 calling convention
The previous choice to use the WasmtimeSystemV calling convention for
apple-aarch64 devices was incorrect: padding of arguments was
incorrectly computed. So we have to use some flavor of the apple-aarch64
ABI there.

Since we want to support the wasmtime custom convention for multiple
returns on apple-aarch64 too, a new custom Wasmtime calling convention
was introduced to support this.
2021-06-01 17:29:12 +02:00