Commit Graph

154 Commits

Author SHA1 Message Date
Afonso Bordado
925891245d cranelift: Fix fmin/fmax when dealing with zeroes (#4373)
`fmin`/`fmax` are defined as returning -0.0 as smaller than 0.0.
This is not how the IEEE754 views these values and the interpreter was
returning the wrong value in these operations since it was just using the
standard IEEE754 comparisons.

This also tries to preserve NaN information by avoiding passing NaN's
through any operation that could canonicalize it.
2022-07-05 12:59:23 -07:00
Afonso Bordado
e91f493ff5 cranelift: Add heap support to the interpreter (#3302)
* cranelift: Add heaps to interpreter

* cranelift: Add RunTest Environment mechanism to  test interpret

* cranelift: Remove unused `MemoryError`

* cranelift: Add docs for `State::resolve_global_value`

* cranelift: Rename heap tests

* cranelift: Refactor heap address resolution

* Fix typos and clarify docs (thanks @cfallin)
2022-07-05 09:05:26 -07:00
Afonso Bordado
2003ae99a0 Implement fma/fabs/fneg/fcopysign on the interpreter (#4367)
* cranelift: Implement `fma` on interpreter

* cranelift: Implement `fabs` on interpreter

* cranelift: Fix `fneg` implementation on interpreter

`fneg` was implemented as `0 - x` which is not correct according to the
standard since that operation makes no guarantees on what the output
is when the input is `NaN`. However for `fneg` the output for `NaN`
inputs is fully defined.

* cranelift: Implement `fcopysign` on interpreter
2022-07-05 09:03:04 -07:00
Afonso Bordado
f2e6ff5e70 cranelift: Implement sqrt in interpreter (#4362)
This ignores SIMD for now.
2022-07-01 09:39:11 -07:00
Afonso Bordado
23ae9016af cranelift: Implement scalar ireduce on interpreter (#4320) 2022-06-27 11:00:37 -07:00
Afonso Bordado
87007c5839 cranelift: Fix bint implementation on interpreter (#4299)
* cranelift: Fix `bint` implementation on interpreter

The interpreter was returning -1 instead of 1 for positive values.
This also extends the bint test suite to cover all types.

* cranelift: Restrict `bint` to scalar values only
2022-06-23 13:43:35 -07:00
Andrew Brown
bd6fe11ca9 cranelift: remove load_complex and store_complex (#3976)
This change removes all variants of `load*_complex` and `store*_complex`
from Cranelift; this is a breaking change to the instructions exposed by
CLIF. The complete list of instructions removed is: `load_complex`,
`store_complex`, `uload8_complex`, `sload8_complex`, `istore8_complex`,
`sload8_complex`, `uload16_complex`, `sload16_complex`,
`istore16_complex`, `uload32_complex`, `sload32_complex`,
`istore32_complex`, `uload8x8_complex`, `sload8x8_complex`,
`sload16x4_complex`, `uload16x4_complex`, `uload32x2_complex`,
`sload32x2_complex`.

The rationale for this removal is that the Cranelift backend now has the
ability to pattern-match multiple upstream additions in order to
calculate the address to access. Previously, this was not possible so
the `*_complex` instructions were needed. Over time, these instructions
have fallen out of use in this repository, making the additional
overhead of maintaining them a chore.
2022-03-31 10:05:10 -07:00
Damian Heaton
6c8c94723a Scalar values in vectorizelanes & extractlanes (#3922)
- `extractlanes` will now function on a scalar value, returning the
value as a single-element array.
- `vectorizelanes` will accept a single-element array, returning the
contained value.

Existing `if !x.is_vector()` code-patterns have been simplified as a
result.

Copyright (c) 2022 Arm Limited
2022-03-28 09:32:59 -07:00
Chris Fallin
5e96a447f0 Add back the ifcmp_sp CLIF opcode.
This opcode was removed as part of the old-backend cleanup in #3446.
While this opcode will definitely go away eventually, it is
unfortunately still used today in Lucet (as we just discovered while
working to upgrade Lucet's pinned Cranelift version). Lucet is
deprecated and slated to eventually be completely sunset in favor of
Wasmtime; but until that happens, we need to keep this opcode.
2021-11-01 13:34:31 -07:00
bjorn3
86d2ef8952 Fix CI 2021-11-01 18:19:59 +01:00
bjorn3
a05bf2bf42 Remove instructions necessary for the old regalloc 2021-10-12 14:37:36 +02:00
bjorn3
1fd491dadd Remove fallthrough instruction 2021-10-12 14:22:07 +02:00
bjorn3
5b24e117ee Remove instructions used by old br_table legalization 2021-10-12 14:18:52 +02:00
bjorn3
8a8797b911 Remove the sarg_t type and dummy_sarg_t instruction
They are no longer necessary with the new style backends
2021-10-10 14:38:35 +02:00
Benjamin Bouvier
43a86f14d5 Remove more old backend ISA concepts (#3402)
This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate.

Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't needed an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!
2021-10-04 10:36:12 +02:00
bjorn3
9e34df33b9 Remove the old x86 backend 2021-09-29 16:13:46 +02:00
Chris Fallin
65fde3a86b Merge pull request #3380 from dheaton-arm/implement-iabs
Implement `Iabs` for the interpreter
2021-09-22 10:00:53 -07:00
Chris Fallin
b076c99af9 Merge pull request #3379 from dheaton-arm/implement-sqmulroundsat
Implement `SqmulRoundSat` for interpreter
2021-09-22 09:59:13 -07:00
Chris Fallin
dd7310df04 Merge pull request #3361 from dheaton-arm/implement-vecops
Implement `VhighBits` & `Vselect` for interpreter
2021-09-22 09:22:52 -07:00
dheaton-arm
cb30ecc7bc Implement Iabs for the interpreter
Implemented `Iabs` to return the absolute integer value with wrapping.

Copyright (c) 2021, Arm Limited
2021-09-22 12:59:30 +01:00
dheaton-arm
02ff19f2fc Implement SqmulRoundSat for interpreter
Implemented `SqmulRoundSat` for the Cranelift interpreter, performing
QN-format fixed point multiplication for 16 and 32-bit integers in
SIMD vectors.

Copyright (c) 2021, Arm Limited
2021-09-22 12:58:41 +01:00
Afonso Bordado
9a95ce75f1 cranelift: Add bmask to interpreter 2021-09-21 18:43:53 +01:00
Chris Fallin
38728c5746 Merge pull request #3362 from dheaton-arm/implement-unarrow
Implement `Unarrow`, `Uunarrow`, and `Snarrow` for the interpreter
2021-09-21 10:06:46 -07:00
Chris Fallin
e0bd4bd007 Merge pull request #3363 from dheaton-arm/implement-widening-pairwise-dotprod
Implement `WideningPairwiseDotProductS` for interpreter
2021-09-21 10:05:07 -07:00
dheaton-arm
8abb19cbd8 Generate new_vec using an iterator chain
Copyright (c) 2021, Arm Limited
2021-09-20 10:31:34 +01:00
dheaton-arm
3fc29f5f6c Return u128 from bounds; form new_vec from iter chain
Copyright (c) 2021, Arm Limited
2021-09-20 09:57:19 +01:00
Chris Fallin
6a98fe2104 Merge pull request #3332 from afonso360/interp-icmp
cranelift: Add SIMD `icmp` to interpreter
2021-09-17 15:13:44 -07:00
Afonso Bordado
e17d9cfbab cranelift: Rename icmp type variable 2021-09-17 22:17:54 +01:00
dheaton-arm
2f0ce4c86c Implement Smulhi for interpreter
Implemented `Smulhi` for the Cranelift interpreter, performing signed
integer multiplication and producing the high half of a double-length
result.

Copyright (c) 2021, Arm Limited
2021-09-17 16:49:38 +01:00
dheaton-arm
3b9bfc8187 Implement WideningPairwiseDotProductS for interpreter
Implemented `WideningPairwiseDotProductS` to perform sign-extending
length-doubling multiplication on corresponding elements from two
`i16x8` SIMD vectors, performing a pairwise add on the results (thus
returning `i32x4`).

Copyright (c) 2021, Arm Limited
2021-09-17 13:31:16 +01:00
dheaton-arm
83c3bc5b9d Implement Unarrow, Uunarrow, and Snarrow for the interpreter
Implemented the following Opcodes for the Cranelift interpreter:
- `Unarrow` to combine two SIMD vectors into a new vector with twice
the lanes but half the width, with signed inputs which are clamped to
`0x00`.
- `Uunarrow` to perform the same operation as `Unarrow` but treating
inputs as unsigned.
- `Snarrow` to perform the same operation as `Unarrow` but treating
both inputs and outputs as signed, and saturating accordingly.

Note that all 3 instructions saturate at the type boundaries.

Copyright (c) 2021, Arm Limited
2021-09-17 13:26:10 +01:00
dheaton-arm
224a4b4094 Implement VhighBits & Vselect for interpreter
Implemented the following Opcodes for the Cranelift interpreter:
- `VhighBits` to reduce a vector to a scalar integer formed by
concatenating the MSB of each lane.
- `Vselect` to select lanes from two vectors controlled by a boolean
vector.

Copyright (c) 2021, Arm Limited
2021-09-17 11:54:58 +01:00
Chris Fallin
2412e8d784 Merge pull request #3317 from dheaton-arm/implement-swiden
Implement `SwidenLow` and `SwidenHigh` for the interpreter
2021-09-14 08:57:57 -07:00
dheaton-arm
99cc95d630 Factor out shared logic for widening ops.
Copyright (c) 2021, Arm Limited
2021-09-14 13:08:35 +01:00
dheaton-arm
a595bd22e3 Replace loops with iterator methods.
Copyright (c) 2021, Arm Limited
2021-09-14 12:37:36 +01:00
dheaton-arm
75ef00f1fd Implement SwidenLow and SwidenHigh for the interpreter
Implemented `SwidenLow` and `SwidenHigh` for the Cranelift interpreter,
doubling the width and halving the number of lanes preserving the low
and high halves respectively.

Conversions are performed using signed extension.

Copyright (c) 2021, Arm Limited
2021-09-14 12:37:36 +01:00
Chris Fallin
7421e1a65b Merge pull request #3324 from dheaton-arm/implement-shuffle
Implement `Shuffle` for the interpreter
2021-09-13 09:49:59 -07:00
Chris Fallin
9323762d71 Merge pull request #3314 from dheaton-arm/implement-bitops
Implement bit operations for Cranelift interpreter
2021-09-13 09:29:10 -07:00
Afonso Bordado
92690b84a0 cranelift: Add SIMD icmp comparisons to interpreter 2021-09-11 17:15:44 +01:00
Afonso Bordado
f48e40f150 cranelift: Implement icmp for scalar types
Add `icmp` tests for all scalar types and condition codes.

AArch64 (no)overflow tests are disabled because they are currently failing.
2021-09-11 17:15:44 +01:00
dheaton-arm
4a4f940fac Move immediate value retrieval to imm
Copyright (c) 2021, Arm Limited
2021-09-10 12:36:33 +01:00
dheaton-arm
e7d570ddd9 Collect into Result rather than unwrap
Copyright (c) 2021, Arm Limited
2021-09-10 12:26:48 +01:00
dheaton-arm
924b0368e9 Rewrite as iterator methods
Copyright (c) 2021, Arm Limited
2021-09-10 09:41:23 +01:00
dheaton-arm
f7a1b3f9bd Implement UwidenLow and UwidenHigh for the interpreter
Implemented `UwidenLow` and `UwidenHigh` for the Cranelift interpreter,
doubling the width and halving the number of lanes preserving the low
and high halves respectively. Conversions are performed using unsigned
zero extension.

Copyright (c) 2021, Arm Limited
2021-09-08 14:17:11 +01:00
dheaton-arm
dfe1c914ea Cast types back to expected in macros
Also neatened `popcnt` a little following feedback.

Copyright (c) 2021, Arm Limited
2021-09-08 12:36:01 +01:00
dheaton-arm
bca3cb32ef Implement Shuffle for the interpreter
Implemented `Shuffle` for the Cranelift interpreter, to shuffle two SIMD
vectors together based on an immediate mask of 16 bytes.

Copyright (c) 2021, Arm Limited
2021-09-08 11:13:57 +01:00
dheaton-arm
9f647301ff Implement bit operations for Cranelift interpreter
Implemented for the Cranelift interpreter:
- `Bitrev` to reverse the order of the bits in an integer.
- `Cls` to count the leading bits which are the same as the sign bit in
an integer, yielding one less than the size of the integer for 0 and -1.
- `Clz` to count the number of leading zeros in the bitwise representation of the
integer.
- `Ctz` to count the number of trailing zeros in the bitwise representation of the
integer.
- `Popcnt` to count the number of ones in the bitwise representation of the
integer.

Copyright (c) 2021, Arm Limited
2021-09-08 11:07:22 +01:00
Afonso Bordado
3f62ef6e58 cranelift: Fix Build error
#3304 and #3268 are slightly incomptible and caused the build to fail
when they were merged together
2021-09-07 18:13:45 +01:00
Damian Heaton
dd23a21b9b Implement Swizzle and Splat for interpreter (#3268)
* Implement `Swizzle` and `Splat` for interpreter

Implemented for the Cranelift interpreter:
- `Swizzle` to shuffle an `i8x16` SIMD vector based
on the indices specified in another vector of the same size.
- `Splat` to create a SIMD vector with all lanes having the same value.

Copyright (c) 2021, Arm Limited

* Fix old x86 backend failing test

Copyright (c) 2021, Arm Limited

* Represent i16x8 and above as hex

Copyright (c) 2021, Arm Limited
2021-09-07 09:53:49 -07:00
Afonso Bordado
63e9a81deb Implement vany_true and vall_true instructions in interpreter (#3304)
* cranelift: Implement ZeroExtend for a bunch of types in interpreter

* cranelift: Implement VConst on interpreter

* cranelift: Implement VallTrue on interpreter

* cranelift: Implement VanyTrue on interpreter

* cranelift: Mark `v{all,any}_true` tests as machinst only

* cranelift: Disable `vany_true` tests on aarch64

The `b64x2` case produces an illegal instruction. See #3305
2021-09-07 09:50:39 -07:00