Commit Graph

46 Commits

Author SHA1 Message Date
Afonso Bordado
60e4a00413 riscv64: Initial SIMD Vector Implementation (#6240)
* riscv64: Remove unused code

* riscv64: Add vector types

* riscv64: Initial Vector ABI Load/Stores

* riscv64: Vector Loads/Stores

* riscv64: Fix `vsetvli` encoding error

* riscv64: Add SIMD `iadd` runtests

* riscv64: Rename `VecSew`

The SEW name is correct, but only for VType. We also use this type
in loads/stores as the Efective Element Width, so the name isn't
quite correct in that case.

* ci: Add V extension to RISC-V QEMU

* riscv64: Misc Cleanups

* riscv64: Check V extension in `load`/`store` for SIMD

* riscv64: Fix `sumop` doc comment

* cranelift: Fix comment typo

* riscv64: Add convert for VType and VecElementWidth

* riscv64: Remove VecElementWidth converter
2023-04-20 21:54:43 +00:00
Trevor Elliott
f89ac63766 riscv64: Remove the gen_move2 helper (#6246)
* Remove gen_move2 from riscv64

* Update exp files
2023-04-19 21:04:30 +00:00
T0b1-iOS
569089e473 Add {u,s}{add,sub,mul}_overflow instructions (#5784)
* add `{u,s}{add,sub,mul}_overflow` with interpreter

* add `{u,s}{add,sub,mul}_overflow` for x64

* add `{u,s}{add,sub,mul}_overflow` for aarch64

* 128bit filetests for `{u,s}{add,sub,mul}_overflow`

* `{u,s}{add,sub,mul}_overflow` emit tests for x64

* `{u,s}{add,sub,mul}_overflow` emit tests for aarch64

* Initial review changes

* add `with_flags_extended` helper

* add `with_flags_chained` helper
2023-04-11 20:16:04 +00:00
Afonso Bordado
4c32dd7786 riscv64: Delete SelectIf instruction (#5888)
* riscv64: Delete `SelectIf` instruction

* riscv64: Fix typo in comment

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

* riscv64: Improve `bmask` codegen

* riscv64: Use `lower_bmask` in `select_spectre_guard`

* riscv64: Use `lower_bmask` to extend values in `select_spectre_guard`

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

---------

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
2023-04-11 17:33:32 +00:00
Alex Crichton
967543eb43 aarch64: Add more lowerings for the CLIF fma (#6150)
This commit adds new lowerings to the AArch64 backend of the
element-based `fmla` and `fmls` instructions. These instructions have
one of the multiplicands as an implicit broadcast of a single lane of
another register and can help remove `shuffle` or `dup` instructions
that would otherwise be used to implement them.
2023-04-05 17:22:55 +00:00
Afonso Bordado
a002a2cc5e riscv64: Add instruction helpers (#6099)
* riscv64: Add helpers for `add`

* riscv64: Add helpers for `sub`

* riscv64: Add helpers for `sll`

* riscv64: Add helpers for `srl`

* riscv64: Add helpers for `sra`

* riscv64: Add helpers for `or`

* riscv64: Add helpers for `and`

* riscv64: Add helpers for `xor`

* riscv64: Add helpers for `addi`

* riscv64: Add helpers for `slli`

* riscv64: Add helpers for `srli`

* riscv64: Add helpers for `srai`

* riscv64: Add helpers for `ori`

* riscv64: Add helpers for `xori`

* riscv64: Add helpers for `andi`

* riscv64: Add helpers for `not`

* riscv64: Add helpers for `sltiu`

* riscv64: Add helpers for `seqz`

* riscv64: Add helpers for `addw`

* riscv64: Add helpers for `subw`

* riscv64: Add helpers for `sllw`

* riscv64: Add helpers for `slliw`

* riscv64: Add helpers for `srlw`

* riscv64: Add helpers for `srliw`

* riscv64: Add helpers for `sraw`

* riscv64: Add helpers for `sraiw`

* riscv64: Add helpers for `sltu`

* riscv64: Add helpers for `mul`

* riscv64: Add helpers for `mulh`

* riscv64: Add helpers for `mulhu`

* riscv64: Add helpers for `div`

* riscv64: Add helpers for `divu`

* riscv64: Add helpers for `rem`

* riscv64: Add helpers for `remu`

* riscv64: Add helpers for `mulw`

* riscv64: Add helpers for `divw`

* riscv64: Add helpers for `divuw`

* riscv64: Add helpers for `remw`

* riscv64: Add helpers for `remuw`

* riscv64: Add helpers for `neg`

* riscv64: Add helpers for `addiw`

* riscv64: Add helpers for `sext.w`

* riscv64: Add helpers for `fadd`

* riscv64: Add helpers for `fsub`

* riscv64: Add helpers for `fmul`

* riscv64: Add helpers for `fdiv`

* riscv64: Add helpers for `fsqrt`

* riscv64: Add helpers for `fmadd`

* riscv64: Add helpers for `fsgnj`

* riscv64: Add helpers for `fsgnjn`

* riscv64: Add helpers for `fsgnjx`

* riscv64: Add helpers for `fcvtds`

* riscv64: Add helpers for `fcvtsd`

* riscv64: Add helpers for `adduw`

* riscv64: Add helpers for `zext.w`

* riscv64: Add helpers for `andn`

* riscv64: Add helpers for `orn`

* riscv64: Add helpers for `clz`

* riscv64: Add helpers for `clzw`

* riscv64: Add helpers for `ctz`

* riscv64: Add helpers for `ctzw`

* riscv64: Add helpers for `cpop`

* riscv64: Add helpers for `max`

* riscv64: Add helpers for `feq`

* riscv64: Add helpers for `flt`

* riscv64: Add helpers for `fle`

* riscv64: Add helpers for `fgt`

* riscv64: Add helpers for `fge`

* riscv64: Add helpers for `sext.b`

* riscv64: Add helpers for `sext.h`

* riscv64: Add helpers for `zext.h`

* riscv64: Add helpers for `rol`

* riscv64: Add helpers for `rolw`

* riscv64: Add helpers for `ror`

* riscv64: Add helpers for `rorw`

* riscv64: Add helpers for `rev8`

* riscv64: Add helpers for `brev8`

* riscv64: Add helpers for `bseti`

* riscv64: Add helpers for `pack`

* riscv64: Add helpers for `packw`

* riscv64: Add helpers for `slli.uw`

* riscv64: Add helpers for `fabs`

* riscv64: Add helpers for `fneg`
2023-03-24 18:01:04 +00:00
Afonso Bordado
602ff71fe4 riscv64: Add Zba extension instructions (#6087)
* riscv64: Use `add.uw` to zero extend

* riscv64: Implement `add.uw` optimizations

* riscv64: Add `Zba` `iadd+ishl` optimizations

* riscv64: Add `shl+uextend` optimizations based on `Zba`

* riscv64: Fix some issues with `Zba` instructions

* riscv64: Restrict shnadd selection

* riscv64: Fix `extend` priorities

* riscv64: Remove redundant `addw` rule

* riscv64: Specify type for `add` extend rules

* riscv64: Use `u64_from_imm64` extractor instead of `uimm8`

* riscv64: Restrict `uextend` in `shnadd.uw` rules

* riscv64: Use concrete type in `slli.uw` rule

* riscv64: Add extra arithmetic extends tests

Co-authored-by: Jamey Sharp <jsharp@fastly.com>

* riscv64: Make `Adduw` types concrete

* riscv64: Add extra arithmetic extend tests

* riscv64: Add `sextend`+Arithmetic rules

* riscv64: Fix whitespace

* cranelift: Move arithmetic extends tests with i128 to separate file

---------

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-03-23 20:06:03 +00:00
Afonso Bordado
7a3df7dcc0 riscv64: Improve ctz/clz/cls codegen (#5854)
* cranelift: Add extra runtests for `clz`/`ctz`

* riscv64: Restrict lowering rules for `ctz`/`clz`

* cranelift: Add `u64` isle helpers

* riscv64: Improve `ctz` codegen

* riscv64: Improve `clz` codegen

* riscv64: Improve `cls` codegen

* riscv64: Improve `clz.i128` codegen

Instead of checking if we have 64 zeros in the top half. Check
if it *is* 0, that way we avoid loading the `64` constant.

* riscv64: Improve `ctz.i128` codegen

Instead of checking if we have 64 zeros in the bottom half. Check
if it *is* 0, that way we avoid loading the `64` constant.

* riscv64: Use extended value in `lower_cls`

* riscv64: Use pattern matches on `bseti`
2023-03-21 23:15:14 +00:00
Afonso Bordado
5c95e6fbaf riscv64: Codemotion cleanups to ISLE files (#5984)
* riscv64: Fix typo in extensions

* riscv64: Move converters to top of file

* riscv64: Group up all imm12 rules

* riscv64: Move zero_reg helpers to Physical Regs section

* riscv64: Move helpers away from `clz` lowerings

These were in the middle of the `clz` rules and are kind of distracting

* riscv64: Move `cls` rules next to `ctz`/`clz`

* cranelift: Move `u8_and` / `u32_add` to Primitive Arithmetic section

* riscv64: Mark some imm12 constructors as pure

* cranelift: Move `s32_add_fallible` next to `u32_add`

* riscv64: Fix Typo
2023-03-13 19:20:15 +00:00
yuyang
4e875f33a7 Codegen fix fcvt_from_sint.f32 with small types on riscv64. (#5964)
* fix issue5952

* We should only extend i8 and i16

* remove extra space

* move some code
2023-03-10 10:29:55 +00:00
yuyang
32cfd60877 fix codegen riscv64 normalize_cmp_value. (#5873)
* fix issue5839

* add target.

* fix normalize_cmp_value.

* fix test failutre.

* fix test failure.

* fix parameter type.

* Update cranelift/codegen/src/isa/riscv64/inst.isle

Co-authored-by: Jamey Sharp <jamey@minilop.net>

* Update cranelift/codegen/src/isa/riscv64/lower.isle

Co-authored-by: Jamey Sharp <jamey@minilop.net>

* remove convert rule from IntCC to ExtendOp

---------

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-28 23:00:23 +00:00
Afonso Bordado
36e92add6f riscv64: Move is_null/is_invalid to ISLE (#5874)
* riscv64: Move `is_null`/`is_invalid` to ISLE

* riscv64: Fix `is_invalid` codegen

* Implement review suggestions

Thanks!

Co-authored-by: Jamey Sharp <jamey@minilop.net>

---------

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-25 12:48:44 +00:00
Afonso Bordado
f6c6bc2155 riscv64: Improve signed and zero extend codegen (#5844)
* riscv64: Remove unused code

* riscv64: Group extend rules

* riscv64: Remove more unused rules

* riscv64: Cleanup existing extension rules

* riscv64: Move the existing Extend rules to ISLE

* riscv64: Use `sext.w` when extending

* riscv64: Remove duplicate extend tests

* riscv64: Use `zbb` instructions when extending values

* riscv64: Use `zbkb` extensions when zero extending

* riscv64: Enable additional tests for extend i128

* riscv64: Fix formatting for `Inst::Extend`

* riscv64: Reverse register for pack

* riscv64: Misc Cleanups

* riscv64: Cleanup extend rules
2023-02-22 17:41:14 +00:00
Afonso Bordado
6e6a1034d7 riscv64: Add bitmanip extension flags (#5847) 2023-02-21 22:12:44 +00:00
Afonso Bordado
0f51338def riscv64: Clear the top 32bits in the br_table index (#5831)
We were unintentionally relying on these to be zeroed when jumping.
2023-02-21 18:05:51 +00:00
Trevor Elliott
d99783fc91 Move default blocks into jump tables (#5756)
Move the default block off of the br_table instrution, and into the JumpTable that it references.
2023-02-10 08:53:30 -08:00
Alex Crichton
de0e0bea3f Legalize b{and,or,xor}_not into component instructions (#5709)
* Remove trailing whitespace in `lower.isle` files

* Legalize the `band_not` instruction into simpler form

This commit legalizes the `band_not` instruction into `band`-of-`bnot`,
or two instructions. This is intended to assist with egraph-based
optimizations where the `band_not` instruction doesn't have to be
specifically included in other bit-operation-patterns.

Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.

* Legalize `bor_not` into components

Same as prior commit, but for the `bor_not` instruction.

* Legalize bxor_not into bxor-of-bnot

Same as prior commits. I think this also ended up fixing a bug in the
s390x backend where `bxor_not x y` was actually translated as `bnot
(bxor x y)` by accident given the test update changes.

* Simplify not-fused operands for riscv64

Looks like some delegated-to rules have special-cases for "if this
feature is enabled use the fused instruction" so move the clause for
testing the feature up to the lowering phase to help trigger other rules
if the feature isn't enabled. This should make the riscv64 backend more
consistent with how other backends are implemented.

* Remove B{and,or,xor}Not from cost of egraph metrics

These shouldn't ever reach egraphs now that they're legalized away.

* Add an egraph optimization for `x^-1 => ~x`

This adds a simplification node to translate xor-against-minus-1 to a
`bnot` instruction. This helps trigger various other optimizations in
the egraph implementation and also various backend lowering rules for
instructions. This is chiefly useful as wasm doesn't have a `bnot`
equivalent, so it's encoded as `x^-1`.

* Add a wasm test for end-to-end bitwise lowerings

Test that end-to-end various optimizations are being applied for input
wasm modules.

* Specifically don't self-update rustup on CI

I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.

* Cleanup some aarch64 lowering rules

Previously a 32/64 split was necessary due to the `ALUOp` being different
but that's been refactored away no so there's no longer any need for
duplicate rules.

* Narrow a x64 lowering rule

This previously made more sense when it was `band_not` and rarely used,
but be more specific in the type-filter on this rule that it's only
applicable to SIMD types with lanes.

* Simplify xor-against-minus-1 rule

No need to have the commutative version since constants are already
shuffled right for egraphs

* Optimize band-of-bnot when bnot is on the left

Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.

* Add commutative lowering rules

* Update cranelift/codegen/src/isa/x64/lower.isle

Co-authored-by: Jamey Sharp <jamey@minilop.net>

---------

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-06 13:53:40 -06:00
yuyang
cb3b6c621f fix rotl.i16 with i128 shift value. (#5611)
* fix issue 5523.

* fix.

* add missing issue file.

* fix issue.

* fix duplicate shamt_128.

* issue 5523 add test target,and fix some wrong comment.

* fix output file.

* enable llvm_abi_extensions for regression test file.
2023-02-01 03:44:13 +00:00
Trevor Elliott
a5698cedf8 cranelift: Remove brz and brnz (#5630)
Remove the brz and brnz instructions, as their behavior is now redundant with brif.
2023-01-30 20:34:56 +00:00
Trevor Elliott
7926808e8e riscv64: improve unordered comparison generated code (#5636)
Improve the generated code for unordered floating point comparisons by negating the comparison and inveritng the branches. This allows us to pick the unordered versions, which generate significantly better code.
2023-01-25 17:28:28 -08:00
Trevor Elliott
b58a197d33 cranelift: Add a conditional branch instruction with two targets (#5446)
Add a conditional branch instruction with two targets: brif. This instruction will eventually replace brz and brnz, as it encompasses the behavior of both.

This PR also changes the InstructionData layout for instruction formats that hold BlockCall values, taking the same approach we use for Value arguments. This allows branch_destination to return a slice to the BlockCall values held in the instruction, rather than requiring that we pattern match on InstructionData to fetch the then/else blocks.

Function generation for fuzzing has been updated to generate uses of brif, and I've run the cranelift-fuzzgen target locally for hours without triggering any new failures.
2023-01-24 14:37:16 -08:00
yuyang
299b8187f8 fix issue 5525. (#5603)
* fix issue 5525.

* reg alloc changed.
2023-01-20 09:53:54 -08:00
Trevor Elliott
1e6c13d83e cranelift: Rework block instructions to use BlockCall (#5464)
Add a new type BlockCall that represents the pair of a block name with arguments to be passed to it. (The mnemonic here is that it looks a bit like a function call.) Rework the implementation of jump, brz, and brnz to use BlockCall instead of storing the block arguments as varargs in the instruction's ValueList.

To ensure that we're processing block arguments from BlockCall values in instructions, three new functions have been introduced on DataFlowGraph that both sets of arguments:

inst_values - returns an iterator that traverses values in the instruction and block arguments
map_inst_values - applies a function to each value in the instruction and block arguments
overwrite_inst_values - overwrite all values in an instruction and block arguments with values from the iterator

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-01-17 16:31:15 -08:00
Trevor Elliott
5d429e46e8 Remove the MInst::TrapFf constructor from the riscv64 backend (#5515)
Remove the MInst::TrapFf instruction in the riscv64 backend. It was only used in two places in the emit case for FloatRound, and was easily replaced with a combination of FpuRRR and TrapIf.
2023-01-04 13:34:46 -08:00
Trevor Elliott
b2d5afdf83 riscv64: Implement fcmp in ISLE (#5512)
Rework the compilation of fcmp in the riscv64 backend to be in ISLE, removing the need for the dedicated Fcmp instruction. This change is motivated by #5500, which showed that the riscv64 backend was generating branch instructions in the middle of a basic block.

We can't remove lower_br_fcmp quite yet as it's used in a few places in the emit module, but it's now no longer reachable from the ISLE lowerings.

Fixes #5500
2023-01-04 11:52:00 -08:00
Afonso Bordado
52ba72f341 riscv64: Fix masking on iabs (#5505)
* cranelift: Add `iabs.i128` runtest

* riscv64: Fix incorrect extension in iabs

When lowering iabs, we were accidentally comparing the unextended value
this caused the instruction to misbehave with certain top bits.

This commit also adds a zbb lowering that does not use jumps.
2023-01-03 17:37:25 -08:00
Trevor Elliott
fac4a915a3 Assert that we only use virtual registers with moves (#5440)
Assert that we never see real registers as arguments to move instructions in VCodeBuilder::collect_operands.

Also fix a bug in the riscv64 backend that was discovered by these assertions: the lowerings of get_stack_pointer and get_frame_pointer were using physical registers 8 and 2 directly. The solution was similar to other backends: add a move instruction specifically for moving out of physical registers, whose source operand is opaque to regalloc2.
2022-12-20 18:22:47 -08:00
Chris Fallin
22439f7b39 support select_spectre_guard and select on i128 conditions on all platforms. (#5460)
Fixes #5199.
Fixes #5200.
Fixes #5452.
Fixes #5453.

On riscv64, there is apparently an autoconversion from `ValueRegs` to
`Reg` that takes just the low register [0], and removing this conversion
causes 48 errors. As a result of this, `select` with an `i128` condition
was silently miscompiling, testing only the low 64 bits. We should
remove this autoconversion to ensure we aren't missing any other silent
truncations, but for now this PR just adds the explicit `I128` logic for
`select` / `select_spectre_guard`.

[0]
d9fdbfd50e/cranelift/codegen/src/isa/riscv64/inst.isle (L1762)
2022-12-16 14:18:22 -08:00
Ulrich Weigand
f0af622208 Simplify LowerBackend interface (#5432)
* Refactor lower_branch to have Unit result

Branches cannot have any output, so it is more straightforward
to have the ISLE term return Unit instead of InstOutput.

Also provide a new `emit_side_effect` term to simplify
implementation of `lower_branch` rules with Unit result.

* Simplify LowerBackend interface

Move all remaining asserts from the LowerBackend::lower and
::lower_branch_group into the common call site.

Change return value of ::lower to Option<InstOutput>, and
return value of ::lower_branch_group to Option<()> to match
ISLE term signature.

Only pass the first branch into ::lower_branch_group and
rename it to ::lower_branch.

As a result of all those changes, LowerBackend routines
now consists solely to calls to the corresponding ISLE
routines.
2022-12-14 00:48:25 +00:00
Chris Fallin
92ce79366c riscv64: remove valueregs_2_reg extractor. (#5426)
This extractor had a side-effect of invoking `put_in_regs`, which is not
supposed to be invoked until the pattern-matching commits to evaluating
a rule right-hand side (i.e., cannot backtrack). In this case the
side-effect was mostly benign (in theory it could have caused additional
values to be computed needlessly), but in general we should be careful
to keep side-effects out of the left-hand side to enable further
optimizations and work on islec.

The implicit conversion from `Value` to `Reg` turns out to be enough to
make the rules in question work, so we can simply remove the use of the
extractor in this case.
2022-12-13 11:47:20 -08:00
Ulrich Weigand
e913cf3647 Remove IFLAGS/FFLAGS types (#5406)
All instructions using the CPU flags types (IFLAGS/FFLAGS) were already
removed.  This patch completes the cleanup by removing all remaining
instructions that define values of CPU flags types, as well as the
types themselves.

Specifically, the following features are removed:
- The IFLAGS and FFLAGS types and the SpecialType category.
- Special handling of IFLAGS and FFLAGS in machinst/isle.rs and
  machinst/lower.rs.
- The ifcmp, ifcmp_imm, ffcmp, iadd_ifcin, iadd_ifcout, iadd_ifcarry,
  isub_ifbin, isub_ifbout, and isub_ifborrow instructions.
- The writes_cpu_flags instruction property.
- The flags verifier pass.
- Flags handling in the interpreter.

All of these features are currently unused; no functional change
intended by this patch.

This addresses https://github.com/bytecodealliance/wasmtime/issues/3249.
2022-12-09 13:42:03 -08:00
Jamey Sharp
8726eeefb3 cranelift-isle: Add "partial" flag for constructors (#5392)
* cranelift-isle: Add "partial" flag for constructors

Instead of tying fallibility of constructors to whether they're either
internal or pure, this commit assumes all constructors are infallible
unless tagged otherwise with a "partial" flag.

Internal constructors without the "partial" flag are not allowed to use
constructors which have the "partial" flag on the right-hand side of any
rules, because they have no way to report last-minute match failures.

Multi-constructors should never be "partial"; they report match failures
with an empty iterator instead. In turn this means you can't use partial
constructors on the right-hand side of internal multi-constructor rules.
However, you can use the same constructors on the left-hand side with
`if` or `if-let` instead.

In many cases, ISLE can already trivially prove that an internal
constructor always returns `Some`. With this commit, those cases are
largely unchanged, except for removing all the `Option`s and `Some`s
from the generated code for those terms.

However, for internal non-partial constructors where ISLE could not
prove that, it now emits an `unreachable!` panic as the last-resort,
instead of returning `None` like it used to do. Among the existing
backends, here's how many constructors have these panic cases:

- x64: 14% (53/374)
- aarch64: 15% (41/277)
- riscv64: 23% (26/114)
- s390x: 47% (268/567)

It's often possible to rewrite rules so that ISLE can tell the panic can
never be hit. Just ensure that there's a lowest-priority rule which has
no constraints on the left-hand side.

But in many of these constructors, it's difficult to statically prove
the unhandled cases are unreachable because that's only down to
knowledge about how they're called or other preconditions.

So this commit does not try to enforce that all terms have a last-resort
fallback rule.

* Check term flags while translating expressions

Instead of doing it in a separate pass afterward.

This involved threading all the term flags (pure, multi, partial)
through the recursive `translate_expr` calls, so I extracted the flags
to a new struct so they can all be passed together.

* Validate multi-term usage

Now that I've threaded the flags through `translate_expr`, it's easy to
check this case too, so let's just do it.

* Extract `ReturnKind` to use in `ExternalSig`

There are only three legal states for the combination of `multi` and
`infallible`, so replace those fields of `ExternalSig` with a
three-state enum.

* Remove `Option` wrapper from multi-extractors too

If we'd had any external multi-constructors this would correct their
signatures as well.

* Update ISLE tests

* Tag prelude constructors as pure where appropriate

I believe the only reason these weren't marked `pure` before was because
that would have implied that they're also partial. Now that those two
states are specified separately we apply this flag more places.

* Fix my changes to aarch64 `lower_bmask` and `imm` terms
2022-12-07 17:16:03 -08:00
Nick Fitzgerald
f0c4b6f3a1 Cranelift: Implement iadd_cout on x64 for 32- and 64-bit integers (#5285)
* Split the `iadd_cout` runtests by type

* Implement `iadd_cout` for 32- and 64-bit values on x64

* Delete trailing whitespace in `riscv/lower.isle`
2022-12-07 19:54:14 +00:00
Jamey Sharp
29b23d41b6 ISLE rule cleanups (#5389)
* cranelift-codegen: Use ISLE matching, not same_value

The `same_value` function just wrapped an equality test into an external
constructor, but we can do that with ISLE's equality constraints
instead.

* riscv64: Remove custom condition-code tests

The `lower_icmp` term exists solely to decide whether to sign-extend or
zero-extend the comparison operands, based on whether the condition code
requires a signed comparison. It additionally tested whether the
condition code was == or !=, but produced the same result as for other
unsigned comparisons.

We already have `signed_cond_code` in the ISLE prelude, which classifies
the total-ordering condition codes according to whether they're signed.
It also lumps == and != in the "unsigned" camp, as desired.

So this commit uses the existing method from the prelude instead of
riscv64-local definitions.

Because this version has no constraints on the left-hand side of the
rule in the unsigned case, ISLE generates Rust that always returns
`Some`. That shows that the current use of `unwrap` is justified, at the
only Rust-side call-site of `constructor_lower_icmp`, which is in
cranelift/codegen/src/isa/riscv64/lower/isle.rs.

* ISLE prelude: make offset32 infallible

This extractor always returns `Some`, so it doesn't need to be fallible.
2022-12-07 02:55:59 +00:00
Trevor Elliott
293bb5b334 riscv64: Only emit jumps at the end of basic blocks (#5381)
This PR fixes two bugs in the riscv64 backend, where branch instructions were emitted in the middle of a basic block:

Constant emission, where the constants are inlined into the vcode and are jumped over at runtime,
The BrTableCheck pseudo-instruction, which was always emitted before a BrTable instruction, and would handle jumping to the default label.
The first bug was resolved by introducing two new psuedo instructions, LoadConst32 and LoadConst64. Both of these instructions serve to delay the original encoding to emission time, after regalloc2 has run.

The second bug was fixed by removing the BrTableCheck instruction. As it was always emitted directly before BrTable, it was easier to remove it and merge the two into a single instruction.
2022-12-06 10:54:10 -08:00
Trevor Elliott
b475b9bd19 Terminate blocks with a single branch in riscv64 (#5374)
Ensure that we're terminating blocks with a single branch instruction, when testing I128 values against zero.
2022-12-05 20:13:28 +00:00
Trevor Elliott
b077854b57 Generate SSA code from returns (#5172)
Modify return pseudo-instructions to have pairs of registers: virtual and real. This allows us to constrain the virtual registers to the real ones specified by the abi, instead of directly emitting moves to those real registers.
2022-11-08 16:00:49 -08:00
Afonso Bordado
3ef30b5b67 cranelift: Rename i{min,max} to s{min,max} (#5187)
This brings these instructions with our general naming convention
of signed instructions being prefixed with `s`.
2022-11-03 18:20:33 +00:00
Afonso Bordado
879b52825f cranelift: Implement ineg.i128 for everyone (#5129)
* cranelift: Add `ineg` runtests

* aarch64: Implement `ineg.i128`

* x64: Implement `ineg.i128`

* riscv: Implement `ineg.i128`

* fuzzgen: Enable `ineg.i128`
2022-10-28 16:10:00 -07:00
Afonso Bordado
e8f3d03bbe cranelift: Mask high bits on bmask for types smaller than a register (#5118)
* aarch64: Fix incorrect masking for small types on bmask

`bmask` was accidentally relying on the uppermost bits of the register
for small types.

This was found by fuzzgen,  when it generated a shift left followed by
a bmask, the shift left shifted the bits out of the range of the input
type (i8), however these are not automatically cleared since they
remained inside the 32 bits of the register.

That caused issues when the bmask tried to compare the whole register
instead of just the bottom bits. The solution here is to mask the upper
bits for small types.

* aarch64: Emit 32bit cmp on bmask

This fixes an issue where bmask was accidentally comparing the
upper bits of the register by always using a 64bit cmp.

* riscv: Mask high bits in bmask

* riscv: Add compile tests for br{z,nz}

* riscv: Use shifts to mask 32bit values

This produces less code than the AND since that version needs to
load an immediate constant from memory.

* cranelift: Update test input to hexadecimal values

This makes it a bit more clear what is being tested.

* riscv: Use addiw for masking 32 bit values

Co-authored-by: Trevor Elliott <telliott@fastly.com>

* aarch64: Update bmask rule priority

Co-authored-by: Trevor Elliott <telliott@fastly.com>
2022-10-27 09:45:39 -07:00
Trevor Elliott
02620441c3 Add uadd_overflow_trap (#5123)
Add a new instruction uadd_overflow_trap, which is a fused version of iadd_ifcout and trapif. Adding this instruction removes a dependency on the iflags type, and would allow us to move closer to removing it entirely.

The instruction is defined for the i32 and i64 types only, and is currently only used in the legalization of heap_addr.
2022-10-27 09:43:15 -07:00
Trevor Elliott
ec12415b1f cranelift: Remove redundant branch and select instructions (#5097)
As discussed in the 2022/10/19 meeting, this PR removes many of the branch and select instructions that used iflags, in favor if using brz/brnz and select in their place. Additionally, it reworks selectif_spectre_guard to take an i8 input instead of an iflags input.

For reference, the removed instructions are: br_icmp, brif, brff, trueif, trueff, and selectif.
2022-10-24 16:14:35 -07:00
Trevor Elliott
32a7593c94 cranelift: Remove booleans (#5031)
Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements.

Fixes #3205

Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com>
Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
2022-10-17 16:00:27 -07:00
yuyang
07584f6ac8 fix issue 4996. (#5003) 2022-10-04 11:18:42 -07:00
Trevor Elliott
c1d6ca48a7 ISLE: Resolve overlap in the riscv64 backend (#4982)
Resolve overlap in the RiscV64 backend by adding priorities to rules. Additionally, one test updated as a result of this work, as a peephole optimization for addition with immediates fires now.
2022-09-29 17:22:25 -07:00
yuyang-ok
cdecc858b4 add riscv64 backend for cranelift. (#4271)
Add a RISC-V 64 (`riscv64`, RV64GC) backend.

Co-authored-by: yuyang <756445638@qq.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
Co-authored-by: Afonso Bordado <afonsobordado@az8.co>
2022-09-27 17:30:31 -07:00