Commit Graph

4314 Commits

Author SHA1 Message Date
Trevor Elliott
aba239e9b8 Fix handling of jumps in bugpoint (#5794)
Fixes #5792
2023-02-15 15:07:03 -08:00
Afonso Bordado
76539ef9f2 cranelift: Optimize select+icmp into {s,u}{min,max} (#5546)
* cranelift: Optimize `select+icmp` into `{s,u}{min,max}`

* cranelift: Add generic egraph icmp reverse rule

* cranelift: Optimize `vselect+icmp` into `{s,u}{min,max}`

* cranelift: Optimize some `vselect+fcmp` into `f{min,max}_pseudo`

* cranelift: Add inverted forms of min/max rules
2023-02-15 15:06:21 -08:00
Trevor Elliott
f0137c2618 x64: Fix the formatting for andn (#5789)
* Print AluRmRVex instructions with the destination last
* Update andn tests
2023-02-15 11:16:59 -08:00
Alex Crichton
0037b71b11 Use xmm_rm_r more frequently in x64 backend (#5787)
This updates the signatures of the `xmm_rm_r` helper function and then
updates existing users and migrates other users to the helper now that
the type information is no longer required.
2023-02-15 09:03:19 -08:00
Ulrich Weigand
305000d14b s390x: Fix instruction encoding and disassembly format bugs (#5786)
- Fix encoding of the AHY instruction.
- Fix disassembly format of FIEBR, FIDBR, and LEDBRA instructions.
2023-02-15 08:36:44 -08:00
Ulrich Weigand
e10094dcd6 s390x: Support scalar min/max clif instructions (#5762)
We don't have ISA instructions for that, so simply expand them
to icmp + select.

Also enable fuzzing for those clif instructions now.
2023-02-15 11:45:09 +00:00
Alphyr
cb150d37ce Update dependencies (#5513) 2023-02-14 19:45:15 +00:00
Nick Fitzgerald
6df3bbbe60 Cranelift: Collapse double extends into a single extend (#5772) 2023-02-13 22:43:17 +00:00
Trevor Elliott
19f337e29b Move the default block to the front of the underlying jump table storage (#5770)
The new api on JumpTableData makese it easy to keep the default label first, and that shrinks the diff in #5731 a bit.
2023-02-13 20:50:29 +00:00
Alex Crichton
a0a97f5e8f Add (bnot (bxor x y)) lowerings for s390x/aarch64 (#5763)
* Add (bnot (bxor x y)) lowerings for s390x/aarch64

I originally thought that s390x's original lowering in #5709, but as was
rightfully pointed out `(bnot (bxor x y))` is equivalent to
`(bxor x (bnot y))` so the special lowering for one should apply as a
special lowering for the other. For the s390x and aarch64 backend that
have already have a fused lowering of the bxor/bnot add a lowering
additionally for the bnot/bxor combination.

* Add bnot(bxor(..)) tests for s390x 128-bit sizes
2023-02-13 15:41:18 +00:00
Trevor Elliott
d99783fc91 Move default blocks into jump tables (#5756)
Move the default block off of the br_table instrution, and into the JumpTable that it references.
2023-02-10 08:53:30 -08:00
Trevor Elliott
15fe9c7c93 Inline jump tables in parsed br_table instructions (#5755)
As jump tables are used by at most one br_table instruction, inline their definition in those instructions instead of requiring them to be declared as function-level metadata.
2023-02-09 14:24:04 -08:00
bjorn3
202d3af16a Remove the unused sigid argument purpose (#5753) 2023-02-09 09:18:39 -08:00
Amanieu d'Antras
a2d356d45e Add JITBuilder::with_flags constructor (#5751)
This allows custom flags to be set (e.g. `opt-level`) while still
leaving most of of the boilerplate to select the native target to
the `JITBuilder`.
2023-02-09 02:49:17 +00:00
Ayomide Bamidele
9637840b4b Update AMD and generic x86 CPU presets to match LLVM (#5575)
* Add x86 AMD and generic presets

* Fix typos

* Add zen2 definition
2023-02-08 15:56:45 -08:00
Trevor Elliott
34ec4b4e44 Reuse inst_predicates::visit_block_succs in more places (#5750)
Following up from #5730, replace some explicit matching over branch instructions with a use of inst_predicates::visit_block_succs.
2023-02-08 15:42:24 -08:00
Trevor Elliott
b0b3f67cb0 Move jump tables to the DataFlowGraph (#5745)
Move the storage for jump tables off of FunctionStencil and onto DataFlowGraph. This change is in service of #5731, making it easier to access the jump table data in the context of helpers like inst_values.
2023-02-07 21:21:35 -08:00
Jamey Sharp
7bf89683e9 Generalize and/or/xor optimizations (#5744)
* Generalize `n ^ !n` optimization to more types

* Generalize `x & -1` optimization to more types

Also mark the `x & x` rewrite to `subsume`.

* Cranelift: Optimize x|!x and x&!x to constants

These cases are much like the existing x^!x rules.
2023-02-08 02:18:36 +00:00
Trevor Elliott
d71c9458dc Make DataFlowGraph::blocks public (#5740)
Similar to when we exposed the DataFlowGraph::insts field through a restrictive newtype, expose DataFlowGraph::blocks through an interface that allows a restrictive set of operations. This field being public now allows us to avoid a rematch in ssa construction, and simplifies the implementation of adding a block argument to a block referenced by a br_table instruction.
2023-02-07 17:11:14 -08:00
Jamey Sharp
f3b408d5e2 Algebraic opts: Reuse iconst 0 from LHS (#5724)
We don't need to spend time going through the GVN map to dedup a
newly-constructed `iconst 0` when we already matched that value on the
left-hand side of these rules.

Also, mark these rules as subsuming any others since we can't do better
than reducing an expression to a constant.
2023-02-08 00:11:07 +00:00
Trevor Elliott
116e5a665f Bump regalloc2 to 0.6.0 (#5742)
* Bump regalloc2
* Certify regalloc2 0.6.0
2023-02-07 15:57:49 -08:00
Trevor Elliott
3343cf80e9 Add assertions for matches that used to use analyze_branch (#5733)
Following up from #5730, add debug assertions to ensure that new branch instructions don't slip through matches that used to use analyze_branch.
2023-02-07 14:51:18 -08:00
Trevor Elliott
2c8425998b Refactor matches that used to consume BranchInfo (#5734)
Explicitly borrow the instruction data, and use a mutable borrow to avoid rematch.
2023-02-07 13:29:42 -08:00
Raekye
fdd4a778fc Fix ABI of jitted function in cranelift-jit example. (#5736) 2023-02-07 21:16:05 +00:00
Alex Crichton
72962c9f08 Add some minor souper-harvested optimizations (#5735)
I was playing around with souper recently on some wasms I had lying
around and these are some optimization opportunities that popped out
which seemed easy-enough to add to the egraph-based optimizations.
2023-02-07 14:06:24 -06:00
Chris Fallin
673b448cfe Cranelift DFG: make inst clone deep-clone varargs lists. (#5727)
When investigating #5716, I found that rematerialization of a `call`, in
addition to blowing up for other reasons, caused aliasing of the varargs
list (the `EntityList` in the `ListPool`), such that editing the args of
the second copy of the call instruction inadvertently updated the first
as well.

This PR modifies `DataFlowGraph::clone_inst` so that it always clones
the varargs list if present. This shouldn't have any functional impact
on Cranelift today, because we don't rematerialize any instructions with
varargs; but it's important to get it right to avoid a bug later!
2023-02-07 01:21:09 +00:00
Trevor Elliott
c8a6adf825 Remove analyze_branch and BranchInfo (#5730)
We don't have overlap in behavior for branch instructions anymore, so we can remove analyze_branch and instead match on the InstructionData directly.

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-06 17:06:57 -08:00
Chris Fallin
75ae976adc egraphs: fix accidental remat of call. (#5726)
In the provided test case in #5716, the result of a call was then
added to 0. We have a rewrite rule that sets the remat-bit on any add
of a value and a constant, because these frequently appear (e.g. from
address offset calculations) and this can frequently reduce register
pressure (one long-lived base vs. many long-lived base+offset values).
Separately, we have an algebraic rule that `x+0` rewrites to `x`.

The result of this was that we had an eclass with the remat bit set on
the add, but the add was also union'd into the call. We pick the
latter during extraction, because it's cheaper not to do the add at
all; but we still get the remat bit, and try to remat a call (!),
which blows up later.

This PR fixes the logic to look up the "best value" for a value (i.e.,
whatever extraction determined), and look up the remat bit on *that*
node, not the canonical node.

(Why did the canonical node become the iadd and not the call? Because
the former had a lower value-number, as an accident of IR
construction; we don't impose any requirements on the input CLIF's
value-number ordering, and I don't think this breaks any of the
important acyclic properties, even though there is technically a
dependence from a lower-numbered to a higher-numbered node. In essence
one can think of them as having "virtual numbers" in any true
topologically-sorted order, and the only place the actual integer
indices matter should be in choosing the "canonical ID", which is just
used for dedup'ing, modulo this bug.)

Fixes #5716.
2023-02-06 23:36:16 +00:00
bjorn3
16afefdab1 Some refactorings to the ISLE parser (#5693)
* Use is_ascii_digit and is_ascii_hexdigit in the ISLE lexer

* Use range pattern in ISLE lexer

* Use a couple of shorthands in the ISLE parser

* Use parse_ident instead of symbol + str_to_ident

* Introduce token eating api

This is a non-fatal version of the take api

* Rename take to expect and add expect_ prefixes to several methods

* Review comments
2023-02-06 15:11:25 -08:00
Trevor Elliott
e9c05622c0 Keep reachable jump tables (#5721)
Instead of identifying unused branch tables by looking for unused blocks inside of them, track used branch tables while traversing reachable blocks. This introduces an extra allocation of an EntitySet to track the used jump tables, but as those are few and this function runs once per ir::Function, the allocation seems reasonable.
2023-02-06 14:10:47 -08:00
Jamey Sharp
65c1f654f2 Cranelift: Only build iconst for ints <= 64 bits (#5723)
I audited the egraph "algebraic" optimization rules for any which
construct an `iconst` on the right-hand side of the rule. In these cases
we need to constrain the type passed to `iconst` to be both `fits_in_64`
and `ty_int`, because `iconst` is not defined on other types.
2023-02-06 14:10:29 -08:00
Alex Crichton
de0e0bea3f Legalize b{and,or,xor}_not into component instructions (#5709)
* Remove trailing whitespace in `lower.isle` files

* Legalize the `band_not` instruction into simpler form

This commit legalizes the `band_not` instruction into `band`-of-`bnot`,
or two instructions. This is intended to assist with egraph-based
optimizations where the `band_not` instruction doesn't have to be
specifically included in other bit-operation-patterns.

Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.

* Legalize `bor_not` into components

Same as prior commit, but for the `bor_not` instruction.

* Legalize bxor_not into bxor-of-bnot

Same as prior commits. I think this also ended up fixing a bug in the
s390x backend where `bxor_not x y` was actually translated as `bnot
(bxor x y)` by accident given the test update changes.

* Simplify not-fused operands for riscv64

Looks like some delegated-to rules have special-cases for "if this
feature is enabled use the fused instruction" so move the clause for
testing the feature up to the lowering phase to help trigger other rules
if the feature isn't enabled. This should make the riscv64 backend more
consistent with how other backends are implemented.

* Remove B{and,or,xor}Not from cost of egraph metrics

These shouldn't ever reach egraphs now that they're legalized away.

* Add an egraph optimization for `x^-1 => ~x`

This adds a simplification node to translate xor-against-minus-1 to a
`bnot` instruction. This helps trigger various other optimizations in
the egraph implementation and also various backend lowering rules for
instructions. This is chiefly useful as wasm doesn't have a `bnot`
equivalent, so it's encoded as `x^-1`.

* Add a wasm test for end-to-end bitwise lowerings

Test that end-to-end various optimizations are being applied for input
wasm modules.

* Specifically don't self-update rustup on CI

I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.

* Cleanup some aarch64 lowering rules

Previously a 32/64 split was necessary due to the `ALUOp` being different
but that's been refactored away no so there's no longer any need for
duplicate rules.

* Narrow a x64 lowering rule

This previously made more sense when it was `band_not` and rarely used,
but be more specific in the type-filter on this rule that it's only
applicable to SIMD types with lanes.

* Simplify xor-against-minus-1 rule

No need to have the commutative version since constants are already
shuffled right for egraphs

* Optimize band-of-bnot when bnot is on the left

Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.

* Add commutative lowering rules

* Update cranelift/codegen/src/isa/x64/lower.isle

Co-authored-by: Jamey Sharp <jamey@minilop.net>

---------

Co-authored-by: Jamey Sharp <jamey@minilop.net>
2023-02-06 13:53:40 -06:00
Afonso Bordado
99c3936616 fuzzgen: Enable rotl for riscv64 (#5715)
It was fixed in #5611
2023-02-06 18:27:31 +00:00
Jamey Sharp
23e1d6b5e3 egraphs/cprop: Don't extend constants to i128 (#5717)
Fixes #5711.
2023-02-06 17:34:21 +00:00
wasmtime-publish
482f541101 Bump Wasmtime to 7.0.0 (#5712)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2023-02-06 09:10:19 -06:00
Jamey Sharp
97381792ac Generalize u/sextend constant folding to all types (#5706)
Also move these optimization rules to cprop.isle; it's where all the
other similar rules are.

Like the other cprop rules, these can subsume any other rules. We can't
do better than reducing an expression to a constant.

The new i64_sextend_imm64 and u64_uextend_imm64 constructors are useful
helpers to clean up other code. I applied them to `imm64_icmp` while I
was here, as well as using the existing `ty_mask` helper to clean up
`imm64_masked`.
2023-02-03 17:29:21 -08:00
Trevor Elliott
6d8f2be9e1 Use andn for band_not when bmi1 is present (#5701)
We can use the andn instruction for the lowering of band_not on x64 when bmi1 is available.
2023-02-03 16:23:18 -08:00
Nick Fitzgerald
e18d4cb711 Cranelift: Introduce support for return_call in the interpreter (#5697)
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-03 15:53:54 -08:00
Nick Fitzgerald
72c8513411 Cranelift: Correctly wrap shifts in constant propagation (#5695)
Fixes #5690
Fixes #5696

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-03 00:12:57 +00:00
Jun Ryung Ju
9cd4146939 Implemented b{and,or,xor}_not bitops for ty_int_ref_scalar_64 type. (#5604)
* Implemented `b{and,or,xor}_not` bitops for ty_int_ref_scalar_64 type.

* Added tests.
2023-02-01 21:57:18 -08:00
Jamey Sharp
ac4d28f4dd Constant-fold icmp instructions (#5666)
We found examples of icmp instructions with both operands constant in
spidermonkey.wasm.
2023-02-01 21:55:36 +00:00
Nick Fitzgerald
bdfb746548 Cranelift: Introduce the return_call and return_call_indirect instructions (#5679)
* Cranelift: Introduce the `tail` calling convention

This is an unstable-ABI calling convention that we will eventually use to
support Wasm tail calls.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>

* Cranelift: Introduce the `return_call` and `return_call_indirect` instructions

These will be used to implement tail calls for Wasm and any other language
targeting CLIF. The `return_call_indirect` instruction differs from the Wasm
instruction of the same name by taking a native address callee rather than a
Wasm function index.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>

* Cranelift: Implement verification rules for `return_call[_indirect]`

They must:

* have the same return types between the caller and callee,
* have the same calling convention between caller and callee,
* and that calling convention must support tail calls.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>

* cargo fmt

---------

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
2023-02-01 21:20:35 +00:00
Nick Fitzgerald
ffbbfbffce Cranelift: Rewrite or(and(x, y), not(y)) => or(x, not(y)) again (#5684)
This rewrite was introduced in #5676 and then reverted in #5682 due to a footgun
where we accidentally weren't actually checking the `y == !z` precondition. This
commit fixes the precondition check. It also fixes the arithmetic to be
correctly masked to the value type's width.

This reverts commit 268f6bfc1d.
2023-02-01 20:53:22 +00:00
yuyang
cb3b6c621f fix rotl.i16 with i128 shift value. (#5611)
* fix issue 5523.

* fix.

* add missing issue file.

* fix issue.

* fix duplicate shamt_128.

* issue 5523 add test target,and fix some wrong comment.

* fix output file.

* enable llvm_abi_extensions for regression test file.
2023-02-01 03:44:13 +00:00
Trevor Elliott
268f6bfc1d Revert "Cranelift: Rewrite or(and(x, y), not(y)) => or(x, not(y)) (#5676)" (#5682)
This reverts commit 8c9eb9939b.

Fixes #5680
2023-02-01 02:53:23 +00:00
yuyang
0c66a1bba7 Fix issue 5528 (#5605)
* fix parameter error.

* fix float convert to i8 and i16   should extract sign bit.

* add missing regression test file.

* using tmp register.

* float convert i8 will consume more instructions.

* fix worse inst emit size.

* fix worst_case_size.
2023-01-31 15:37:36 -08:00
Nick Fitzgerald
8c9eb9939b Cranelift: Rewrite or(and(x, y), not(y)) => or(x, not(y)) (#5676)
Co-authored-by: Rainy Sinclair <844493+itsrainy@users.noreply.github.com>
2023-01-31 22:44:45 +00:00
Trevor Elliott
e82995f03c Add a convenience function for displaying a BlockCall (#5677)
Add a display method to BlockCall that returns a std::fmt::Displayable result. Rework the display code in the write module of cranelift-codegen to use this method instead.
2023-01-31 14:26:10 -08:00
Nick Fitzgerald
253e28ca4f Cranelift: Rewrite (x>>k)<<k into masking off the bottom k bits (#5673)
* Cranelift: Rewrite `(x>>k)<<k` into masking off the bottom `k` bits

* Add a runtest for exercising our rewrite of `(x >> k) << k` into masking
2023-01-31 21:11:12 +00:00
Alex Crichton
7f2c8e6344 Fix some warnings on nightly Rust (#5668)
* Fix some warnings on nightly Rust

Cargo is warning about the usage of workspace dependencies where the
workspace declaration does not mention `default-features` but the
dependency mentions `default-features`, so this explicitly turns off
default features for `cranelift-codegen` at the workspace level and
removes the explicit `default-features = false` at the manifest levels.

* Explicitly enable default feature in wasmtime

* Enable another feature
2023-01-31 20:54:58 +00:00