As with the earlier change that exposed the `DataFlowGraph::insts` field through a restrictive newtype, expose `DataFlowGraph::blocks` through an interface that allows only a restricted set of operations. Making this field public lets us avoid a re-match during SSA construction, and simplifies the implementation of adding a block argument to a block referenced by a `br_table` instruction.
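As a rough illustration of the pattern (names here are illustrative, not the exact upstream API), the newtype wraps the underlying `PrimaryMap` and re-exports only the operations callers are allowed to use:

```rust
use cranelift_entity::{entity_impl, PrimaryMap};

/// Stub entity for this sketch; the real `Block` lives in `ir::entities`.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct Block(u32);
entity_impl!(Block);

#[derive(Default)]
pub struct BlockData { /* block parameters, etc. */ }

/// Restrictive newtype: the map itself stays private.
pub struct Blocks(PrimaryMap<Block, BlockData>);

impl Blocks {
    /// The only way to create a block.
    pub fn add(&mut self) -> Block {
        self.0.push(BlockData::default())
    }

    /// Read-only access to a block's data.
    pub fn get(&self, block: Block) -> Option<&BlockData> {
        self.0.get(block)
    }
}
```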
We don't need to spend time going through the GVN map to dedup a
newly-constructed `iconst 0` when we already matched that value on the
left-hand side of these rules.
Also, mark these rules as subsuming any others since we can't do better
than reducing an expression to a constant.
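As a sketch of the shape involved (adapted from the style of these rules; the committed bodies may differ), the left-hand side binds the matched constant so the right-hand side can return it directly:

```
;; Reuse the matched `iconst 0` instead of constructing (and then
;; GVN-dedup'ing) a fresh one; `subsume` records that nothing can beat
;; a constant result.
(rule (simplify (imul ty zero @ (iconst ty (u64_from_imm64 0)) _))
      (subsume zero))
```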
I was playing around with Souper recently on some Wasm modules I had lying around, and these are some optimization opportunities that popped out which seemed easy enough to add to the egraph-based optimizations.
When investigating #5716, I found that rematerialization of a `call`, in
addition to blowing up for other reasons, caused aliasing of the varargs
list (the `EntityList` in the `ListPool`), such that editing the args of
the second copy of the call instruction inadvertently updated the first
as well.
This PR modifies `DataFlowGraph::clone_inst` so that it always clones
the varargs list if present. This shouldn't have any functional impact
on Cranelift today, because we don't rematerialize any instructions with
varargs; but it's important to get it right to avoid a bug later!
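A minimal sketch of why the clone must be deep (simplified signatures): an `EntityList` is a small index into a shared `ListPool`, so a plain struct copy aliases the same backing storage.

```rust
use cranelift_codegen::ir::Value;
use cranelift_entity::{EntityList, ListPool};

/// Build a fresh list that owns its own storage, so edits to one copy
/// can't affect the other.
fn clone_value_list(
    list: &EntityList<Value>,
    pool: &mut ListPool<Value>,
) -> EntityList<Value> {
    let values = list.as_slice(pool).to_vec();
    EntityList::from_slice(&values, pool)
}
```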
Branch instructions no longer have overlapping behavior, so we can remove `analyze_branch` and instead match on the `InstructionData` directly.
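A sketch of the direct-match style (variant and field names assumed to match the current `InstructionData`):

```rust
use cranelift_codegen::ir::{DataFlowGraph, Inst, InstructionData};

/// Visit a branch's target blocks without an `analyze_branch` summary.
fn visit_branch_targets(dfg: &DataFlowGraph, inst: Inst) {
    match &dfg.insts[inst] {
        InstructionData::Jump { destination, .. } => {
            let _target = destination.block(&dfg.value_lists);
        }
        InstructionData::Brif { blocks, .. } => {
            // Two BlockCalls: the then- and else-destinations.
            for call in blocks {
                let _target = call.block(&dfg.value_lists);
            }
        }
        _ => {}
    }
}
```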
Co-authored-by: Jamey Sharp <jamey@minilop.net>
In the provided test case in #5716, the result of a call was then
added to 0. We have a rewrite rule that sets the remat-bit on any add
of a value and a constant, because these frequently appear (e.g. from
address offset calculations) and this can frequently reduce register
pressure (one long-lived base vs. many long-lived base+offset values).
Separately, we have an algebraic rule that `x+0` rewrites to `x`.
The result of this was that we had an eclass with the remat bit set on
the add, but the add was also union'd into the call. We pick the
latter during extraction, because it's cheaper not to do the add at
all; but we still get the remat bit, and try to remat a call (!),
which blows up later.
This PR fixes the logic to look up the "best value" for a value (i.e.,
whatever extraction determined), and look up the remat bit on *that*
node, not the canonical node.
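A hypothetical sketch of the corrected lookup (types and field names invented for illustration; the real code lives in the egraph extraction/elaboration machinery):

```rust
use std::collections::HashSet;

struct EgraphCtx {
    /// Per-value: the node extraction chose as cheapest for its eclass.
    best_value: Vec<u32>,
    /// Values whose defining node was flagged for rematerialization.
    remat_values: HashSet<u32>,
}

fn should_remat(ctx: &EgraphCtx, value: u32) -> bool {
    // Consult the remat bit on extraction's pick, not on the canonical
    // (lowest-numbered) member of the eclass, so a remat-marked iadd
    // union'd with a call can't cause us to remat the call.
    let best = ctx.best_value[value as usize];
    ctx.remat_values.contains(&best)
}
```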
(Why did the canonical node become the iadd and not the call? Because
the former had a lower value-number, as an accident of IR
construction; we don't impose any requirements on the input CLIF's
value-number ordering, and I don't think this breaks any of the
important acyclic properties, even though there is technically a
dependence from a lower-numbered to a higher-numbered node. In essence
one can think of them as having "virtual numbers" in any true
topologically-sorted order, and the only place the actual integer
indices matter should be in choosing the "canonical ID", which is just
used for dedup'ing, modulo this bug.)
Fixes #5716.
Instead of identifying unused branch tables by looking for unused blocks inside them, track used branch tables while traversing reachable blocks. This introduces an extra allocation of an `EntitySet` to track the used jump tables, but as those are few and this function runs once per `ir::Function`, the allocation seems reasonable.
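A sketch of the traversal (assuming the current `InstructionData` shape for `br_table`):

```rust
use cranelift_codegen::ir::{Block, Function, InstructionData, JumpTable};
use cranelift_entity::EntitySet;

/// Collect the jump tables referenced by instructions in reachable blocks.
fn used_jump_tables(
    func: &Function,
    reachable: impl Iterator<Item = Block>,
) -> EntitySet<JumpTable> {
    let mut used = EntitySet::new();
    for block in reachable {
        for inst in func.layout.block_insts(block) {
            if let InstructionData::BranchTable { table, .. } = &func.dfg.insts[inst] {
                used.insert(*table);
            }
        }
    }
    used
}
```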
I audited the egraph "algebraic" optimization rules for any which
construct an `iconst` on the right-hand side of the rule. In these cases
we need to constrain the type passed to `iconst` to be both `fits_in_64`
and `ty_int`, because `iconst` is not defined on other types.
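For illustration, a rule constructing a constant on its right-hand side takes roughly this shape (a sketch using the helpers named above; the actual rules differ in detail):

```
;; Only build an `iconst` when the result type is an integer no wider
;; than 64 bits; `iconst` is undefined for other types.
(rule (simplify (isub (fits_in_64 (ty_int ty)) x x))
      (subsume (iconst ty (imm64 0))))
```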
* Remove trailing whitespace in `lower.isle` files
* Legalize the `band_not` instruction into a simpler form
This commit legalizes the `band_not` instruction into `band`-of-`bnot`, i.e. two instructions. This is intended to assist with egraph-based optimizations, where the `band_not` instruction no longer has to be specifically included in other bit-operation patterns.
Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.
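A sketch of the rewrite using the cursor-based legalizer idiom (simplified; the real pass dispatches on opcode):

```rust
use cranelift_codegen::cursor::{Cursor, FuncCursor};
use cranelift_codegen::ir::{Function, Inst, InstBuilder, Value};

/// Replace `band_not x, y` with `band x, (bnot y)` in place.
fn legalize_band_not(func: &mut Function, inst: Inst, x: Value, y: Value) {
    let mut pos = FuncCursor::new(func).at_inst(inst);
    pos.use_srcloc(inst);
    let not_y = pos.ins().bnot(y);
    pos.func.dfg.replace(inst).band(x, not_y);
}
```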
* Legalize `bor_not` into components
Same as prior commit, but for the `bor_not` instruction.
* Legalize `bxor_not` into `bxor`-of-`bnot`
Same as the prior commits. I think this also ended up fixing a bug in the s390x backend where `bxor_not x y` was actually translated as `bnot (bxor x y)` by accident, judging by the test updates.
* Simplify not-fused operands for riscv64
Some delegated-to rules had special cases of the form "if this feature is enabled, use the fused instruction", so move the feature test up into the lowering rules to help trigger other rules when the feature isn't enabled. This should make the riscv64 backend more consistent with how other backends are implemented.
* Remove `B{and,or,xor}Not` from the egraph cost metrics
These opcodes should never reach the egraph now that they're legalized away.
* Add an egraph optimization for `x^-1 => ~x`
This adds a simplification rule to translate xor-against-minus-1 into a `bnot` instruction. This helps trigger various other optimizations in the egraph implementation, as well as various backend lowering rules. It's chiefly useful because Wasm has no `bnot` equivalent, so `bnot` is encoded as `x^-1`.
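A hedged sketch of such a rule (written here with the `i64_sextend_imm64` helper that appears later in these notes; the committed rule may differ):

```
;; Rewrite `x ^ -1` into `bnot x` when the constant sign-extends to -1
;; at the value's type.
(rule (simplify (bxor ty x (iconst ty k)))
      (if-let -1 (i64_sextend_imm64 ty k))
      (bnot ty x))
```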
* Add a wasm test for end-to-end bitwise lowerings
Test end-to-end that various optimizations are applied to input Wasm modules.
* Specifically don't self-update rustup on CI
I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.
* Cleanup some aarch64 lowering rules
Previously a 32/64-bit split was necessary because the `ALUOp` differed, but that has since been refactored away, so there's no longer any need for duplicate rules.
* Narrow a x64 lowering rule
This rule previously made more sense when it was `band_not` and rarely used, but now its type filter is more specific: it only applies to SIMD types with lanes.
* Simplify xor-against-minus-1 rule
There's no need for the commutative version, since constants are already canonicalized to the right-hand side for egraphs.
* Optimize band-of-bnot when bnot is on the left
Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.
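A sketch of one such canonicalization (illustrative; the same shape applies to `bor` and `bxor`):

```
;; If the `bnot` shows up on the left, swap it to the right so backend
;; lowerings only need to match one operand order.
(rule (simplify (band ty (bnot ty x) y))
      (band ty y (bnot ty x)))
```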
* Add commutative lowering rules
* Update cranelift/codegen/src/isa/x64/lower.isle
Co-authored-by: Jamey Sharp <jamey@minilop.net>
---------
Co-authored-by: Jamey Sharp <jamey@minilop.net>
Also move these optimization rules to cprop.isle; it's where all the
other similar rules are.
Like the other cprop rules, these can subsume any other rules. We can't
do better than reducing an expression to a constant.
The new `i64_sextend_imm64` and `u64_uextend_imm64` constructors are useful helpers to clean up other code. I applied them to `imm64_icmp` while I was here, as well as using the existing `ty_mask` helper to clean up `imm64_masked`.
* Cranelift: Introduce the `tail` calling convention
This is an unstable-ABI calling convention that we will eventually use to
support Wasm tail calls.
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* Cranelift: Introduce the `return_call` and `return_call_indirect` instructions
These will be used to implement tail calls for Wasm and any other language
targeting CLIF. The `return_call_indirect` instruction differs from the Wasm
instruction of the same name by taking a native address callee rather than a
Wasm function index.
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* Cranelift: Implement verification rules for `return_call[_indirect]`
They must (a sketch of these checks follows the list):
* have the same return types between the caller and callee,
* have the same calling convention between the caller and callee,
* and use a calling convention that supports tail calls.
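A hedged sketch of the checks (simplified to string errors; the real verifier reports structured errors, and the tail-call-support test here is an assumption):

```rust
use cranelift_codegen::ir::Signature;
use cranelift_codegen::isa::CallConv;

fn check_return_call(caller: &Signature, callee: &Signature) -> Result<(), String> {
    if caller.returns != callee.returns {
        return Err("caller and callee must have the same return types".into());
    }
    if caller.call_conv != callee.call_conv {
        return Err("caller and callee must use the same calling convention".into());
    }
    // Assumption for this sketch: only the new `tail` convention
    // supports tail calls.
    if caller.call_conv != CallConv::Tail {
        return Err("calling convention must support tail calls".into());
    }
    Ok(())
}
```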
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* cargo fmt
---------
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
This rewrite was introduced in #5676 and then reverted in #5682 due to a footgun
where we accidentally weren't actually checking the `y == !z` precondition. This
commit fixes the precondition check. It also fixes the arithmetic to be
correctly masked to the value type's width.
This reverts commit 268f6bfc1d.
Add a `display` method to `BlockCall` that returns a value implementing `std::fmt::Display`. Rework the display code in the `write` module of cranelift-codegen to use this method instead.
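A self-contained sketch of the pattern (stand-in types; the real adapter borrows the value pools it needs in order to print block arguments):

```rust
use core::fmt;

/// Display adapter: formats a block target plus its arguments.
struct DisplayBlockCall<'a> {
    block: u32,      // stand-in for the target `Block`
    args: &'a [u32], // stand-in for the argument `Value`s
}

impl fmt::Display for DisplayBlockCall<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "block{}", self.block)?;
        if !self.args.is_empty() {
            write!(f, "(")?;
            for (i, arg) in self.args.iter().enumerate() {
                if i > 0 {
                    write!(f, ", ")?;
                }
                write!(f, "v{arg}")?;
            }
            write!(f, ")")?;
        }
        Ok(())
    }
}
```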
Souper requires an `i1` condition value; we don't, and instead implicitly check against 0. We were truncating conditions, but should actually be doing the comparison against `0`.
Remove the boolean parameters from the instruction builder functions, as they were only ever used with `true`. Additionally, change the `returns` and `branches` functions to imply `terminates_block`.
Most of these optimizations are in the egraph `cprop.isle` rules now,
making a separate crate unnecessary.
Also, I think the `udiv` optimizations here are straight-up wrong (doing signed instead of unsigned division, and panicking instead of preserving traps on division by zero), so I'm guessing this crate isn't seriously used anywhere.
At the least, bjorn3 confirms that cg_clif doesn't use this, and I've
verified that Wasmtime doesn't either.
Closes #1090.
Improve the generated code for unordered floating-point comparisons by negating the comparison and inverting the branches. This allows us to pick the unordered versions, which generate significantly better code.
Add a conditional branch instruction with two targets: `brif`. This instruction will eventually replace `brz` and `brnz`, as it encompasses the behavior of both.
This PR also changes the `InstructionData` layout for instruction formats that hold `BlockCall` values, taking the same approach we use for `Value` arguments. This allows `branch_destination` to return a slice of the `BlockCall` values held in the instruction, rather than requiring that we pattern-match on `InstructionData` to fetch the then/else blocks.
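A sketch of consuming the slice-based API (method shape taken from the description above; later refactors may have changed the exact signature):

```rust
use cranelift_codegen::ir::{DataFlowGraph, Inst};

/// Count a branch's successors without matching on the instruction
/// format: one `BlockCall` for `jump`, two for `brif`.
fn successor_count(dfg: &DataFlowGraph, inst: Inst) -> usize {
    dfg.insts[inst].branch_destination().len()
}
```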
Function generation for fuzzing has been updated to generate uses of brif, and I've run the cranelift-fuzzgen target locally for hours without triggering any new failures.
This PR follows up on #5382 and #5391, which rebuilt the egraph-based optimization framework to be more performant, by enabling it by default.
Based on performance results in #5382 (my measurements on SpiderMonkey and bjorn3's independent confirmation with cg_clif), it seems that this is reasonable to enable. Now that we have been fuzzing compiler configurations with egraph opts (#5388) for 6 weeks, having fixed a few fuzzbugs that came up (#5409, #5420, #5438) and subsequently received no further reports from OSS-Fuzz, I believe it is stable enough to rely on.
This PR enables `use_egraphs`, and also normalizes its meaning: previously it forced optimization (it basically meant "turn on the egraph optimization machinery"), now it runs egraph opts if the opt level indicates (it means "use egraphs to optimize if we are going to optimize"). The conditionals in the top-level pass driver are a little subtle, but will get simpler once we can remove the non-egraph path (which we plan to do eventually!).
Fixes #5181.
* Support mergeable-but-side-effectful (idempotent) operations in the egraph's GVN.
This mirrors the similar change made in #5534.
* Add tests for the egraph case.
Add a new type `BlockCall` that represents the pair of a block name with arguments to be passed to it. (The mnemonic here is that it looks a bit like a function call.) Rework the implementation of `jump`, `brz`, and `brnz` to use `BlockCall` instead of storing the block arguments as varargs in the instruction's `ValueList`.
To ensure that we process block arguments held in `BlockCall` values as well as the instruction's own arguments, three new functions have been introduced on `DataFlowGraph` that cover both sets of arguments (see the sketch after this list):
* `inst_values` — returns an iterator over the values used by the instruction, including block arguments
* `map_inst_values` — applies a function to each value used by the instruction, including block arguments
* `overwrite_inst_values` — overwrites all values used by the instruction, including block arguments, with values from an iterator
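For example, `map_inst_values` makes use-rewriting uniform across plain arguments and block arguments (closure-based signature assumed from the description above):

```rust
use cranelift_codegen::ir::{Function, Inst, Value};

/// Replace every use of `old` in `inst` with `new`, including uses
/// inside the instruction's `BlockCall` arguments.
fn replace_uses(func: &mut Function, inst: Inst, old: Value, new: Value) {
    func.dfg.map_inst_values(inst, |v| if v == old { new } else { v });
}
```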
Co-authored-by: Jamey Sharp <jamey@minilop.net>
This is a cleanup to help prepare for #5464.
Most of the diff is inlining the closure for `mark_all_uses_as_multiple`
which was only called once. That avoids some borrow-checker challenges.
The key change is that the former `push_args_on_stack` closure no longer actually pushes the iterator onto the stack, but just returns it. That way the closure doesn't need to name the stack's type, and it can also be reused in the `debug_assert!`.
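A simplified sketch of the shape of this refactor (types invented for illustration):

```rust
// The closure builds and returns the iterator; only the call sites touch
// the stack, so the closure never needs to name the stack's type.
fn walk(lists: &[Vec<u32>]) {
    let iter_args = |list: &Vec<u32>| list.clone().into_iter();

    let mut stack: Vec<std::vec::IntoIter<u32>> = Vec::new();
    for list in lists {
        // Reusable in assertions, since it has no side effects.
        debug_assert_eq!(iter_args(list).count(), list.len());
        stack.push(iter_args(list));
    }
    while let Some(mut frame) = stack.pop() {
        for _value in &mut frame {
            // ...process each value...
        }
    }
}
```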