wasmtime

Author	SHA1	Message	Date
Nick Fitzgerald	72c8513411	Cranelift: Correctly wrap shifts in constant propagation (#5695 ) Fixes #5690 Fixes #5696 Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2023-02-03 00:12:57 +00:00
Jun Ryung Ju	9cd4146939	Implemented `b{and,or,xor}_not` bitops for ty_int_ref_scalar_64 type. (#5604 ) * Implemented `b{and,or,xor}_not` bitops for ty_int_ref_scalar_64 type. * Added tests.	2023-02-01 21:57:18 -08:00
Jamey Sharp	ac4d28f4dd	Constant-fold icmp instructions (#5666 ) We found examples of icmp instructions with both operands constant in spidermonkey.wasm.	2023-02-01 21:55:36 +00:00
Nick Fitzgerald	bdfb746548	Cranelift: Introduce the `return_call` and `return_call_indirect` instructions (#5679 ) * Cranelift: Introduce the `tail` calling convention This is an unstable-ABI calling convention that we will eventually use to support Wasm tail calls. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * Cranelift: Introduce the `return_call` and `return_call_indirect` instructions These will be used to implement tail calls for Wasm and any other language targeting CLIF. The `return_call_indirect` instruction differs from the Wasm instruction of the same name by taking a native address callee rather than a Wasm function index. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * Cranelift: Implement verification rules for `return_call[_indirect]` They must: * have the same return types between the caller and callee, * have the same calling convention between caller and callee, * and that calling convention must support tail calls. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> * cargo fmt --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2023-02-01 21:20:35 +00:00
Nick Fitzgerald	ffbbfbffce	Cranelift: Rewrite `or(and(x, y), not(y)) => or(x, not(y))` again (#5684 ) This rewrite was introduced in #5676 and then reverted in #5682 due to a footgun where we accidentally weren't actually checking the `y == !z` precondition. This commit fixes the precondition check. It also fixes the arithmetic to be correctly masked to the value type's width. This reverts commit `268f6bfc1d`.	2023-02-01 20:53:22 +00:00
yuyang	cb3b6c621f	fix rotl.i16 with i128 shift value. (#5611 ) * fix issue 5523. * fix. * add missing issue file. * fix issue. * fix duplicate shamt_128. * issue 5523 add test target,and fix some wrong comment. * fix output file. * enable llvm_abi_extensions for regression test file.	2023-02-01 03:44:13 +00:00
Trevor Elliott	268f6bfc1d	Revert "Cranelift: Rewrite `or(and(x, y), not(y)) => or(x, not(y))` (#5676 )" (#5682 ) This reverts commit `8c9eb9939b`. Fixes #5680	2023-02-01 02:53:23 +00:00
yuyang	0c66a1bba7	Fix issue 5528 (#5605 ) * fix parameter error. * fix float convert to i8 and i16 should extract sign bit. * add missing regression test file. * using tmp register. * float convert i8 will consume more instructions. * fix worse inst emit size. * fix worst_case_size.	2023-01-31 15:37:36 -08:00
Nick Fitzgerald	8c9eb9939b	Cranelift: Rewrite `or(and(x, y), not(y)) => or(x, not(y))` (#5676 ) Co-authored-by: Rainy Sinclair <844493+itsrainy@users.noreply.github.com>	2023-01-31 22:44:45 +00:00
Nick Fitzgerald	253e28ca4f	Cranelift: Rewrite `(x>>k)<<k` into masking off the bottom `k` bits (#5673 ) * Cranelift: Rewrite `(x>>k)<<k` into masking off the bottom `k` bits * Add a runtest for exercising our rewrite of `(x >> k) << k` into masking	2023-01-31 21:11:12 +00:00
Nick Fitzgerald	7aa240e0f2	Cranelift: constant propagate shifts (#5671 ) Thanks to Souper for pointing out we weren't doing this!	2023-01-31 12:06:53 -08:00
Nick Fitzgerald	c9d1c068bc	Cranelift: Add egraph rule to rewrite `x * C ==> x << log2(C)` when `C` is a power of two (#5647 )	2023-01-31 18:04:17 +00:00
Trevor Elliott	a5698cedf8	cranelift: Remove brz and brnz (#5630 ) Remove the brz and brnz instructions, as their behavior is now redundant with brif.	2023-01-30 20:34:56 +00:00
yuyang	77cf547f41	fix issue 5569. (#5612 ) * add regression test file. * fix issue 5569. * enable code length check.	2023-01-30 10:01:33 -08:00
Jamey Sharp	915801551b	Delete old cranelift-preopt crate (#5642 ) Most of these optimizations are in the egraph `cprop.isle` rules now, making a separate crate unnecessary. Also I think the `udiv` optimizations here are straight-up wrong (doing signed instead of unsigned division, and panicking instead of preserving traps on division by zero) so I'm guessing this crate isn't seriously used anywhere. At the least, bjorn3 confirms that cg_clif doesn't use this, and I've verified that Wasmtime doesn't either. Closes #1090.	2023-01-26 21:32:33 +00:00
Trevor Elliott	7926808e8e	riscv64: improve unordered comparison generated code (#5636 ) Improve the generated code for unordered floating point comparisons by negating the comparison and inverting the branches. This allows us to pick the unordered versions, which generate significantly better code.	2023-01-25 17:28:28 -08:00
Trevor Elliott	b58a197d33	cranelift: Add a conditional branch instruction with two targets (#5446 ) Add a conditional branch instruction with two targets: brif. This instruction will eventually replace brz and brnz, as it encompasses the behavior of both. This PR also changes the InstructionData layout for instruction formats that hold BlockCall values, taking the same approach we use for Value arguments. This allows branch_destination to return a slice to the BlockCall values held in the instruction, rather than requiring that we pattern match on InstructionData to fetch the then/else blocks. Function generation for fuzzing has been updated to generate uses of brif, and I've run the cranelift-fuzzgen target locally for hours without triggering any new failures.	2023-01-24 14:37:16 -08:00
Jamey Sharp	fef9f64d2c	x86: Test paired udiv/urem (#5573 ) Ideally these pairs of CLIF instructions should emit a single x86 instruction, but they don't today. This test will tell us if somebody fixes that. Similar tests might make sense for imul/umulhi as well as signed versions, but I haven't tried that.	2023-01-23 11:44:27 -08:00
yuyang	7e10bd1f58	fix issue #5497 #5524 #5526 . (#5595 ) * fix issue 5497. * fix issue 5524 * fix issue 5497 5524 5526. * some clif change because of reg alloc.	2023-01-20 14:06:26 -08:00
yuyang	299b8187f8	fix issue 5525. (#5603 ) * fix issue 5525. * reg alloc changed.	2023-01-20 09:53:54 -08:00
Chris Fallin	1faff8c2ce	Enable egraph-based optimization by default. (#5587 ) This PR follows up on #5382 and #5391, which rebuilt the egraph-based optimization framework to be more performant, by enabling it by default. Based on performance results in #5382 (my measurements on SpiderMonkey and bjorn3's independent confirmation with cg_clif), it seems that this is reasonable to enable. Now that we have been fuzzing compiler configurations with egraph opts (#5388) for 6 weeks, having fixed a few fuzzbugs that came up (#5409, #5420, #5438) and subsequently received no further reports from OSS-Fuzz, I believe it is stable enough to rely on. This PR enables `use_egraphs`, and also normalizes its meaning: previously it forced optimization (it basically meant "turn on the egraph optimization machinery"), now it runs egraph opts if the opt level indicates (it means "use egraphs to optimize if we are going to optimize"). The conditionals in the top-level pass driver are a little subtle, but will get simpler once we can remove the non-egraph path (which we plan to do eventually!). Fixes #5181.	2023-01-19 15:46:53 -08:00
Chris Fallin	704f5a5772	Cranelift/egraph mid-end: support merging effectful-but-idempotent ops (#5594 ) * Support mergeable-but-side-effectful (idempotent) operations in general in the egraph's GVN. This mirrors the similar change made in #5534. * Add tests for egraph case.	2023-01-19 11:51:19 -08:00
Kevin Rizzo	da03ff47f1	winch: Adding support for integration tests (#5588 ) * Adding in the foundations for Winch `filetests` This commit adds two new crates into the Winch workspace: `filetests` and `test-macros`. The intent is to mimic the structure of Cranelift `filetests`, but in a simpler way. * Updates to documentation This commit adds a high-level document to outline how to test Winch through the `winch-tools` utility. It also updates some inline documentation which gets propagated to the CLI. * Updating test-macro to use a glob instead of only a flat directory	2023-01-19 07:34:48 -05:00
Afonso Bordado	3ae373b073	cranelift: Disable select rule for i128 types on riscv64 (#5584 ) * fuzzgen: Disable some selects for RISC-V * cranelift: Force disable gen_select_reg rule for i128 values	2023-01-17 10:01:23 -08:00
Afonso Bordado	82494661c1	cranelift: Add `atomic_{load,store}` and `fence` to the interpreter (#5503 ) * cranelift: Add `fence` to interpreter * cranelift: Add `atomic_{load,store}` to the interpreter * fuzzgen: Add `atomic_{load,store}` * Update cranelift/fuzzgen/src/function_generator.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * fuzzgen: Use type size as the alignment size. Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-01-12 08:36:04 -08:00
Afonso Bordado	9556cb190f	cranelift: Forbid argument extensions for floats and SIMD vectors (#5536 ) * fuzzgen: Generate argument extensions only for integer arguments * cranelift: Add verifier check for argument extensions	2023-01-10 10:26:30 -08:00
Alexa VanHattum	44913825b5	cranelift: fix register for `srem.i8` on x86_64 (#5540 ) * Change register written to in specific srem case. Add regression test as filetest case. Fixes #5470 * Add another test case, newline * Update comment	2023-01-06 22:18:16 +00:00
Sam Sartor	1efa3d6f8b	Add `clif-util compile` option to output object file (#5493 ) * add clif-util compile option to output object file * switch from a box to a borrow * update objectmodule tests to use borrowed isa * put targetisa into an arc	2023-01-06 12:53:48 -08:00
uint256_t	b00455135e	Cranelift: Implement 'iabs' for scalar types on x86_64 (#5527 ) * Implement 'iabs' for scalar types on x86_64 * Small fix	2023-01-05 21:33:12 -08:00
Nick Fitzgerald	c50bdf600e	Cranelift: GVN all idempotently trapping but otherwise pure instructions (#5534 )	2023-01-05 15:08:06 -08:00
Afonso Bordado	ee6a909ccb	cranelift: Cleanup SIMD `icmp` tests (#5530 ) * cranelift: Enable more SIMD tests * cranelift: Reorganize icmp tests * cranelift: Enable SIMD icmp tests for unsigned ops * cranelift: Cleanup trailing newlines	2023-01-05 09:19:03 -08:00
Nick Fitzgerald	f4a2d5337a	Cranelift: GVN `uadd_overflow_trap` (#5520 ) * Switch duplicate loads w/ dynamic memories test to `min_size = 0` This test was accidentally hitting a special case for bounds checks for when we know that `offset + access_size < min_size` and can skip some steps. This commit changes the `min_size` of the memory to zero so that we are forced to do fully general bounds checks. * Cranelift: Mark `uadd_overflow_trap` as okay for GVN Although this improves our test sequence for duplicate loads with dynamic memories, it unfortunately doesn't have any effect on sightglass benchmarks: ``` instantiation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [34448 35607.23 37158] gvn_uadd_overflow_trap.so [34566 35734.05 36585] main.so instantiation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [44101 60449.62 92712] gvn_uadd_overflow_trap.so [44011 60436.37 92690] main.so instantiation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [35595 36675.72 38153] gvn_uadd_overflow_trap.so [35440 36670.42 37993] main.so compilation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [17370195 17405125.62 17471222] gvn_uadd_overflow_trap.so [17369324 17404859.43 17470725] main.so execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [7055720520 7055886880.32 7056265930] gvn_uadd_overflow_trap.so [7055719554 7055843809.33 7056193289] main.so compilation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [683589861 683767276.00 684098366] gvn_uadd_overflow_trap.so [683590024 683767998.02 684097885] main.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. 
[46436883 46437135.10 46437823] gvn_uadd_overflow_trap.so [46436883 46437087.67 46437785] main.so compilation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [126522461 126565812.58 126647044] gvn_uadd_overflow_trap.so [126522176 126565757.75 126647522] main.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [653010531 653010533.03 653010544] gvn_uadd_overflow_trap.so [653010531 653010533.18 653010537] main.so ``` * cranelift-codegen-meta: Rename `side_effects_okay_for_gvn` to `side_effects_idempotent` * cranelift-filetests: Ensure there is a trailing newline for blessed Wasm tests	2023-01-04 22:03:16 -08:00
Nick Fitzgerald	937601c7c3	Cranelift: GVN spectre guards and run redundant load elimination twice (#5517 ) * Cranelift: Make spectre guards GVN-able While these instructions have a side effect that is otherwise invisible to the optimizer, the side effect in question is idempotent, so it can be de-duplicated by GVN. * Cranelift: Run redundant load replacement and GVN twice This allows us to actually replace redundant Wasm loads with dynamic memories. While this improves our hand-crafted test sequences, it doesn't seem to have any improvement on sightglass benchmarks run with dynamic memories, however it also isn't a hit to compilation times, so seems generally good to land anyways: ``` $ cargo run --release -- benchmark -e ~/scratch/once.so -e ~/scratch/twice.so -m insts-retired --processes 20 --iterations-per-process 3 --engine-flags="--static-memory-maximum-size 0" -- benchmarks/default.suite compilation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [683595240 683768610.53 684097577] once.so [683597068 700115966.83 1664907164] twice.so instantiation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [44107 60411.07 92785] once.so [44138 59552.32 92097] twice.so compilation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [17369916 17404839.78 17471458] once.so [17369935 17625713.87 30700150] twice.so compilation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [126523640 126566170.80 126648265] once.so [126523076 127174580.30 163145149] twice.so instantiation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [34569 35686.25 36513] once.so [34651 35749.97 36953] twice.so instantiation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. 
[35146 36639.10 37707] once.so [34472 36580.82 38431] twice.so execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [7055720115 7055841324.82 7056180024] once.so [7055717681 7055877095.85 7056225217] twice.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [46436881 46437081.28 46437691] once.so [46436883 46437127.68 46437766] twice.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [653010530 653010533.27 653010539] once.so [653010531 653010532.95 653010538] twice.so ```	2023-01-04 20:05:43 +00:00
Trevor Elliott	b2d5afdf83	riscv64: Implement fcmp in ISLE (#5512 ) Rework the compilation of fcmp in the riscv64 backend to be in ISLE, removing the need for the dedicated Fcmp instruction. This change is motivated by #5500, which showed that the riscv64 backend was generating branch instructions in the middle of a basic block. We can't remove lower_br_fcmp quite yet as it's used in a few places in the emit module, but it's now no longer reachable from the ISLE lowerings. Fixes #5500	2023-01-04 11:52:00 -08:00
Nick Fitzgerald	d1920f5a2d	cranelift: Add wasm tests for duplicate loads (#5514 ) * cranelift-filetests: Add the ability to test optimized CLIF in Wasm tests * cranelift: Add Wasm tests for identical loads, back to back	2023-01-04 18:52:32 +00:00
Afonso Bordado	52ba72f341	riscv64: Fix masking on `iabs` (#5505 ) * cranelift: Add `iabs.i128` runtest * riscv64: Fix incorrect extension in iabs When lowering iabs, we were accidentally comparing the unextended value; this caused the instruction to misbehave with certain top bits. This commit also adds a zbb lowering that does not use jumps.	2023-01-03 17:37:25 -08:00
Afonso Bordado	7e94704264	riscv64: Add masking for small types when lowering select (#5504 ) When lowering `select+icmp` we have an optimization that allows us to avoid materializing the icmp result. We were accidentally not masking the high bits for i8 and i16 in this case. Issue #5498 reported this as an illegal instruction but what was happening there was that the invalid select caused a division by zero.	2023-01-03 19:59:14 +00:00
Afonso Bordado	c9c7d4991c	riscv64: Fix br-table segfault with zero sized jump tables (#5508 ) We had an off-by-one bounds check error when checking if we should jump to the default block in a br-table. Instead of always jumping to the default block when we have a jump table with 0 targets, we would try to compute an offset past the end of the table. This sometimes would not crash, but it would crash if there was no block after the br_table, thus adding a cold block would cause a segfault. The actual fix is quite simple: do not count the default block as a jump table entry when computing the limits. This commit also does a bunch of cleanup and adds some comments to the br_table emission code.	2023-01-03 10:22:48 -08:00
Mrmaxmeier	fe992c2627	Cranelift: aarch64: lower umin.i64 and friends (#5495 ) * Cranelift: aarch64: lower umin.i64 and friends * fuzzgen: Enable integer-min/max for aarch64	2022-12-29 18:03:31 -08:00
Chris Fallin	03463458e4	Cranelift: fix branch-of-icmp/fcmp regression: look through `uextend`. (#5487 ) In #5031, we removed `bool` types from CLIF, using integers instead for "truthy" values. This greatly simplified the IR, and was generally an improvement. However, because x86's `SETcc` instruction sets only the low 8 bits of a register, we chose to use `i8` types as the result of `icmp` and `fcmp`, to avoid the need for a masking operation when materializing the result. Unfortunately this means that uses of truthy values often now have `uextend` operations, especially when coming from Wasm (where truthy values are naturally `i32`-typed). For example, where we previously had `(brz (icmp ...))`, we now have `(brz (uextend (icmp ...)))`. It's arguable whether or not we should switch to `i32` truthy values -- in most cases we can avoid materializing a value that's immediately used for a branch or select, so a mask would in most cases be unnecessary, and it would be a win at the IR level -- but irrespective of that, this change did regress our generated code quality: our backends had patterns for e.g. `(brz (icmp ...))` but not with the `uextend`, so we were always materializing truthy values. Many blocks thus ended with "cmp; setcc; cmp; test; branch" rather than "cmp; branch". In #5391 we noticed this and fixed it on x64, but it was a general problem on aarch64 and riscv64 as well. This PR introduces a `maybe_uextend` extractor that "looks through" uextends, and uses it where we consume truthy values, thus fixing the regression. This PR also adds compile filetests to ensure we don't regress again. The riscv64 backend has not been updated here because doing so appears to trigger another issue in its branch handling; fixing that is TBD.	2022-12-22 01:43:44 -08:00
Ayomide Bamidele	b47e644c3d	Remove vconcat and vsplit clif instructions (#5465 ) Fixes #5463. * remove vsplit instruction * remove vconcat instruction * remove unused half/double vector helper functions * remove unused operand constraints * delete + inline Type::half_vector method	2022-12-20 00:41:55 +00:00
Ayomide Bamidele	93ae9078c5	Implement vsplit in cranelift interpreter (#5462 ) * Add vsplit testfile * Add vsplit implementation	2022-12-16 23:14:56 +00:00
Chris Fallin	22439f7b39	support select_spectre_guard and select on i128 conditions on all platforms. (#5460 ) Fixes #5199. Fixes #5200. Fixes #5452. Fixes #5453. On riscv64, there is apparently an autoconversion from `ValueRegs` to `Reg` that takes just the low register [0], and removing this conversion causes 48 errors. As a result of this, `select` with an `i128` condition was silently miscompiling, testing only the low 64 bits. We should remove this autoconversion to ensure we aren't missing any other silent truncations, but for now this PR just adds the explicit `I128` logic for `select` / `select_spectre_guard`. [0] `d9fdbfd50e/cranelift/codegen/src/isa/riscv64/inst.isle (L1762)`	2022-12-16 14:18:22 -08:00
Nick Fitzgerald	c0b587ac5f	Remove heaps from core Cranelift, push them into `cranelift-wasm` (#5386 ) * cranelift-wasm: translate Wasm loads into lower-level CLIF operations Rather than using `heap_{load,store,addr}`. * cranelift: Remove the `heap_{addr,load,store}` instructions These are now legalized in the `cranelift-wasm` frontend. * cranelift: Remove the `ir::Heap` entity from CLIF * Port basic memory operation tests to .wat filetests * Remove test for verifying CLIF heaps * Remove `heap_addr` from replace_branching_instructions_and_cfg_predecessors.clif test * Remove `heap_addr` from readonly.clif test * Remove `heap_addr` from `table_addr.clif` test * Remove `heap_addr` from the simd-fvpromote_low.clif test * Remove `heap_addr` from simd-fvdemote.clif test * Remove `heap_addr` from the load-op-store.clif test * Remove the CLIF heap runtest * Remove `heap_addr` from the global_value.clif test * Remove `heap_addr` from fpromote.clif runtests * Remove `heap_addr` from fdemote.clif runtests * Remove `heap_addr` from memory.clif parser test * Remove `heap_addr` from reject_load_readonly.clif test * Remove `heap_addr` from reject_load_notrap.clif test * Remove `heap_addr` from load_readonly_notrap.clif test * Remove `static-heap-without-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` to generating `.wat` tests. * Remove `static-heap-with-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` over to `.wat` tests. * Remove more heap tests These will be subsumed by porting `make-heap-load-store-tests.sh` over to `.wat` tests. 
* Remove `heap_addr` from `simple-alias.clif` test * Remove `heap_addr` from partial-redundancy.clif test * Remove `heap_addr` from multiple-blocks.clif test * Remove `heap_addr` from fence.clif test * Remove `heap_addr` from extends.clif test * Remove runtests that rely on heaps Heaps are not a thing in CLIF or the interpreter anymore * Add generated load/store `.wat` tests * Enable memory-related wasm features in `.wat` tests * Remove CLIF heap from fcmp-mem-bug.clif test * Add a mode for compiling `.wat` all the way to assembly in filetests * Also generate WAT to assembly tests in `make-load-store-tests.sh` * cargo fmt * Reinstate `f{de,pro}mote.clif` tests without the heap bits * Remove undefined doc link * Remove outdated SVG and dot file from docs * Add docs about `None` returns for base address computation helpers * Factor out `env.heap_access_spectre_mitigation()` to a local * Expand docs for `FuncEnvironment::heaps` trait method * Restore f{de,pro}mote+load clif runtests with stack memory	2022-12-15 00:26:45 +00:00
Nick Fitzgerald	be710df237	Cranelift: Add `.wat` to assembly test support and generate Wasm load/store tests for all ISAs (#5439 ) * cranelift-filetest: Add the ability to test `.wat` to assembly * Make the load/store test case generator script use `.wat` tests And generate tests that exercise both Wasm-to-CLIF lowering and Wasm all the way to assembly. * Remove old versions of generated load/store tests * Add new generated load/store tests * Fix filename reference in script	2022-12-14 21:13:43 +00:00
Trevor Elliott	9dc4f1a83c	s390x: Move the value out of the casloop_val_reg with mov_preg (#5430 ) The casloop_emit function in the s390x backend was using the fixed non-allocatable register %r0 directly with move instructions, which produced a panic in the regalloc2 checker (#5425). This PR changes the casloop_result function to use mov_preg instead of copy_reg to fetch the result, as it's not viewed by regalloc2 as a move. Fixes #5425	2022-12-14 13:06:35 -08:00
Chris Fallin	8383e4b6bd	egraph opt rules: do `(icmp cc x x) == {0,1}` only for integer types. (#5438 ) We could do these for vectors too, in theory, but for now let's fix the bug by applying the equivalence only for integer types. Fixes #5437.	2022-12-14 19:50:42 +00:00
Ulrich Weigand	df923f18ca	Remove MachInst::gen_constant (#5427 ) * aarch64: constant generation cleanup Add support for MOVZ and MOVN generation via ISLE. Handle f32const, f64const, and nop instructions via ISLE. No longer call Inst::gen_constant from lower.rs. * riscv64: constant generation cleanup Handle f32const, f64const, and nop instructions via ISLE. * s390x: constant generation cleanup Fix rule priorities for "imm" term. Only handle 32-bit stack offsets; no longer use load_constant64. * x64: constant generation cleanup No longer call Inst::gen_constant from lower.rs or abi.rs. * Refactor LowerBackend::lower to return InstOutput No longer write to the per-insn output registers; instead, return an InstOutput vector of temp registers holding the outputs. This will allow calling LowerBackend::lower multiple times for the same instruction, e.g. to rematerialize constants. When emitting the primary copy of the instruction during lowering, writing to the per-insn registers is now done in lower_clif_block. As a result, the ISLE lower_common routine is no longer needed. In addition, the InsnOutput type and all code related to it can be removed as well. * Refactor IsleContext to hold a LowerBackend reference Remove the "triple", "flags", and "isa_flags" fields that are copied from LowerBackend to each IsleContext, and instead just hold a reference to LowerBackend in IsleContext. This will allow calling LowerBackend::lower from within callbacks in src/machinst/isle.rs, e.g. to rematerialize constants. To avoid having to pass LowerBackend references through multiple functions, eliminate the lower_insn_to_regs subroutines in those targets that still have them, and just inline into the main lower routine. This also eliminates lower_inst.rs on aarch64 and riscv64. Replace all accesses to the removed IsleContext fields by going through the LowerBackend reference. 
* Remove MachInst::gen_constant This addresses the problem described in issue https://github.com/bytecodealliance/wasmtime/issues/4426 that targets currently have to duplicate code to emit constants between the ISLE logic and the gen_constant callback. After the various cleanups in earlier patches in this series, the only remaining user of gen_constant is put_value_in_regs in Lower. This can now be removed, and instead constant rematerialization can be performed in the put_in_regs ISLE callback by simply directly calling LowerBackend::lower on the instruction defining the constant (using a different output register). Since the check for egraph mode is now no longer performed in put_value_in_regs, the Lower::flags member becomes obsolete. Care needs to be taken that other calls directly to the Lower::put_value_in_regs routine now handle the fact that no more rematerialization is performed. All such calls in target code already historically handle constants themselves. The remaining call site in the ISLE gen_call_common helper can be redirected to the ISLE put_in_regs callback. The existing target implementations of gen_constant are then unused and can be removed. (In some targets there may still be further opportunities to remove duplication between ISLE and some local Rust code - this can be left to future patches.)	2022-12-13 13:00:04 -08:00
Chris Fallin	92ce79366c	riscv64: remove `valueregs_2_reg` extractor. (#5426 ) This extractor had a side-effect of invoking `put_in_regs`, which is not supposed to be invoked until the pattern-matching commits to evaluating a rule right-hand side (i.e., cannot backtrack). In this case the side-effect was mostly benign (in theory it could have caused additional values to be computed needlessly), but in general we should be careful to keep side-effects out of the left-hand side to enable further optimizations and work on islec. The implicit conversion from `Value` to `Reg` turns out to be enough to make the rules in question work, so we can simply remove the use of the extractor in this case.	2022-12-13 11:47:20 -08:00
Trevor Elliott	a5ecb5e647	x64: Share a zero in the ushr translation on x64 to free up a register (#5424 ) Share a zero value in the translation of ushr for i128. This increases the lifetime of the value by a few instructions, and reduces the number of registers used in the translation by one, which seems like an acceptable trade-off.	2022-12-12 18:15:43 -08:00


1482 Commits
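
Several of the commits above (#5647, #5673, #5676/#5684) describe algebraic rewrites on integer values. The following is a minimal sketch, not Cranelift code, that checks the three identities on 8-bit values in plain Rust. The function names are hypothetical and chosen only for this illustration; the actual rules live in Cranelift's ISLE files and are not reproduced here.

```rust
/// `(x >> k) << k` clears the bottom `k` bits, i.e. it equals `x & (!0 << k)`.
/// Returns (original form, rewritten form). Hypothetical helper, `k` must be < 8.
fn shr_shl_as_mask(x: u8, k: u32) -> (u8, u8) {
    ((x >> k) << k, x & (u8::MAX << k))
}

/// `x * C` equals `x << log2(C)` when `C` is a power of two,
/// using wrapping (modulo 2^8) arithmetic on both sides.
fn mul_as_shift(x: u8, c: u8) -> (u8, u8) {
    assert!(c.is_power_of_two());
    // For a power of two, the number of trailing zeros is log2(C).
    (x.wrapping_mul(c), x << c.trailing_zeros())
}

/// `or(and(x, y), not(y))` equals `or(x, not(y))`.
fn or_and_not(x: u8, y: u8) -> (u8, u8) {
    ((x & y) | !y, x | !y)
}

fn main() {
    // Exhaustively compare both forms over all 8-bit inputs.
    for x in 0..=u8::MAX {
        for k in 0..8u32 {
            let (a, b) = shr_shl_as_mask(x, k);
            assert_eq!(a, b);
            let (a, b) = mul_as_shift(x, 1u8 << k);
            assert_eq!(a, b);
        }
        for y in 0..=u8::MAX {
            let (a, b) = or_and_not(x, y);
            assert_eq!(a, b);
        }
    }
    println!("all identities hold for 8-bit values");
}
```

An exhaustive check is only practical at small widths. It says nothing about how the rules guard their preconditions, which is what the revert in #5682 and the fix in #5684 were about.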