wasmtime

Author	SHA1	Message	Date
yuyang	0c66a1bba7	Fix issue 5528 (#5605 ) * fix parameter error. * fix float convert to i8 and i16 should extract sign bit. * add missing regression test file. * using tmp register. * float convert i8 will consume more instructions. * fix worse inst emit size. * fix worst_case_size.	2023-01-31 15:37:36 -08:00
Nick Fitzgerald	8c9eb9939b	Cranelift: Rewrite `or(and(x, y), not(y)) => or(x, not(y))` (#5676 ) Co-authored-by: Rainy Sinclair <844493+itsrainy@users.noreply.github.com>	2023-01-31 22:44:45 +00:00
Nick Fitzgerald	253e28ca4f	Cranelift: Rewrite `(x>>k)<<k` into masking off the bottom `k` bits (#5673 ) * Cranelift: Rewrite `(x>>k)<<k` into masking off the bottom `k` bits * Add a runtest for exercising our rewrite of `(x >> k) << k` into masking	2023-01-31 21:11:12 +00:00
Nick Fitzgerald	7aa240e0f2	Cranelift: constant propagate shifts (#5671 ) Thanks to Souper for pointing out we weren't doing this!	2023-01-31 12:06:53 -08:00
Nick Fitzgerald	c9d1c068bc	Cranelift: Add egraph rule to rewrite `x * C ==> x << log2(C)` when `C` is a power of two (#5647 )	2023-01-31 18:04:17 +00:00
Trevor Elliott	a5698cedf8	cranelift: Remove brz and brnz (#5630 ) Remove the brz and brnz instructions, as their behavior is now redundant with brif.	2023-01-30 20:34:56 +00:00
yuyang	77cf547f41	fix issue 5569. (#5612 ) * add regression test file. * fix issute5569. * enable code length check.	2023-01-30 10:01:33 -08:00
Jamey Sharp	915801551b	Delete old cranelift-preopt crate (#5642 ) Most of these optimizations are in the egraph `cprop.isle` rules now, making a separate crate unnecessary. Also I think the `udiv` optimizations here are straight-up wrong (doing signed instead of unsigned division, and panicking instead of preserving traps on division by zero) so I'm guessing this crate isn't seriously used anywhere. At the least, bjorn3 confirms that cg_clif doesn't use this, and I've verified that Wasmtime doesn't either. Closes #1090.	2023-01-26 21:32:33 +00:00
Trevor Elliott	7926808e8e	riscv64: improve unordered comparison generated code (#5636 ) Improve the generated code for unordered floating point comparisons by negating the comparison and inveritng the branches. This allows us to pick the unordered versions, which generate significantly better code.	2023-01-25 17:28:28 -08:00
Trevor Elliott	b58a197d33	cranelift: Add a conditional branch instruction with two targets (#5446 ) Add a conditional branch instruction with two targets: brif. This instruction will eventually replace brz and brnz, as it encompasses the behavior of both. This PR also changes the InstructionData layout for instruction formats that hold BlockCall values, taking the same approach we use for Value arguments. This allows branch_destination to return a slice to the BlockCall values held in the instruction, rather than requiring that we pattern match on InstructionData to fetch the then/else blocks. Function generation for fuzzing has been updated to generate uses of brif, and I've run the cranelift-fuzzgen target locally for hours without triggering any new failures.	2023-01-24 14:37:16 -08:00
Jamey Sharp	fef9f64d2c	x86: Test paired udiv/urem (#5573 ) Ideally these pairs of CLIF instructions should emit a single x86 instruction, but they don't today. This test will tell us if somebody fixes that. Similar tests might make sense for imul/umulhi as well as signed versions, but I haven't tried that.	2023-01-23 11:44:27 -08:00
yuyang	7e10bd1f58	fix issue #5497 #5524 #5526 . (#5595 ) * fix issue 5497. * fix issue 5524 * fix issue 5497 5524 5526. * some clif change because of reg alloc.	2023-01-20 14:06:26 -08:00
yuyang	299b8187f8	fix issue 5525. (#5603 ) * fix issue 5525. * reg alloc changed.	2023-01-20 09:53:54 -08:00
Chris Fallin	1faff8c2ce	Enable egraph-based optimization by default. (#5587 ) This PR follows up on #5382 and #5391, which rebuilt the egraph-based optimization framework to be more performant, by enabling it by default. Based on performance results in #5382 (my measurements on SpiderMonkey and bjorn3's independent confirmation with cg_clif), it seems that this is reasonable to enable. Now that we have been fuzzing compiler configurations with egraph opts (#5388) for 6 weeks, having fixed a few fuzzbugs that came up (#5409, #5420, #5438) and subsequently received no further reports from OSS-Fuzz, I believe it is stable enough to rely on. This PR enables `use_egraphs`, and also normalizes its meaning: previously it forced optimization (it basically meant "turn on the egraph optimization machinery"), now it runs egraph opts if the opt level indicates (it means "use egraphs to optimize if we are going to optimize"). The conditionals in the top-level pass driver are a little subtle, but will get simpler once we can remove the non-egraph path (which we plan to do eventually!). Fixes #5181.	2023-01-19 15:46:53 -08:00
Chris Fallin	704f5a5772	Cranelift/egraph mid-end: support merging effectful-but-idempotent ops (#5594 ) * Support mergeable-but-side-effectful (idempotent) operations in general in the egraph's GVN. This mirrors the similar change made in #5534. * Add tests for egraph case.	2023-01-19 11:51:19 -08:00
Kevin Rizzo	da03ff47f1	winch: Adding support for integration tests (#5588 ) * Adding in the foundations for Winch `filetests` This commit adds two new crates into the Winch workspace: `filetests` and `test-macros`. The intent is to mimic the structure of Cranelift `filetests`, but in a simpler way. * Updates to documentation This commits adds a high level document to outline how to test Winch through the `winch-tools` utility. It also updates some inline documentation which gets propagated to the CLI. * Updating test-macro to use a glob instead of only a flat directory	2023-01-19 07:34:48 -05:00
Afonso Bordado	3ae373b073	cranelift: Disable select rule for i128 types on riscv64 (#5584 ) * fuzzgen: Disable some selects for RISC-V * cranelift: Force disable gen_select_reg rule for i128 values	2023-01-17 10:01:23 -08:00
Afonso Bordado	82494661c1	cranelift: Add `atomic_{load,store}` and `fence` to the interpreter (#5503 ) * cranelift: Add `fence` to interpreter * cranelift: Add `atomic_{load,store}` to the interpreter * fuzzgen: Add `atomic_{load,store}` * Update cranelift/fuzzgen/src/function_generator.rs Co-authored-by: Jamey Sharp <jamey@minilop.net> * fuzzgen: Use type size as the alignment size. Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-01-12 08:36:04 -08:00
Afonso Bordado	9556cb190f	cranelift: Forbid argument extensions for floats and SIMD vectors (#5536 ) * fuzzgen: Generate argument extensions only for integer argumetns * cranelift: Add verifier check for argument extensions	2023-01-10 10:26:30 -08:00
Alexa VanHattum	44913825b5	cranelift: fix register for `srem.i8` on x86_64 (#5540 ) * Change register written to in specific srem case. Add regression test as filetest case. Fixes #5470 * Add another test case, newline * Update comment	2023-01-06 22:18:16 +00:00
Sam Sartor	1efa3d6f8b	Add `clif-util compile` option to output object file (#5493 ) * add clif-util compile option to output object file * switch from a box to a borrow * update objectmodule tests to use borrowed isa * put targetisa into an arc	2023-01-06 12:53:48 -08:00
uint256_t	b00455135e	Cranelift: Implement 'iabs' for scalar types on x86_64 (#5527 ) * Implement 'iabs' for scalar types on x86_64 * Small fix	2023-01-05 21:33:12 -08:00
Nick Fitzgerald	c50bdf600e	Cranelift: GVN all idempotently trapping but otherwise pure instructions (#5534 )	2023-01-05 15:08:06 -08:00
Afonso Bordado	ee6a909ccb	cranelift: Cleanup SIMD `icmp` tests (#5530 ) * cranelift: Enable more SIMD tests * cranelift: Reorganize icmp tests * cranelift: Enable SIMD icmp tests for unsigned ops * cranelift: Cleanup trailing newlines	2023-01-05 09:19:03 -08:00
Nick Fitzgerald	f4a2d5337a	Cranelift: GVN `uadd_overflow_trap` (#5520 ) * Switch duplicate loads w/ dynamic memories test to `min_size = 0` This test was accidentally hitting a special case for bounds checks for when we know that `offset + access_size < min_size` and can skip some steps. This commit changes the `min_size` of the memory to zero so that we are forced to do fully general bounds checks. * Cranelift: Mark `uadd_overflow_trap` as okay for GVN Although this improves our test sequence for duplicate loads with dynamic memories, it unfortunately doesn't have any effect on sightglass benchmarks: ``` instantiation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [34448 35607.23 37158] gvn_uadd_overflow_trap.so [34566 35734.05 36585] main.so instantiation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [44101 60449.62 92712] gvn_uadd_overflow_trap.so [44011 60436.37 92690] main.so instantiation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [35595 36675.72 38153] gvn_uadd_overflow_trap.so [35440 36670.42 37993] main.so compilation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [17370195 17405125.62 17471222] gvn_uadd_overflow_trap.so [17369324 17404859.43 17470725] main.so execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [7055720520 7055886880.32 7056265930] gvn_uadd_overflow_trap.so [7055719554 7055843809.33 7056193289] main.so compilation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [683589861 683767276.00 684098366] gvn_uadd_overflow_trap.so [683590024 683767998.02 684097885] main.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [46436883 46437135.10 46437823] gvn_uadd_overflow_trap.so [46436883 46437087.67 46437785] main.so compilation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [126522461 126565812.58 126647044] gvn_uadd_overflow_trap.so [126522176 126565757.75 126647522] main.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [653010531 653010533.03 653010544] gvn_uadd_overflow_trap.so [653010531 653010533.18 653010537] main.so ``` * cranelift-codegen-meta: Rename `side_effects_okay_for_gvn` to `side_effects_idempotent` * cranelift-filetests: Ensure there is a trailing newline for blessed Wasm tests	2023-01-04 22:03:16 -08:00
Nick Fitzgerald	937601c7c3	Cranelift: GVN spectre guards and run redundant load elimination twice (#5517 ) * Cranelift: Make spectre guards GVN-able While these instructions have a side effect that is otherwise invisible to the optimizer, the side effect in question is idempotent, so it can be de-duplicated by GVN. * Cranelift: Run redundant load replacement and GVN twice This allows us to actually replace redundant Wasm loads with dynamic memories. While this improves our hand-crafted test sequences, it doesn't seem to have any improvement on sightglass benchmarks run with dynamic memories, however it also isn't a hit to compilation times, so seems generally good to land anyways: ``` $ cargo run --release -- benchmark -e ~/scratch/once.so -e ~/scratch/twice.so -m insts-retired --processes 20 --iterations-per-process 3 --engine-flags="--static-memory-maximum-size 0" -- benchmarks/default.suite compilation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [683595240 683768610.53 684097577] once.so [683597068 700115966.83 1664907164] twice.so instantiation :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [44107 60411.07 92785] once.so [44138 59552.32 92097] twice.so compilation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [17369916 17404839.78 17471458] once.so [17369935 17625713.87 30700150] twice.so compilation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [126523640 126566170.80 126648265] once.so [126523076 127174580.30 163145149] twice.so instantiation :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [34569 35686.25 36513] once.so [34651 35749.97 36953] twice.so instantiation :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [35146 36639.10 37707] once.so [34472 36580.82 38431] twice.so execution :: instructions-retired :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [7055720115 7055841324.82 7056180024] once.so [7055717681 7055877095.85 7056225217] twice.so execution :: instructions-retired :: benchmarks/pulldown-cmark/benchmark.wasm No difference in performance. [46436881 46437081.28 46437691] once.so [46436883 46437127.68 46437766] twice.so execution :: instructions-retired :: benchmarks/bz2/benchmark.wasm No difference in performance. [653010530 653010533.27 653010539] once.so [653010531 653010532.95 653010538] twice.so ```	2023-01-04 20:05:43 +00:00
Trevor Elliott	b2d5afdf83	riscv64: Implement fcmp in ISLE (#5512 ) Rework the compilation of fcmp in the riscv64 backend to be in ISLE, removing the need for the dedicated Fcmp instruction. This change is motivated by #5500, which showed that the riscv64 backend was generating branch instructions in the middle of a basic block. We can't remove lower_br_fcmp quite yet as it's used in a few places in the emit module, but it's now no longer reachable from the ISLE lowerings. Fixes #5500	2023-01-04 11:52:00 -08:00
Nick Fitzgerald	d1920f5a2d	cranelift: Add wasm tests for duplicate loads (#5514 ) * cranelift-filetests: Add the ability to test optimized CLIF in Wasm tests * cranelift: Add Wasm tests for identical loads, back to back	2023-01-04 18:52:32 +00:00
Afonso Bordado	52ba72f341	riscv64: Fix masking on `iabs` (#5505 ) * cranelift: Add `iabs.i128` runtest * riscv64: Fix incorrect extension in iabs When lowering iabs, we were accidentally comparing the unextended value this caused the instruction to misbehave with certain top bits. This commit also adds a zbb lowering that does not use jumps.	2023-01-03 17:37:25 -08:00
Afonso Bordado	7e94704264	riscv64: Add masking for small types when lowering select (#5504 ) When lowering `select+icmp` we have an optimization that allows us to avoid materializing the icmp result. We were accidentally not masking the high bits for i8 and i16 in this case. Issue #5498 reported this as an illegal instruction but what was happening there was that the invalid select caused a division by zero.	2023-01-03 19:59:14 +00:00
Afonso Bordado	c9c7d4991c	riscv64: Fix br-table segfault with zero sized jump tables (#5508 ) We had a off-by-one bounds check error when checking if we should jump to the default block in a br-table. Instead of always jumping to the default block when we have a jump table with 0 targets we would try to compute an offset past the end of the table. This sometimes would not crash, but it would crash if the there was no block after the br_table, thus adding a cold block would cause a segfault. The actual fix is quite simple, do not count the default block as a jump table entry when computing the limits. This commit also does a bunch of cleanup and adding some comments to the br_table emission code.	2023-01-03 10:22:48 -08:00
Mrmaxmeier	fe992c2627	Cranelift: aarch64: lower umin.i64 and friends (#5495 ) * Cranelift: aarch64: lower umin.i64 and friends * fuzzgen: Enable integer-min/max for aarch64	2022-12-29 18:03:31 -08:00
Chris Fallin	03463458e4	Cranelift: fix branch-of-icmp/fcmp regression: look through `uextend`. (#5487 ) In #5031, we removed `bool` types from CLIF, using integers instead for "truthy" values. This greatly simplified the IR, and was generally an improvement. However, because x86's `SETcc` instruction sets only the low 8 bits of a register, we chose to use `i8` types as the result of `icmp` and `fcmp`, to avoid the need for a masking operation when materializing the result. Unfortunately this means that uses of truthy values often now have `uextend` operations, especially when coming from Wasm (where truthy values are naturally `i32`-typed). For example, where we previously had `(brz (icmp ...))`, we now have `(brz (uextend (icmp ...)))`. It's arguable whether or not we should switch to `i32` truthy values -- in most cases we can avoid materializing a value that's immediately used for a branch or select, so a mask would in most cases be unnecessary, and it would be a win at the IR level -- but irrespective of that, this change did regress our generated code quality: our backends had patterns for e.g. `(brz (icmp ...))` but not with the `uextend`, so we were always materializing truthy values. Many blocks thus ended with "cmp; setcc; cmp; test; branch" rather than "cmp; branch". In #5391 we noticed this and fixed it on x64, but it was a general problem on aarch64 and riscv64 as well. This PR introduces a `maybe_uextend` extractor that "looks through" uextends, and uses it where we consume truthy values, thus fixing the regression. This PR also adds compile filetests to ensure we don't regress again. The riscv64 backend has not been updated here because doing so appears to trigger another issue in its branch handling; fixing that is TBD.	2022-12-22 01:43:44 -08:00
Ayomide Bamidele	b47e644c3d	Remove vconcat and vsplit clif instructions (#5465 ) Fixes #5463. * remove vsplit instruction * remove vconcat instruction * remove unsused half/double vector helper functions * remove unused operand constraints * delete + inline Type::half_vector method	2022-12-20 00:41:55 +00:00
Ayomide Bamidele	93ae9078c5	Implement vsplit in cranelift interpreter (#5462 ) * Add vsplit testfile * Add vsplit implementation	2022-12-16 23:14:56 +00:00
Chris Fallin	22439f7b39	support select_spectre_guard and select on i128 conditions on all platforms. (#5460 ) Fixes #5199. Fixes #5200. Fixes #5452. Fixes #5453. On riscv64, there is apparently an autoconversion from `ValueRegs` to `Reg` that takes just the low register [0], and removing this conversion causes 48 errors. As a result of this, `select` with an `i128` condition was silently miscompiling, testing only the low 64 bits. We should remove this autoconversion to ensure we aren't missing any other silent truncations, but for now this PR just adds the explicit `I128` logic for `select` / `select_spectre_guard`. [0] `d9fdbfd50e/cranelift/codegen/src/isa/riscv64/inst.isle (L1762)`	2022-12-16 14:18:22 -08:00
Nick Fitzgerald	c0b587ac5f	Remove heaps from core Cranelift, push them into `cranelift-wasm` (#5386 ) * cranelift-wasm: translate Wasm loads into lower-level CLIF operations Rather than using `heap_{load,store,addr}`. * cranelift: Remove the `heap_{addr,load,store}` instructions These are now legalized in the `cranelift-wasm` frontend. * cranelift: Remove the `ir::Heap` entity from CLIF * Port basic memory operation tests to .wat filetests * Remove test for verifying CLIF heaps * Remove `heap_addr` from replace_branching_instructions_and_cfg_predecessors.clif test * Remove `heap_addr` from readonly.clif test * Remove `heap_addr` from `table_addr.clif` test * Remove `heap_addr` from the simd-fvpromote_low.clif test * Remove `heap_addr` from simd-fvdemote.clif test * Remove `heap_addr` from the load-op-store.clif test * Remove the CLIF heap runtest * Remove `heap_addr` from the global_value.clif test * Remove `heap_addr` from fpromote.clif runtests * Remove `heap_addr` from fdemote.clif runtests * Remove `heap_addr` from memory.clif parser test * Remove `heap_addr` from reject_load_readonly.clif test * Remove `heap_addr` from reject_load_notrap.clif test * Remove `heap_addr` from load_readonly_notrap.clif test * Remove `static-heap-without-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` to generating `.wat` tests. * Remove `static-heap-with-guard-pages.clif` test Will be subsumed when we port `make-heap-load-store-tests.sh` over to `.wat` tests. * Remove more heap tests These will be subsumed by porting `make-heap-load-store-tests.sh` over to `.wat` tests. * Remove `heap_addr` from `simple-alias.clif` test * Remove `heap_addr` from partial-redundancy.clif test * Remove `heap_addr` from multiple-blocks.clif test * Remove `heap_addr` from fence.clif test * Remove `heap_addr` from extends.clif test * Remove runtests that rely on heaps Heaps are not a thing in CLIF or the interpreter anymore * Add generated load/store `.wat` tests * Enable memory-related wasm features in `.wat` tests * Remove CLIF heap from fcmp-mem-bug.clif test * Add a mode for compiling `.wat` all the way to assembly in filetests * Also generate WAT to assembly tests in `make-load-store-tests.sh` * cargo fmt * Reinstate `f{de,pro}mote.clif` tests without the heap bits * Remove undefined doc link * Remove outdated SVG and dot file from docs * Add docs about `None` returns for base address computation helpers * Factor out `env.heap_access_spectre_mitigation()` to a local * Expand docs for `FuncEnvironment::heaps` trait method * Restore f{de,pro}mote+load clif runtests with stack memory	2022-12-15 00:26:45 +00:00
Nick Fitzgerald	be710df237	Cranelift: Add `.wat` to assembly test support and generate Wasm load/store tests for all ISAs (#5439 ) * cranelift-filetest: Add the ability to test `.wat` to assembly * Make the load/store test case generator script use `.wat` tests And generate tests that exercise both Wasm-to-CLIF lowering and Wasm all the way to assembly. * Remove old versions of generated load/store tests * Add new generated load/store tests * Fix filename reference in script	2022-12-14 21:13:43 +00:00
Trevor Elliott	9dc4f1a83c	s390x: Move the value out of the casloop_val_reg with mov_preg (#5430 ) The casloop_emit function in the s390x backend was using the fixed non-allocatable register %r0 directly with move instructions, which produced a panic in the regalloc2 checker (#5425). This PR changes the casloop_result function to use mov_preg instead of copy_reg to fetch the result, as it's not viewed by regalloc2 as a move. Fixes #5425	2022-12-14 13:06:35 -08:00
Chris Fallin	8383e4b6bd	egraph opt rules: do `(icmp cc x x) == {0,1}` only for integer types. (#5438 ) We could do these for vectors too, in theory, but for now let's fix the bug by applying the equivalence only for integer types. Fixes #5437.	2022-12-14 19:50:42 +00:00
Ulrich Weigand	df923f18ca	Remove MachInst::gen_constant (#5427 ) * aarch64: constant generation cleanup Add support for MOVZ and MOVN generation via ISLE. Handle f32const, f64const, and nop instructions via ISLE. No longer call Inst::gen_constant from lower.rs. * riscv64: constant generation cleanup Handle f32const, f64const, and nop instructions via ISLE. * s390x: constant generation cleanup Fix rule priorities for "imm" term. Only handle 32-bit stack offsets; no longer use load_constant64. * x64: constant generation cleanup No longer call Inst::gen_constant from lower.rs or abi.rs. * Refactor LowerBackend::lower to return InstOutput No longer write to the per-insn output registers; instead, return an InstOutput vector of temp registers holding the outputs. This will allow calling LowerBackend::lower multiple times for the same instruction, e.g. to rematerialize constants. When emitting the primary copy of the instruction during lowering, writing to the per-insn registers is now done in lower_clif_block. As a result, the ISLE lower_common routine is no longer needed. In addition, the InsnOutput type and all code related to it can be removed as well. * Refactor IsleContext to hold a LowerBackend reference Remove the "triple", "flags", and "isa_flags" fields that are copied from LowerBackend to each IsleContext, and instead just hold a reference to LowerBackend in IsleContext. This will allow calling LowerBackend::lower from within callbacks in src/machinst/isle.rs, e.g. to rematerialize constants. To avoid having to pass LowerBackend references through multiple functions, eliminate the lower_insn_to_regs subroutines in those targets that still have them, and just inline into the main lower routine. This also eliminates lower_inst.rs on aarch64 and riscv64. Replace all accesses to the removed IsleContext fields by going through the LowerBackend reference. * Remove MachInst::gen_constant This addresses the problem described in issue https://github.com/bytecodealliance/wasmtime/issues/4426 that targets currently have to duplicate code to emit constants between the ISLE logic and the gen_constant callback. After the various cleanups in earlier patches in this series, the only remaining user of get_constant is put_value_in_regs in Lower. This can now be removed, and instead constant rematerialization can be performed in the put_in_regs ISLE callback by simply directly calling LowerBackend::lower on the instruction defining the constant (using a different output register). Since the check for egraph mode is now no longer performed in put_value_in_regs, the Lower::flags member becomes obsolete. Care needs to be taken that other calls directly to the Lower::put_value_in_regs routine now handle the fact that no more rematerialization is performed. All such calls in target code already historically handle constants themselves. The remaining call site in the ISLE gen_call_common helper can be redirected to the ISLE put_in_regs callback. The existing target implementations of gen_constant are then unused and can be removed. (In some target there may still be further opportunities to remove duplication between ISLE and some local Rust code - this can be left to future patches.)	2022-12-13 13:00:04 -08:00
Chris Fallin	92ce79366c	riscv64: remove `valueregs_2_reg` extractor. (#5426 ) This extractor had a side-effect of invoking `put_in_regs`, which is not supposed to be invoked until the pattern-matching commits to evaluating a rule right-hand side (i.e., cannot backtrack). In this case the side-effect was mostly benign (in theory it could have caused additional values to be computed needlessly), but in general we should be careful to keep side-effects out of the left-hand side to enable further optimizations and work on islec. The implicit conversion from `Value` to `Reg` turns out to be enough to make the rules in question work, so we can simply remove the use of the extractor in this case.	2022-12-13 11:47:20 -08:00
Trevor Elliott	a5ecb5e647	x64: Share a zero in the ushr translation on x64 to free up a register (#5424 ) Share a zero value in the translation of ushr for i128. This increases the lifetime of the value by a few instructions, and reduces the number of registers used in the translation by one, which seems like an acceptable trade-off.	2022-12-12 18:15:43 -08:00
Chris Fallin	9397ea1abe	Cranelift: implement general select_spectre_guard fallbacks. (#5420 ) When adding some optimization rules for `icmp` in the egraph infrastructure, we ended up creating a path to legal CLIF but with patterns unsupported by three of our four backends: specifically, `select_spectre_guard` with a general truthy input, rather than an `icmp`. In #5206 we discussed replacing `select_spectre_guard` with something more specific, and that could still be a long-term solution here, but doing so now would interfere with ongoing refactoring of heap access lowering, so I've opted not to do so. (In that issue I was concerned about complexity and didn't see the need but with this fuzzbug I'm starting to feel a bit differently; maybe we should remove this non-orthogonal op in the long run.) Fixes #5417.	2022-12-12 17:13:34 -08:00
Nick Fitzgerald	f2e1eaa847	cranelift-filetest: Add support for Wasm-to-CLIF translation filetests (#5412 ) This adds support for `.wat` tests in `cranelift-filetest`. The test runner translates the WAT to Wasm and then uses `cranelift-wasm` to translate the Wasm to CLIF. These tests are always precise output tests. The test expectations can be updated by running tests with the `CRANELIFT_TEST_BLESS=1` environment variable set, similar to our compile precise output tests. The test's expected output is contained in the last comment in the test file. The tests allow for configuring the kinds of heaps used to implement Wasm linear memory via TOML in a `;;!` comment at the start of the test. To get ISA and Cranelift flags parsing available in the filetests crate, I had to move the `parse_sets_and_triple` helper from the `cranelift-tools` binary crate to the `cranelift-reader` crate, where I think it logically fits. Additionally, I had to make some more bits of `cranelift-wasm`'s dummy environment `pub` so that I could properly wrap and compose it with the environment used for the `.wat` tests. I don't think this is a big deal, but if we eventually want to clean this stuff up, we can probably remove the dummy environments completely, remove `translate_module`, and fold them into these new test environments and test runner (since Wasmtime isn't using those things anyways).	2022-12-12 19:31:29 +00:00
Chris Fallin	244dce93f6	Fix optimization rules for narrow types: wrap i8 results to 8 bits. (#5409 ) * Fix optimization rules for narrow types: wrap i8 results to 8 bits. This fixes #5405. In the egraph mid-end's optimization rules, we were rewriting e.g. imuls of two iconsts to an iconst of the result, but without masking off the high bits (beyond the result type's width). This was producing iconsts with set high bits beyond their types' width, which is not legal. In addition, this PR adds some optimizations to the algebraic rules to recognize e.g. `x == x` (and all other integer comparison operators) and resolve to 1 or 0 as appropriate. * Review feedback. * Review feedback, again.	2022-12-09 22:29:25 +00:00
Ulrich Weigand	e913cf3647	Remove IFLAGS/FFLAGS types (#5406 ) All instructions using the CPU flags types (IFLAGS/FFLAGS) were already removed. This patch completes the cleanup by removing all remaining instructions that define values of CPU flags types, as well as the types themselves. Specifically, the following features are removed: - The IFLAGS and FFLAGS types and the SpecialType category. - Special handling of IFLAGS and FFLAGS in machinst/isle.rs and machinst/lower.rs. - The ifcmp, ifcmp_imm, ffcmp, iadd_ifcin, iadd_ifcout, iadd_ifcarry, isub_ifbin, isub_ifbout, and isub_ifborrow instructions. - The writes_cpu_flags instruction property. - The flags verifier pass. - Flags handling in the interpreter. All of these features are currently unused; no functional change intended by this patch. This addresses https://github.com/bytecodealliance/wasmtime/issues/3249.	2022-12-09 13:42:03 -08:00
Jamey Sharp	8bbd9bb228	aarch64: Test instruction selection for `bmask` (#5396 ) I copied the `bmask` precise-output tests from x64 and used CRANELIFT_TEST_BLESS=1 to generate this test. I don't know aarch64 well enough to decide if this output is correct. However, for I128 it is identical to the previous I128-only precise-output tests, and the existing runtests for bmask pass on aarch64, so I think it's likely correct.	2022-12-08 10:22:23 -08:00
Trevor Elliott	c5379051c4	Enable the ssa verifier in debug builds (#5354 ) Enable regalloc2's SSA verifier in debug builds to check for any outstanding reuse of virtual registers in def constraints. As fuzzing enables debug_assertions, this will enable the SSA verifier when fuzzing as well.	2022-12-07 12:22:51 -08:00
Nick Fitzgerald	f0c4b6f3a1	Cranelift: Implement `iadd_cout` on x64 for 32- and 64-bit integers (#5285 ) * Split the `iadd_cout` runtests by type * Implement `iadd_cout` for 32- and 64-bit values on x64 * Delete trailing whitespace in `riscv/lower.isle`	2022-12-07 19:54:14 +00:00

... 2 3 4 5 6 ...

1525 Commits