wasmtime

Author	SHA1	Message	Date
Jamey Sharp	6cf7155052	Cranelift: Generalize `(x << k) >> k` optimization (#5746 ) * Generalize unsigned `(x << k) >> k` optimization Split the existing rule into three parts: - A dual of the rule for `(x >> k) << k` that is only valid for unsigned shifts. - Known-bits analysis for `(band (uextend x) k)`. - A new rule for converting `sextend` to `uextend` if the sign-extended bits are masked out anyway. The first two together cover the existing rule. * Generalize signed `(x << k) >> k` optimization * Review comments * Generalize sign-extending shifts further The shifts can be eliminated even if the shift amount isn't exactly equal to the difference in bit-widths between the narrow and wide types. * Add filetests	2023-02-27 17:34:46 +00:00
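The identities behind the `(x << k) >> k` rules can be checked on plain Rust integers. The following is a minimal sketch, not Cranelift's ISLE rules; the function names are hypothetical and `k` is assumed to be smaller than the bit width (32 here).

```rust
/// Unsigned case: shifting left then logically right by `k` clears the top
/// `k` bits. Assumes `k < 32`.
fn shl_then_lshr(x: u32, k: u32) -> u32 {
    (x << k) >> k
}

/// The same result expressed as a mask that keeps the low `32 - k` bits.
fn mask_low_bits(x: u32, k: u32) -> u32 {
    x & (u32::MAX >> k)
}

/// Signed case: shifting left then arithmetically right by `k` sign-extends
/// the low `32 - k` bits. Assumes `k < 32`.
fn shl_then_ashr(x: i32, k: u32) -> i32 {
    (x << k) >> k
}

/// For `k = 24` the signed form is a truncate-to-i8 followed by a sign-extend.
fn sext_low_byte(x: i32) -> i32 {
    x as i8 as i32
}
```

For any `x`, `shl_then_lshr(x, k)` and `mask_low_bits(x, k)` agree, and `shl_then_ashr(x, 24)` agrees with `sext_low_byte(x)`.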
yuyang	3864286596	fix issue 5714. (#5845 ) * fix issue 5714. * add target for regression test. * remove x86_64 test because of not implemented.	2023-02-26 16:25:38 +00:00
Jan-Justin van Tonder	66cb13cb4b	cranelift: Add atomic_cas to interpreter (#5875 ) As per issue #5818, atomic_cas was implemented without specific regard for thread safety.	2023-02-25 14:36:49 +00:00
Afonso Bordado	e9095050be	cranelift-interpreter: Implement `call_indirect` and `return_call_indirect` (#5877 ) * cranelift-interpreter: Implement `call_indirect` * cranelift: Fix typo * riscv64: Enable `call_indirect` tests	2023-02-25 13:16:59 +00:00
Afonso Bordado	36e92add6f	riscv64: Move `is_null`/`is_invalid` to ISLE (#5874 ) * riscv64: Move `is_null`/`is_invalid` to ISLE * riscv64: Fix `is_invalid` codegen * Implement review suggestions Thanks! Co-authored-by: Jamey Sharp <jamey@minilop.net> --------- Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-02-25 12:48:44 +00:00
Jamey Sharp	5cfb461945	Only emit ISLE/egraph terms for single-value insts (#5848 ) For instructions with no results (such as branches and stores) or instructions with multiple results (such as add with carry), we have assertions checking that an optimization rule doesn't try to match on or construct such instructions. When we generate terms for matching or constructing instructions, the terms for these instructions are guaranteed to panic if they're ever used. So let's just not generate them. In the future we may wish to generate terms with different types for these instructions, to make them usable in ISLE rules for optimization that fall outside our current egraph constraints.	2023-02-24 15:38:48 +00:00
Jamey Sharp	7d790fcdfe	x64: Only branch once in br_table (#5850 ) This uses the `cmov`, which was previously necessary for Spectre mitigation, to clamp the table index instead of zeroing it. By then placing the default target as the last entry in the table, we can use just one branch instruction in all cases. Since there isn't a bounds-check branch any more, this sequence no longer needs Spectre mitigation. And since we don't need to be careful about preserving flags, half the instructions can be removed from this pseudoinstruction and emitted as regular instructions instead. This is a net savings of three bytes in the encoding of x64's br_table pseudoinstruction. The generated code can sometimes be longer overall because the blocks are emitted in a slightly different order. My benchmark results show a very small effect on runtime performance with this change. The spidermonkey benchmark in Sightglass runs "1.01x faster" than main by instructions retired, but with no significant difference in CPU cycles. I think that means it rarely hit the default case in any br_table instructions it executed. The pulldown-cmark benchmark in Sightglass runs "1.01x faster" than main by CPU cycles, but main runs "1.00x faster" by instructions retired. I think that means this benchmark hit the default case a significant amount of the time, so it executes a few more instructions per br_table, but maybe the branches were predicted better.	2023-02-24 04:46:38 +00:00
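The clamping idea in this commit can be modeled in a few lines of Rust. This is only an illustrative sketch of the dispatch logic, not the emitted x64 sequence; `select_target` and the representation of targets as block numbers are assumptions made for the example.

```rust
/// Hypothetical model of a jump table whose default target is stored as the
/// last entry. The slice is assumed to be non-empty.
fn select_target(index: u32, targets_with_default_last: &[u32]) -> u32 {
    // Clamp any out-of-range index to the last slot, which holds the default
    // target. No separate bounds-check branch is needed.
    let last = targets_with_default_last.len() - 1;
    let clamped = (index as usize).min(last);
    targets_with_default_last[clamped]
}
```

With the table `[10, 20, 30, 99]`, indices 0 through 2 select their own entries and every larger index selects 99, the default.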
Trevor Elliott	c5d9d5b10f	Remove module-level code generation tests (#5870 ) * Remove module-level code generation tests * Add cold block tests for each backend * Better cold block tests	2023-02-24 01:19:26 +00:00
Alex Crichton	3fc3bc9ec8	x64: Fill out more AVX instructions (#5849 ) * x64: Fill out more AVX instructions This commit fills out more AVX instructions for SSE counterparts currently used. Many of these instructions do not benefit from the 3-operand form that AVX uses but instead benefit from being able to use `XmmMem` instead of `XmmMemAligned` which may be able to avoid some extra temporary registers in some cases. * Review comments	2023-02-23 22:31:31 +00:00
Trevor Elliott	8abfe928d6	Reuse the DominatorTree postorder traversal in BlockLoweringOrder (#5843 ) * Rework the blockorder module to reuse the dom tree's cfg postorder * Update domtree tests * Treat br_table with an empty jump table as multiple block exits * Bless tests * Change branch_idx to succ_idx and fix the comment	2023-02-23 22:05:20 +00:00
Ulrich Weigand	4314210162	s390x: Fix implementation of {s,u}{min,max} (#5864 ) When expanding a min/max operation to a pair of icmp + select, do not attempt to expand the input value operands twice, as this might fail with memory operands. Fixes https://github.com/bytecodealliance/wasmtime/issues/5859.	2023-02-23 20:01:51 +00:00
Afonso Bordado	fc080c739e	fuzzgen: Add `AtomicRMW` (#5861 )	2023-02-23 18:34:28 +00:00
Ulrich Weigand	9719147f91	s390x: Fix integer overflow during negation (#5866 ) Use wrapping_neg in i{64,32,16}_from_negated_value to avoid Rust aborts due to integer overflow. The resulting INT_MIN is already handled correctly in subsequent operations. Fixes https://github.com/bytecodealliance/wasmtime/issues/5863.	2023-02-23 16:32:10 +00:00
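The overflow in question is the negation of the minimum value of a signed type, which has no positive counterpart. A minimal sketch using standard library methods (the wrapper names are hypothetical, not the s390x helpers):

```rust
/// Checked negation reports the overflow case as `None`.
fn negate_checked(x: i64) -> Option<i64> {
    x.checked_neg()
}

/// Wrapping negation never panics; negating `i64::MIN` yields `i64::MIN`.
fn negate_wrapping(x: i64) -> i64 {
    x.wrapping_neg()
}
```

`negate_checked(i64::MIN)` returns `None`, while `negate_wrapping(i64::MIN)` returns `i64::MIN`, the value the commit says later operations already handle.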
Jan-Justin van Tonder	0521155896	cranelift: Add atomic_rmw to interpreter (#5817 ) (#5856 ) As per the linked issue, atomic_rmw was implemented without specific regard for thread safety. Additionally, the relevant filetest (atomic-rmw-little.clif) was enabled and altered to fix an incorrect call to test function `%atomic_rmw_and_i64` after setting up test function `%atomic_rmw_and_i32`.	2023-02-23 10:24:56 +00:00
Afonso Bordado	f6c6bc2155	riscv64: Improve signed and zero extend codegen (#5844 ) * riscv64: Remove unused code * riscv64: Group extend rules * riscv64: Remove more unused rules * riscv64: Cleanup existing extension rules * riscv64: Move the existing Extend rules to ISLE * riscv64: Use `sext.w` when extending * riscv64: Remove duplicate extend tests * riscv64: Use `zbb` instructions when extending values * riscv64: Use `zbkb` extensions when zero extending * riscv64: Enable additional tests for extend i128 * riscv64: Fix formatting for `Inst::Extend` * riscv64: Reverse register for pack * riscv64: Misc Cleanups * riscv64: Cleanup extend rules	2023-02-22 17:41:14 +00:00
Afonso Bordado	6e6a1034d7	riscv64: Add bitmanip extension flags (#5847 )	2023-02-21 22:12:44 +00:00
Alex Crichton	bd3dcd313d	x64: Add more `fma` instruction lowerings (#5846 ) The relaxed-simd proposal for WebAssembly adds a fused-multiply-add operation for `v128` types so I was poking around at Cranelift's existing support for its `fma` instruction. I was also poking around at the x86_64 ISA's offerings for the FMA operation and ended up with this PR that improves the lowering of the `fma` instruction on the x64 backend in a number of ways: * A libcall-based fallback is now provided for `f32x4` and `f64x2` types in preparation for eventual support of the relaxed-simd proposal. These encodings are horribly slow, but it's expected that if FMA semantics must be guaranteed then it's the best that can be done without the `fma` feature. Otherwise it'll be up to producers (e.g. Wasmtime embedders) whether wasm-level FMA operations should be FMA or multiply-then-add. * In addition to the existing `vfmadd213` instructions opcodes were added for `vfmadd132`. The `132` variant is selected based on which argument can have a sinkable load. * Any argument in the `fma` CLIF instruction can now have a `sinkable_load` and it'll generate a single FMA instruction. * All `vfnmadd*` opcodes were added as well. These are pattern-matched where one of the arguments to the CLIF instruction is an `fneg`. I opted to not add a new CLIF instruction here since it seemed like pattern matching was easy enough but I'm also not intimately familiar with the semantics here so if that's the preferred approach I can do that too.	2023-02-21 20:51:22 +00:00
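The difference between guaranteed FMA semantics and multiply-then-add is a single rounding step. The sketch below uses Rust's standard `f32::mul_add`; the wrapper names are hypothetical and the example says nothing about how Cranelift lowers the instruction.

```rust
/// Fused multiply-add: `a * b + c` computed with one rounding at the end.
fn fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Multiply-then-add: the product is rounded to f32 before the addition.
fn unfused(a: f32, b: f32, c: f32) -> f32 {
    let product = a * b;
    product + c
}
```

With `x = 1.0 + f32::EPSILON` and `c = -(1.0 + 2.0 * f32::EPSILON)`, the exact product `x * x` has a low-order term of `EPSILON * EPSILON` that is rounded away in `unfused(x, x, c)` but survives in `fused(x, x, c)`.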
Alex Crichton	d82ebcc102	x64: Enable load-coalescing for SSE/AVX instructions (#5841 ) * x64: Enable load-coalescing for SSE/AVX instructions This commit unlocks the ability to fold loads into operands of SSE and AVX instructions. This is beneficial for both function size when it happens in addition to being able to reduce register pressure. Previously this was not done because most SSE instructions require memory to be aligned. AVX instructions, however, do not have alignment requirements. The solution implemented here is one recommended by Chris which is to add a new `XmmMemAligned` newtype wrapper around `XmmMem`. All SSE instructions are now annotated as requiring an `XmmMemAligned` operand except for a few new instruction styles used specifically for instructions that don't require alignment (e.g. `movdqu`, `sd`, and `ss` instructions). All existing instruction helpers continue to take `XmmMem`, however. This way if an AVX lowering is chosen it can be used as-is. If an SSE lowering is chosen, however, then an automatic conversion from `XmmMem` to `XmmMemAligned` kicks in. This automatic conversion only fails for unaligned addresses in which case a load instruction is emitted and the operand becomes a temporary register instead. A number of prior `Xmm` arguments have now been converted to `XmmMem` as well. One change from this commit is that loading an unaligned operand for an SSE instruction previously would use the "correct type" of load, e.g. `movups` for f32x4 or `movupd` for f64x2, but now the loading happens in a context without type information so the `movdqu` instruction is generated. According to [this stack overflow question][question] it looks like modern processors won't penalize this "wrong" choice of type when the operand is then used for f32 or f64 oriented instructions. Finally this commit improves some reuse of logic in the `put_in__mem` helper to share code with `sinkable_load` and avoid duplication. 
With this in place some various ISLE rules have been updated as well. In the tests it can be seen that AVX-instructions are now automatically load-coalesced and use memory operands in a few cases. [question]: https://stackoverflow.com/questions/40854819/is-there-any-situation-where-using-movdqu-and-movupd-is-better-than-movups * Fix tests * Fix move-and-extend to be unaligned These don't have alignment requirements like other xmm instructions as well. Additionally add some ISA tests to ensure that their output is tested. * Review comments	2023-02-21 19:10:19 +00:00
Alex Crichton	c65de1f1b1	x64: Remove conditional `SseOpcode::uses_src1` (#5842 ) This is a follow-up to comments in #5795 to remove some cruft in the x64 instruction model to ensure that the shape of an `Inst` reflects what's going to happen in regalloc and encoding. This accessor was used to handle `round`, `pextr`, and `pshufb` instructions. The `round` ones had already moved to the appropriate `XmmUnary` variant and `pshufb` was additionally moved over to that variant as well. The `pextr*` instructions got a new `Inst` variant and additionally had their constructors slightly modified to no longer require the type as input. The encoding for these instructions now automatically handles the various type-related operands through a new `SseOpcode::Pextrq` operand to represent 64-bit movements.	2023-02-21 18:17:07 +00:00
Alex Crichton	e6a5ec3fde	x64: Tidy up some handling of sinkable loads (#5840 ) This commit refactors a bit about how sinkable loads are handled in the x64 backend. The intention is to bring most handling around sinkable loads up to date with the current state of the backend since things have changed since these were originally introduced, namely automatic conversions between types in ISLE. For example the `Value` type can be automatically converted to `RegMem` to perform load sinking, but some rules are still explicitly doing matching themselves. Here I've removed explicit handling of immediates and sinkable loads when they're the right-hand-side of an operation. These cases are already handled by the "base case" when converting a `Value` to a `RegMemImm`. Instead only rules explicitly for left-hand-side immediates and sinkable loads remain. This helps cut down on the number of explicit rules needed. Additionally in the same manner that `Value` can be automatically converted to `RegMem` I've added automatic conversions from `SinkableLoad` to `RegMem` and the various other newtypes. This helps cut down a bit on rule verbosity where `sink_load_*` is largely no longer necessary.	2023-02-21 18:15:08 +00:00
Afonso Bordado	0f51338def	riscv64: Clear the top 32bits in the `br_table` index (#5831 ) We were unintentionally relying on these to be zeroed when jumping.	2023-02-21 18:05:51 +00:00
Alex Crichton	c26a65a854	x64: Add most remaining AVX lowerings (#5819 ) * x64: Add most remaining AVX lowerings This commit goes through `inst.isle` and adds a corresponding AVX lowering for most SSE lowerings. I opted to skip instructions where the SSE lowering didn't read/modify a register, such as `roundps`. I think that AVX will benefit these instructions when there's load-merging since AVX doesn't require alignment, but I've deferred that work to a future PR. Otherwise though in this PR I think all (or almost all) of the 3-operand forms of AVX instructions are supported with their SSE counterparts. This should ideally improve codegen slightly by removing register pressure and the need for `movdqa` between registers. I've attempted to ensure that there's at least one codegen test for all the new instructions. As a side note, the recent capstone integration into `precise-output` tests helped me catch a number of encoding bugs much earlier than otherwise, so I've found that incredibly useful in tests! * Move `vpinsr` instructions to their own variant Use true `XmmMem` and `GprMem` types in the instruction as well to get more type-level safety for what goes where. Remove `Inst::produces_const` accessor Instead of conditionally defining regalloc and various other operations instead add dedicated `MInst` variants for operations which are intended to produce a constant to have more clear interactions with regalloc and printing and such. * Fix tests * Register traps in `MachBuffer` for load-folding ops This adds a missing `add_trap` to encoding of VEX instructions with memory operands to ensure that if they cause a segfault that there's appropriate metadata for Wasmtime to understand that the instruction could in fact trap. This fixes a fuzz test case found locally where v8 trapped and Wasmtime didn't catch the signal and crashed the fuzzer.	2023-02-20 15:11:52 +00:00
Afonso Bordado	1e6c94bec1	cranelift-object: Make sections read only by default (#5619 ) This changes the default section type to be `ReadOnlyDataWithRel` instead of `Data`. On COFF types the CRT initializers do not run unless their section is read only. The new SectionKind makes these sections read only for COFF and MachO, but leaves it as Writable as required by ELF.	2023-02-18 12:23:24 +00:00
Afonso Bordado	853ff787f3	fuzzgen: Refactor name and signature generation (#5764 ) * fuzzgen: Move cranelift type generation into CraneliftArbitrary * fuzzgen: Deduplicate DataValue generation * fuzzgen: Remove unused code * fuzzgen: Pass allowed function calls into `FunctionGenerator`	2023-02-17 20:48:12 +00:00
Afonso Bordado	a7bd65d116	fuzzgen: Allow inline stackprobes for riscv64 (#5822 )	2023-02-17 20:47:39 +00:00
Trevor Elliott	a139ed6d56	Fix the postorder traversal in the DominatorTree (#5821 ) Fix the postorder traversal computed by the `DominatorTree`. It was recording nodes in the wrong order depending on the order child nodes were visited. Consider the following program: ``` function %foo2(i8) -> i8 { block0(v0: i8): brif v0, block1, block2 block1: return v0 block2: jump block1 } ``` The postorder produced by the previous implementation was: ``` block2 block1 block0 ``` Which is incorrect, as `block1` is branched to by `block2`. Changing the branch order in the function would also change the postorder result, yielding the expected order with `block1` emitted first. The problem was that when pushing successor nodes onto the stack, the old implementation would also mark them SEEN. This would then prevent them from being pushed on the stack again in the future, which is incorrect as they might be visited by other nodes that have not yet been pushed. This causes nodes to potentially show up later in the postorder traversal than they should. This PR reworks the implementation of `DominatorTree::compute` to produce an order where `block1` is always returned first, regardless of the branch order in the original program. Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2023-02-17 20:39:04 +00:00
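The property described in this commit, a postorder that does not depend on successor order for the example program, can be illustrated with a small iterative depth-first traversal. This is a sketch under stated assumptions, not the code in `DominatorTree::compute`: blocks are plain indices, `succs` is a hypothetical adjacency list, and a node is marked seen only at the moment it is entered rather than when all of a parent's successors are pushed at once.

```rust
/// Iterative DFS postorder over a control-flow graph given as an adjacency
/// list. `succs[n]` lists the successors of block `n`.
fn postorder(succs: &[Vec<usize>], entry: usize) -> Vec<usize> {
    let mut seen = vec![false; succs.len()];
    let mut order = Vec::with_capacity(succs.len());
    // Each stack frame is (block, index of the next successor to examine).
    let mut stack = vec![(entry, 0usize)];
    seen[entry] = true;
    while let Some(frame) = stack.last_mut() {
        let (node, next) = *frame;
        if next < succs[node].len() {
            // Advance this frame, then enter the successor if it is new.
            frame.1 += 1;
            let succ = succs[node][next];
            if !seen[succ] {
                seen[succ] = true;
                stack.push((succ, 0));
            }
        } else {
            // All successors handled: the block is finished.
            order.push(node);
            stack.pop();
        }
    }
    order
}
```

For the program above (`block0` branches to `block1` and `block2`, `block2` jumps to `block1`), both `[[1, 2], [], [1]]` and the reversed branch order `[[2, 1], [], [1]]` produce `block1` first.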
Berkus Decker	c8fa1b845f	Fix typo (#5814 )	2023-02-17 15:08:07 +00:00
Alex Crichton	453330b2db	x64: Add rudimentary support for some AVX instructions (#5795 ) * x64: Add rudimentary support for some AVX instructions I was poking around Spidermonkey's wasm backend and saw that the various assembler functions used are all `v`-prefixed which look like they're intended for use with AVX instructions. I looked at Cranelift and it currently doesn't have support for many AVX-based instructions, so I figured I'd take a crack at it! The support added here is a bit of a mishmash when viewed alone, but my general goal was to take a single instruction from the SIMD proposal for WebAssembly and migrate all of its component instructions to AVX. I, by random chance, picked a pretty complicated instruction of `f32x4.min`. This wasm instruction is implemented on x64 with 4 unique SSE instructions and ended up being a pretty good candidate. Further digging about AVX-vs-SSE shows that there should be two major benefits to using AVX over SSE: Primarily AVX instructions largely use a three-operand form where two input registers are operated with and an output register is also specified. This is in contrast to SSE's predominant one-register-is-input-but-also-output pattern. This should help free up the register allocator a bit and additionally remove the need for movement between registers. * As #4767 notes the memory-based operations of VEX-encoded instructions (aka AVX instructions) do not have strict alignment requirements which means we would be able to sink loads and stores into individual instructions instead of having separate instructions. So I set out on my journey to implement the instructions used by `f32x4.min`. The first few were fairly easy. The machinst backends are already of the shape "take these inputs and compute the output" where the x86 requirement of a register being both input and output is postprocessed in. This means that the `inst.isle` creation helpers for SSE instructions were already of the correct form to use AVX. 
I chose to add new `rule` branches for the instruction creation helpers, for example `x64_andnps`. The new `rule` conditionally only runs if AVX is enabled and emits an AVX instruction instead of an SSE instruction for achieving the same goal. This means that no lowerings of clif instructions were modified, instead just new instructions are being generated. The VEX encoding was previously not heavily used in Cranelift. The only current users are the FMA-style instructions that Cranelift has at this time. These FMA instructions have one more operand than `vandnps`, for example, so I split the existing `XmmRmRVex` into a few more variants to fit the shape of the instructions that needed generating for `f32x4.min`. This was accompanied then with more AVX opcode definitions, more emission support, etc. Upon implementing all of this it turned out that the test suite was failing on my machine due to the memory-operand encodings of VEX instructions not being supported. I didn't explicitly add those in myself but some preexisting RIP-relative addressing was leaking into the new instructions with existing tests. I opted to go ahead and fill out the memory addressing modes of VEX encoding to get the tests passing again. All-in-all this PR adds new instructions to the x64 backend for a number of AVX instructions, updates 5 existing instruction producers to use AVX instructions conditionally, implements VEX memory operands, and adds some simple tests for the new output of `f32x4.min`. The existing runtest for `f32x4.min` caught a few intermediate bugs along the way and I additionally added a plain `target x86_64` to that runtest to ensure that it executes with and without AVX to test the various lowerings. I'll also note that this, and future support, should be well-fuzzed through Wasmtime's fuzzing which may explicitly disable AVX support despite the machine having access to AVX, so non-AVX lowerings should be well-tested into the future. 
It's also worth mentioning that I am not an AVX or VEX or x64 expert. Implementing the memory operand part for VEX was the hardest part of this PR and while I think it should be good someone else should definitely double-check me. Additionally I haven't added many instructions to the x64 backend yet so I may have missed obvious places to test or such, so am happy to follow-up with anything to be more thorough if necessary. Finally I should note that this is just the tip of the iceberg when it comes to AVX. My hope is to get some of the idioms sorted out to make it easier for future PRs to add one-off instruction lowerings or such. * Review feedback	2023-02-17 01:29:55 +00:00
Trevor Elliott	d711872d63	Refactor collect_branches_and_targets to not need a smallvec (#5803 ) * Refactor collect_branches_and_targets to not need a smallvec Basic blocks are terminated by at most one branch instruction now, so we can use that assumption in `collect_branches_and_targets` to return the last instruction we saw instead. * Review comments	2023-02-16 21:30:17 +00:00
Chris Fallin	c7e2571866	egraphs: disable GVN of effectful idempotent ops (temporarily). (#5808 ) This is a short-term fix to the same bug that #5800 is addressing (#5796), but with less risk: it simply turns off GVN'ing of effectful but idempotent ops. Because we have an upcoming release, and this is a miscompile (albeit to do with trapping behavior), we would like to make the simplest possible fix that avoids the bug, and backport it. I will then rebase #5800 on top of a revert of this followed by the more complete fix.	2023-02-16 21:29:03 +00:00
Alex Crichton	cae3b26623	x64: Improve codegen for vectors with constant shift amounts (#5797 ) I stumbled across this working on #5795 and figured this was a nice opportunity to improve the codegen here.	2023-02-16 20:47:59 +00:00
Trevor Elliott	80c147d9c0	Rework br_table to use BlockCall (#5731 ) Rework br_table to use BlockCall, allowing us to avoid adding new nodes during ssa construction to hold block arguments. Additionally, many places where we previously matched on InstructionData to extract branch destinations can be replaced with a use of branch_destination or branch_destination_mut.	2023-02-16 09:23:27 -08:00
Chris Fallin	c15c4ed23d	Cranelift: upgrade to regalloc2 0.6.1. (#5799 ) * Cranelift: upgrade to regalloc2 0.6.1. Fixes #5791 by pulling in bytecodealliance/regalloc2#113. * Add cargo-vet entry for regalloc2 0.6.1.	2023-02-16 03:22:58 +00:00
Trevor Elliott	cc073593a4	Fix block label printing in precise-output tests (#5798 ) As a follow-up to #5780, disassemble the regions identified by bb_starts, falling back on disassembling the whole buffer. This ensures that instructions like br_table that introduce a lot of constants don't throw off capstone for the remainder of the function. --------- Co-authored-by: Jamey Sharp <jamey@minilop.net>	2023-02-16 02:35:26 +00:00
Trevor Elliott	f04decc4a1	Use capstone to validate precise-output tests (#5780 ) Use the capstone library to disassemble precise-output tests, in addition to pretty-printing their vcode.	2023-02-15 16:35:10 -08:00
Afonso Bordado	eabd43a178	aarch64: Support GOT Relative relocations in PIC mode (#5550 ) * cranelift: Add `adrp` encoding to AArch64 backend * cranelift: Support GOT Symbol References in AArch64 * cranelift: Add MachO GOT relocations * cranelift: Do not mark the GOT PageOffset12 MachO relocation as relative	2023-02-15 15:19:18 -08:00
Trevor Elliott	aba239e9b8	Fix handling of jumps in bugpoint (#5794 ) Fixes #5792	2023-02-15 15:07:03 -08:00
Afonso Bordado	76539ef9f2	cranelift: Optimize `select+icmp` into `{s,u}{min,max}` (#5546 ) * cranelift: Optimize `select+icmp` into `{s,u}{min,max}` * cranelift: Add generic egraph icmp reverse rule * cranelift: Optimize `vselect+icmp` into `{s,u}{min,max}` * cranelift: Optimize some `vselect+fcmp` into `f{min,max}_pseudo` * cranelift: Add inverted forms of min/max rules	2023-02-15 15:06:21 -08:00
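The scalar integer case of this rewrite rests on a simple equivalence: selecting between two values on the result of comparing them is a min or max. A minimal Rust sketch (function names are hypothetical; the floating-point `f{min,max}_pseudo` cases are not modeled here):

```rust
/// `select (icmp slt x, y), x, y` written out by hand.
fn select_slt(x: i32, y: i32) -> i32 {
    if x < y { x } else { y }
}

/// The equivalent signed minimum.
fn smin(x: i32, y: i32) -> i32 {
    x.min(y)
}

/// `select (icmp ugt x, y), x, y` written out by hand.
fn select_ugt(x: u32, y: u32) -> u32 {
    if x > y { x } else { y }
}

/// The equivalent unsigned maximum.
fn umax(x: u32, y: u32) -> u32 {
    x.max(y)
}
```

The signedness of the comparison decides which of the four operations applies: the bit pattern `0xFFFF_FFFF` is the smallest `i32` operand in `smin(-1, 1)` but the largest `u32` operand in `umax(0xFFFF_FFFF, 1)`.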
Trevor Elliott	f0137c2618	x64: Fix the formatting for `andn` (#5789 ) * Print AluRmRVex instructions with the destination last * Update andn tests	2023-02-15 11:16:59 -08:00
Alex Crichton	0037b71b11	Use xmm_rm_r more frequently in x64 backend (#5787 ) This updates the signatures of the `xmm_rm_r` helper function and then updates existing users and migrates other users to the helper now that the type information is no longer required.	2023-02-15 09:03:19 -08:00
Ulrich Weigand	305000d14b	s390x: Fix instruction encoding and disassembly format bugs (#5786 ) - Fix encoding of the AHY instruction. - Fix disassembly format of FIEBR, FIDBR, and LEDBRA instructions.	2023-02-15 08:36:44 -08:00
Ulrich Weigand	e10094dcd6	s390x: Support scalar min/max clif instructions (#5762 ) We don't have ISA instructions for that, so simply expand them to icmp + select. Also enable fuzzing for those clif instructions now.	2023-02-15 11:45:09 +00:00
Alphyr	cb150d37ce	Update dependencies (#5513 )	2023-02-14 19:45:15 +00:00
Nick Fitzgerald	6df3bbbe60	Cranelift: Collapse double extends into a single extend (#5772 )	2023-02-13 22:43:17 +00:00
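Collapsing two extends of the same kind into one is sound because widening in two steps produces the same value as widening directly. A small Rust sketch of the idea, with hypothetical function names and fixed 8/16/32-bit widths chosen for illustration:

```rust
/// Zero-extend u8 to u16, then u16 to u32.
fn uextend_twice(x: u8) -> u32 {
    (x as u16) as u32
}

/// Zero-extend u8 directly to u32.
fn uextend_once(x: u8) -> u32 {
    x as u32
}

/// Sign-extend i8 to i16, then i16 to i32.
fn sextend_twice(x: i8) -> i32 {
    (x as i16) as i32
}

/// Sign-extend i8 directly to i32.
fn sextend_once(x: i8) -> i32 {
    x as i32
}
```

Each `*_twice` function returns the same value as its `*_once` counterpart for every input.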
Trevor Elliott	19f337e29b	Move the default block to the front of the underlying jump table storage (#5770 ) The new api on JumpTableData makes it easy to keep the default label first, and that shrinks the diff in #5731 a bit.	2023-02-13 20:50:29 +00:00
Alex Crichton	a0a97f5e8f	Add (bnot (bxor x y)) lowerings for s390x/aarch64 (#5763 ) * Add (bnot (bxor x y)) lowerings for s390x/aarch64 I originally thought that s390x's original lowering in #5709, but as was rightfully pointed out `(bnot (bxor x y))` is equivalent to `(bxor x (bnot y))` so the special lowering for one should apply as a special lowering for the other. For the s390x and aarch64 backends that already have a fused lowering of the bxor/bnot add a lowering additionally for the bnot/bxor combination. * Add bnot(bxor(..)) tests for s390x 128-bit sizes	2023-02-13 15:41:18 +00:00
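The equivalence cited in this commit is easy to confirm on ordinary integers. A minimal sketch with hypothetical function names:

```rust
/// `(bnot (bxor x y))`
fn bnot_of_bxor(x: u64, y: u64) -> u64 {
    !(x ^ y)
}

/// `(bxor x (bnot y))`
fn bxor_with_bnot(x: u64, y: u64) -> u64 {
    x ^ !y
}
```

Both compute the bitwise XNOR of their operands, so a backend with a fused instruction for one form can use it for the other.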
Trevor Elliott	d99783fc91	Move default blocks into jump tables (#5756 ) Move the default block off of the br_table instruction, and into the JumpTable that it references.	2023-02-10 08:53:30 -08:00
Trevor Elliott	15fe9c7c93	Inline jump tables in parsed br_table instructions (#5755 ) As jump tables are used by at most one br_table instruction, inline their definition in those instructions instead of requiring them to be declared as function-level metadata.	2023-02-09 14:24:04 -08:00
bjorn3	202d3af16a	Remove the unused sigid argument purpose (#5753 )	2023-02-09 09:18:39 -08:00
Amanieu d'Antras	a2d356d45e	Add `JITBuilder::with_flags` constructor (#5751 ) This allows custom flags to be set (e.g. `opt-level`) while still leaving most of the boilerplate to select the native target to the `JITBuilder`.	2023-02-09 02:49:17 +00:00


4450 Commits