regalloc2

Author	SHA1	Message	Date
T0b1	e2061d2e04	first impl	2023-04-14 18:18:15 +02:00
Trevor Elliott	7f0d0b98d0	Expose ssa verification as a regalloc2 option (#102) Adds the validate_ssa flag to the RegallocOptions struct, enabling ssa validation of inputs before register allocation takes place.	2022-11-29 09:30:59 -08:00
Chris Fallin	b4eedf3f32	Fix the `moves` fuzz target. (#97) * Fix iteration bug in move fuzz-target driver. * Remove panic no longer needed: stack-to-stack moves are now properly handled.	2022-10-21 09:08:27 -07:00
Alex Crichton	d6b15095c5	Add explicit dependency on `libfuzzer-sys` (#96) This fixes a build issue with the most recent version of `libfuzzer-sys` where the prior version of the `fuzz_target!` macro didn't need the crate as an explicit dependency but the latest version does.	2022-10-21 08:33:50 -07:00
Amanieu d'Antras	906a053208	Remove register class from `SpillSlot` (#80) * Remove register class from `SpillSlot` The register allocator was already allowing moves between spillslots and registers of different classes, so this PR formalizes this by making spillslots independent of register class. This also fixes #79 by properly tracking the register class of an `InsertedMove` with the `to_vreg` field which turns out to never be `None` in practice. Removing the `Option` allows the register class of the `VReg` to be used when building the per-class move lists. Fixes #79 * Address review feedback	2022-09-20 14:05:23 -07:00
Amanieu d'Antras	520cafa129	Handle fixed stack slots in the move resolver (#78) Fixed stack slots are treated as `PReg`s by most of the register allocator, but need some additional handling in the move resolver to avoid generating stack-to-stack moves.	2022-09-19 12:27:24 -07:00
Amanieu d'Antras	1495c1e342	Add fixed-non-allocatable operand support (#77) This allows a non-allocatable `PReg` to be passed on directly to the allocations vector without any liverange tracking from the register allocator. The main intended use case is to support ISA-specific special registers such as a fixed zero register.	2022-09-19 12:23:53 -07:00
Jamey Sharp	47cb8234b6	Remove old tests and reexport libfuzzer from fuzzing module (#69) * Remove unused regalloc2-test crate This code doesn't build, and Chris says it's "a really old harness that existed prior to building the fuzzing and was used mainly to profile and get stats before integration with Cranelift". * Re-export libfuzzer/arbitrary from fuzzing module This avoids needing to keep dependencies on `arbitrary` in sync across the three different Cargo.toml files in this project. However, before version 0.4.2, libfuzzer-sys only supported using its macros if it was available at the top-level `libfuzzer_sys` path, which breaks when re-exporting it. So I'm upgrading to that version (or the newest patch release of it). Upgrading libfuzzer-sys in turn brings in the 1.0 release of the arbitrary crate, with a minor API change along the way.	2022-08-03 14:28:37 -07:00
Chris Fallin	869c21e79c	Remove an explicitly-set-aside scratch register per class. (#51) Currently, regalloc2 sets aside one register per class, unconditionally, to make move resolution possible. To solve the "parallel moves problem", we sometimes need to conjure a cyclic permutation of data among registers or stack slots (this can result, for example, from blockparam flow that swaps two values on a loop backedge). This set-aside scratch register is used when a cycle exists. regalloc2 also uses the scratch register when needed to break down a stack-to-stack move (which could happen due to blockparam moves on edges when source and destination are both spilled) into a stack-to-reg move followed by reg-to-stack, because most machines have loads and stores but not memory-to-memory moves. A set-aside register is certainly the simplest solution, but it is not optimal: it means that we have one fewer register available for use by the program, and this can be costly especially on machines with fewer registers (e.g., 16 GPRs/XMMs on x86-64) and especially when some registers may be set aside by our embedder for other purposes too. Every register we can reclaim is some nontrivial performance in large function bodies! This PR removes this restriction and allows regalloc2 to use all available physical registers. It then solves the two problems above, cyclic moves and stack-to-stack moves, with a two-stage approach: - First, it finds a location to use to resolve cycles, if any exist. If a register is unallocated at the location of the move, we can use it. Often we get lucky and this is the case. Otherwise, we allocate a stackslot to use as the temp. This is perfectly fine at this stage, even if it means that we have more stack-to-stack moves. - Then, it resolves stack-to-stack moves into stack-to-reg / reg-to-stack. There are two subcases here. If there is another available free physical register, we opportunistically use it for this decomposition. If not, we fall back to our last-ditch option: we pick a victim register of the appropriate class, we allocate another temporary stackslot, we spill the victim to that slot just for this move, we do the move in the above way (stack-to-reg / reg-to-stack) with the victim, then we reload the victim. So one move (original stack-to-stack) becomes four moves, but no state is clobbered. This PR extends the `moves` fuzz-target to exercise this functionality as well, randomly choosing for some spare registers to exist or not, and randomly generating {stack,reg}-to-{stack,reg} moves in the initial parallel-move input set. The target does a simple symbolic simulation of the sequential move sequence and ensures that the final state is equivalent to the parallel-move semantics. I fuzzed both the `moves` target, focusing on the new logic; as well as the `ion_checker` target, checking the whole register allocator, and both seem clean (~150M cases on the former, ~1M cases on the latter).	2022-05-23 10:48:37 -07:00
Chris Fallin	a5c48fda8a	Support program moves, including pinned vregs, in the checker. (#43) The checker was built to validate programs produced by the fuzzing testcase generator, which was built before regalloc2 supported special handling of moves. (In a pure-SSA world, move elision is not needed, because moves are not needed, and blockparams are the only way of tying together vregs.) Due to this, the checker works great for our independent regalloc2 fuzzing setup, but when used on regalloc inputs produced by Cranelift, cannot prove correctness. This PR extends the checker's analysis to properly handle "program moves", which are distinct from regalloc-inserted moves in that they are present in the original program and hence are semantically relevant. A program move edits all sets of symbolic vregs at all allocs, and where the source vreg appears, it inserts the dest vreg as well. (It also removes the dest vreg from all other sets, since the old value becomes stale, as is done for other defs.) Given this, and given some additional checking for moves to/from pinned vregs, the checker can now be used to fully validate Cranelift-sourced regalloc2 invocations.	2022-04-18 10:36:26 -07:00
Amanieu d'Antras	be61078e4e	Format Cargo.toml	2022-01-11 13:34:50 +00:00
Amanieu d'Antras	74928b83fa	Replace all assert! with debug_assert! This results in a ~6% reduction in instruction count.	2022-01-11 03:54:08 +00:00
Chris Fallin	1f30958b5a	Improve domtree as per @Amanieu's feedback.	2021-08-12 12:13:56 -07:00
Chris Fallin	3e1e0f39b6	Convert all log::debug to log::trace.	2021-08-12 12:05:19 -07:00
Chris Fallin	50eb6fc42f	Keep internal modules private, but re-export under fuzzing feature flag	2021-06-19 12:08:37 -07:00
Chris Fallin	f56676fb8d	Fixed all fuzzer targets (some API changes)	2021-05-19 18:25:34 -07:00
Chris Fallin	37fa3ec763	Improve prog-move handling: no use/def records, just directly connect the LRs. Also requires some metadata in edit output to properly hook up the checker in regalloc.rs to track user-moves without seeing the original insts with operands.	2021-05-11 23:59:12 -07:00
Chris Fallin	414f3f828d	Factored out test program and fuzzing features; core crate now only depends on smallvec and log.	2021-04-18 14:19:32 -07:00
Chris Fallin	34ab744f4f	Add GitHub CI config.	2021-04-18 13:18:18 -07:00
Chris Fallin	940c1b719d	Changes from review comments.	2021-04-18 13:12:33 -07:00
Chris Fallin	a08b0121a0	Add support for reftypes/stackmaps and Stack constraints, and misc API changes. The main enhancement in this commit is support for reference types and stackmaps. This requires tracking whether each VReg is a "reference" or "pointer". At certain instructions designated as "safepoints", the regalloc will (i) ensure that all references are in spillslots rather than in registers, and (ii) provide a list of exactly which spillslots have live references at that program point. This can be used by, e.g., a GC to trace and possibly modify pointers. The stackmap of spillslots is precise: it includes all live references, and only live references. This commit also brings in some API tweaks as part of the in-progress Cranelift glue. In particular, it makes Allocations and Operands mutually disjoint by using the same bitfield for the type-tag in both and choosing non-overlapping tags. This will allow instructions to carry an Operand for each register slot and then overwrite these in place with Allocations. The `OperandOrAllocation` type does the necessary magic to make this look like an enum, but staying in 32 bits.	2021-04-17 21:29:13 -07:00
Chris Fallin	8e923b0ad9	Initial public commit of regalloc2.	2021-04-13 17:40:12 -07:00

22 Commits
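
The "parallel moves problem" described in commit 869c21e79c can be illustrated with a small sketch. This is not regalloc2's move resolver: the `Loc` type and the `sequentialize` and `apply_sequential` functions are hypothetical names made up for this example, and the caller is assumed to supply a scratch location that appears in none of the moves. It shows only the cycle-breaking stage (saving one value to a temporary), not the later decomposition of stack-to-stack moves.

```rust
use std::collections::HashMap;

/// A storage location: a register or a stack slot (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Loc {
    Reg(u8),
    Stack(u8),
}

/// Turn a parallel move set (all sources are read before any destination is
/// written) into an equivalent sequence of (src, dst) moves.
/// Assumptions: destinations are distinct, and `scratch` appears in no move.
fn sequentialize(moves: &[(Loc, Loc)], scratch: Loc) -> Vec<(Loc, Loc)> {
    // Self-moves are no-ops and can be dropped up front.
    let mut pending: Vec<(Loc, Loc)> =
        moves.iter().copied().filter(|(s, d)| s != d).collect();
    let mut out = Vec::new();

    while !pending.is_empty() {
        // A move is safe to emit once no other pending move still reads its destination.
        let ready = (0..pending.len()).find(|&i| {
            let dst = pending[i].1;
            pending
                .iter()
                .enumerate()
                .all(|(j, &(src, _))| j == i || src != dst)
        });

        if let Some(i) = ready {
            out.push(pending.remove(i));
        } else {
            // Only cycles remain: save one source to the scratch location and
            // redirect every pending read of that source to the scratch copy.
            let (src, _) = pending[0];
            out.push((src, scratch));
            for m in pending.iter_mut() {
                if m.0 == src {
                    m.0 = scratch;
                }
            }
        }
    }
    out
}

/// Apply a move sequence one step at a time to a symbolic machine state.
fn apply_sequential(state: &mut HashMap<Loc, u32>, seq: &[(Loc, Loc)]) {
    for &(src, dst) in seq {
        let value = state[&src];
        state.insert(dst, value);
    }
}
```

For a two-register swap with a stack slot as the temporary, `sequentialize(&[(Loc::Reg(0), Loc::Reg(1)), (Loc::Reg(1), Loc::Reg(0))], Loc::Stack(0))` yields three moves: save `Reg(0)` to the slot, copy `Reg(1)` into `Reg(0)`, then copy the slot into `Reg(1)`. Replaying the result with `apply_sequential` and comparing against the parallel semantics is the same kind of check the commit message describes for the `moves` fuzz target.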