Commit Graph

  • 940bc40fae Redundant move eliminator. Chris Fallin 2021-06-08 00:08:28 -07:00
  • c6bcd3c941 WIP: redundant-move elimination. Chris Fallin 2021-06-07 21:15:32 -07:00
  • 2be7bdbc22 Split-at-first-conflict: first conflict is first of (start of our range), (start of conflict range), not just the latter; otherwise we have a too-early split sometimes Chris Fallin 2021-06-07 12:27:58 -07:00
  • 0eaa0fde06 Fix to checker: analyze all blocks, even if out-state of entry block is empty Chris Fallin 2021-06-05 14:47:55 -07:00
  • 30f42a8717 Fix fuzzbug: properly detect too-many-live-regs condition on fuzzing input. Must be careful in how we probe the BTree when we have multiple "equal" (overlapping) keys. Chris Fallin 2021-06-03 23:48:33 -07:00
  • 5560499b80 Adaptive commitment-map scanning: re-probe from root if we skip too many entries in linear BTree scan Chris Fallin 2021-06-03 18:17:28 -07:00
  • 00e4240c93 merge bundles much faster by just concatenating range-lists and unstable-sorting, rather than a merge-sort-like traversal. Rust stdlib sort is very optimized. clang.wasm 9.1s -> 6.8s now. Chris Fallin 2021-06-03 17:34:19 -07:00
  • 6a0739b62a Implement spill-bundle: move all empty ranges, and empty leading/trailing pieces surrounding split points, to a single spill bundle, in an attempt to avoid excessive movement Chris Fallin 2021-06-03 00:18:27 -07:00
  • dc2b0d1913 Add a perf idea to TODO list Chris Fallin 2021-06-01 23:13:08 -07:00
  • 2fe276ca04 BTreeMap probe fix (fuzzbug): BTree does not interact nicely with LiveRangeKey definition of equality; need to probe with one-less-than start to get proper range Chris Fallin 2021-06-01 23:10:34 -07:00
  • a2a770ec50 Fuzzbug fix Chris Fallin 2021-06-01 18:57:07 -07:00
  • 2614eac21e fuzzbug fix: restore clean error exit required by regalloc.rs fuzzer on too-many-live-regs error Chris Fallin 2021-06-01 16:31:33 -07:00
  • e49727dc75 Fuzzbug fix: fix some weirdness with BTree iteration inner loop Chris Fallin 2021-06-01 15:32:12 -07:00
  • 44ca1893c3 Fuzzbug fix: properly check for conflicting reqs before merging bundles (cached values are not computed yet) Chris Fallin 2021-06-01 14:52:59 -07:00
  • f49167e0fe emit annotations at Info level, for easier selective perf-debugging Chris Fallin 2021-05-28 18:35:09 -07:00
  • 789651f947 Rework inner allocation-loop code: choose more wisely between splitting and evicting based on costs (and unify the fixed-reg-constraint case) Chris Fallin 2021-05-28 17:36:06 -07:00
  • 43d7095cbd Properly split when we hit a fixed conflict Chris Fallin 2021-05-28 16:49:32 -07:00
  • 7171624750 Don't generate r1->scratch,scratch-r1 sequence for cyclic moves of r1->r1 that are generated to change vreg ownership and keep the checker happy. Seems to eliminate a bit of braindeadness and improve bz2 by ~5-10%. Chris Fallin 2021-05-26 21:35:43 -07:00
  • 13bde99d7d bugfix with clean-spill opt: avoid if liverange starts at start of block (this is like a def) or if has starts-at-def flag. Chris Fallin 2021-05-26 18:08:41 -07:00
  • e521811b88 Avoid re-spilling to spillslot when still clean: intra-block edition (inter-block needs more analysis and careful thought) Chris Fallin 2021-05-26 17:08:14 -07:00
  • dcf6f473ca inline some things Chris Fallin 2021-05-26 00:48:41 -07:00
  • 4e0dd1f296 little tweak to avoid a div/mod on every iter of a PReg alloc loop Chris Fallin 2021-05-26 00:38:53 -07:00
  • b3dc2b25a5 Alloc spillsets for whole vreg, not just spilled LRs. This is a prerequisite to allowing a "clean" value to remain in spillslot while also in reg and avoiding the re-spill. It should also reduce stack-to-stack moves (though they can still come from progmoves). Chris Fallin 2021-05-25 18:19:25 -07:00
  • ca5f24f6b7 Hint the same PReg for both halves of a split Chris Fallin 2021-05-24 23:49:47 -07:00
  • 7cdcb2031e Split heuristic: split before entering deeper loop nest Chris Fallin 2021-05-24 23:09:05 -07:00
  • 8887077b59 small fix: preserve starts-at-def flag when setting liverange weight Chris Fallin 2021-05-24 22:45:25 -07:00
  • 3382f9a2e8 Split based on first conflict of lowest-weight conflict, not first conflict. Also stop scanning PRegs when max bundle weight in conflict bundle list exceeds current best option. Chris Fallin 2021-05-24 22:26:57 -07:00
  • 5b47462e0c Loop depth instead of hot/cold, with fast O(n) loop-depth computation. Use this to compute use weights. Chris Fallin 2021-05-24 22:09:41 -07:00
  • 5895ae8a2d Remove precomputed requirements from ranges and bundles; cost of struct size and updates is not worth it. Instead, do a simple conflicting-requirements check on each attempted merge, but minimize cost by (i) only checking after ruling out overlaps, and (ii) only checking if we know one of the sides has at least one non-register constraint. Chris Fallin 2021-05-24 21:32:41 -07:00
  • 5120681730 Fuzzbug fix for requirement recomputation on minimal bundles with multiple LRs Chris Fallin 2021-05-24 17:49:45 -07:00
  • 10d926557a avoid some redundant work by computing initial reqs only once Chris Fallin 2021-05-24 15:52:14 -07:00
  • 78c009181c Fuzzbug fix re: new requirements computation and multi-fixed-reg fixup. Chris Fallin 2021-05-24 15:47:15 -07:00
  • 46feacc654 Fuzzbug fix: don't merge bundles that have conflicting requirements. (Normally splitting would fix this, but let's just not merge in the first place.) Chris Fallin 2021-05-24 15:32:05 -07:00
  • 59967ff756 TODO-list update: braindump of next ideas to work on. Chris Fallin 2021-05-23 20:04:28 -07:00
  • 107c09181f Simple speedup in bundle merge: set bundle while everything is in cache (same pass), and only check non-overlap with debug assertions enabled. Alloc time on clang.wasm: 9.1s old backtracking RA vs. 7.2s with this regalloc2 RA. Chris Fallin 2021-05-22 16:36:44 -07:00
  • a6c89b1c01 Avoid O(n^2) in liverange construction: we always build LRs in (reverse) order, so we can just append (prepend) to running list and reverse at end. Likewise for uses. Chris Fallin 2021-05-22 15:12:35 -07:00
  • 469669155f Another fuzzbug fix: proper checker-hint ordering when V-R and V-V moves are back-to-back and RReg ownership changes Chris Fallin 2021-05-22 14:37:56 -07:00
  • 4b46c6388a Fuzzbug fixes for simpler splitting Chris Fallin 2021-05-22 14:26:49 -07:00
  • 466ea2cd9a Simpler / possibly better splitting: split based on conflict position, always, and use a reg hint to put the before-conflict part in the place where we determined it fit before. Chris Fallin 2021-05-21 01:34:52 -07:00
  • ec7fdeb8ed Properly handle RReg-RReg moves in new scheme Chris Fallin 2021-05-20 20:38:50 -07:00
  • 2a5f571b80 WIP: Handle moves between realregs (pregs) and vregs somewhat specially, by converting into operand constraints Chris Fallin 2021-05-20 19:53:16 -07:00
  • f0b24cf9fa Remove all-empty-ranges-to-spill-bundle: prioritizing same-alloc for all empty ranges over allowing some to live in registers results in too much spilling Chris Fallin 2021-05-20 10:21:22 -07:00
  • ce935c1040 Add all empty LRs to a single "spill bundle", to avoid many small bundles and excessive moves Chris Fallin 2021-05-19 22:12:22 -07:00
  • f56676fb8d Fixed all fuzzer targets (some API changes) Chris Fallin 2021-05-19 18:25:34 -07:00
  • f1c6dfe807 Optionally show annotations in final allocation/program dump based on RegallocOptions flag Chris Fallin 2021-05-19 16:36:36 -07:00
  • e1f67e860f Pinned VRegs for use with regalloc.rs shim to support RealRegs. Chris Fallin 2021-05-18 22:40:43 -07:00
  • 04c8e46787 Only do annotations in debug builds Chris Fallin 2021-05-18 18:52:34 -07:00
  • c3513b94b0 Bugfix: don't do a split-at-intermediate-defs split if the first such point is the start of the bundle. Chris Fallin 2021-05-18 15:16:19 -07:00
  • 4389f16156 debugging log message for liveins Chris Fallin 2021-05-18 12:14:59 -07:00
  • 8e0d0f1de0 fuzzbug fix Chris Fallin 2021-05-18 00:26:38 -07:00
  • 328c9004e5 fuzzbug fixes Chris Fallin 2021-05-17 23:59:13 -07:00
  • f0fbf3aa0c Rework data structures: bundles have a SmallVec of ranges, and ranges a SmallVec of uses. Chris Fallin 2021-05-17 22:19:47 -07:00
  • 5b55948feb Check branch-args for conflicts with edge-move placement. Chris Fallin 2021-05-13 17:25:11 -07:00
  • 1f9258bea5 Detect undefined liveins. Chris Fallin 2021-05-12 01:06:27 -07:00
  • 37fa3ec763 Improve prog-move handling: no use/def records, just directly connect the LRs. Chris Fallin 2021-05-11 23:59:12 -07:00
  • 6066d02f6f More annotations Chris Fallin 2021-05-11 18:19:40 -07:00
  • b069ae099d Use hot-code map to augment spill weights of each use Chris Fallin 2021-05-11 17:59:10 -07:00
  • e1a37cf0e0 some more stats Chris Fallin 2021-05-10 22:53:44 -07:00
  • f7551c68d1 Integrate prog-moves with LR-moves; this should in theory reduce move traffic somewhat Chris Fallin 2021-05-10 22:47:57 -07:00
  • 0dbf4a790f Collect full conflict-bundle list, by not ending PhysReg probe on first conflict; this leads to better eviction decisions on bz2 Chris Fallin 2021-05-09 20:21:57 -07:00
  • b7fd53efc5 Fix checker: after moving edge-moves to prior to last branch of block (for simpler semantics for library user), we can no longer check blockparams; but this is fine because they do not exist in post-regalloc code. Chris Fallin 2021-05-09 19:38:20 -07:00
  • 4f26b1c78f Properly handle prog-moves with fixed srcs or dests Chris Fallin 2021-05-09 13:35:38 -07:00
  • 5c5ea4cb9b bugfix Chris Fallin 2021-05-09 04:11:30 -07:00
  • 34421fcc6b fix to prog-move handling: happens in middle of inst; and insert uses to make later move-insertion happy with this Chris Fallin 2021-05-09 03:51:10 -07:00
  • 8d7530d3fa Edge moves always before jumps, never after; semantics are too subtle otherwise (client needs to handle specially) Chris Fallin 2021-05-09 02:20:38 -07:00
  • c380b0d979 assert fix: RegClass doesn't need to match for spillslots (can be reused across classes) Chris Fallin 2021-05-09 01:51:00 -07:00
  • 9fdc69edde fuzzbug fix in range-summary iter Chris Fallin 2021-05-09 01:48:16 -07:00
  • 509c5dc2fd Remove sanity-check logic in range summary construction -- zero-length ranges make this somewhat fickle to verify, and fuzzing will catch any issues. Chris Fallin 2021-05-09 01:14:03 -07:00
  • 095a883814 Fix crit-edge detection logic in CFGInfo analysis Chris Fallin 2021-05-09 01:06:59 -07:00
  • b9e89885c4 Error checking: properly signal a crit-edge requirement failure (used for regalloc.rs fuzzer) Chris Fallin 2021-05-08 21:48:58 -07:00
  • f1fc9a8f7e Fix related to move handling Chris Fallin 2021-05-08 19:04:16 -07:00
  • 00c64f680a Handle moves by joining LRs at inst boundary, not middle of move inst Chris Fallin 2021-05-08 17:45:24 -07:00
  • 3d0d760c70 Bugfix: process parallel moves separately for Int and Float classes Chris Fallin 2021-05-08 16:16:30 -07:00
  • 179ef0e534 Bugfix: Mod with dead def spans both Before and After positions Chris Fallin 2021-05-08 15:47:38 -07:00
  • ed339ab4d8 Some minor opts: inlining, and smallvec reuse Chris Fallin 2021-05-07 20:54:27 -07:00
  • e41b0101a8 Struct-of-array transform: pull LiveRangeHot out of LiveRange with just range and next-in-bundle link Chris Fallin 2021-05-07 20:41:33 -07:00
  • 4185eab441 More efficient live-range creation re: uses Chris Fallin 2021-05-07 20:12:40 -07:00
  • a6e3128821 Support mod (modify) operands, for better efficiency with regalloc.rs/Cranelift shim. Chris Fallin 2021-05-07 19:48:34 -07:00
  • d2cc4f1ac2 More efficient queue_bundles (saves 18% on clang.wasm) Chris Fallin 2021-05-07 19:20:28 -07:00
  • 040c3c838c Some structure packing: Use now fits in three u32s. Chris Fallin 2021-05-07 19:05:20 -07:00
  • bfe1c632c9 Some preallocation and removal of one u32 from LiveRange struct Chris Fallin 2021-05-07 18:52:46 -07:00
  • a453501ebb sort_unstable (quicksort) everywhere Chris Fallin 2021-05-07 18:17:13 -07:00
  • df59b5ede4 Inline all the things (ProgPoint edition) Chris Fallin 2021-05-07 17:55:04 -07:00
  • 4f6346768e Pinned-VReg mechanism. Chris Fallin 2021-05-07 17:45:51 -07:00
  • 3ddcf05fea Optimizations: (i) range-summary array; (ii) early exit from btree probe loop (one conflict bundle is enough, empirically) Chris Fallin 2021-05-07 17:03:44 -07:00
  • 0f3454b4d7 Inlining on btree commitment map comparators for a 10% win Chris Fallin 2021-05-07 01:51:40 -07:00
  • 3713d6131e Replace approximate liveness with true iterative liveness; turns out it is better to improve accuracy so that later stages of the allocator have less wasted work/interference Chris Fallin 2021-05-07 01:22:12 -07:00
  • 42582e0c6f Some stats for loop effects on liveins: 487k loop set-unions (441 loops) in one func in bz2 -- fix TBD Chris Fallin 2021-05-07 00:19:41 -07:00
  • 2ba518517d Fuzzbugfix: actually do need eager liveness computation; must uphold invariant that all earlier-in-postorder blocks have full livein sets. Chris Fallin 2021-05-06 23:29:59 -07:00
  • 2ff02b50a3 Some perf opts in liveness computation and set impl: Chris Fallin 2021-05-06 22:46:16 -07:00
  • a148dccac3 Parameterize adaptive-map size in BitVec. Chris Fallin 2021-05-06 22:02:10 -07:00
  • 02b6516acd Some memory-size/bitpacking optimizations Chris Fallin 2021-05-06 20:47:17 -07:00
  • 07a5a88972 BitVec perf: use adaptive hybrid chunked small-array + FxHashMap. Chris Fallin 2021-05-06 20:03:44 -07:00
  • e2beb471c4 Handle moves specially with move-insertion logic rather than ordinary operand/inst handling Chris Fallin 2021-05-06 18:49:23 -07:00
  • 747c56c2c3 Some micro-optimizations in BitVec. Chris Fallin 2021-05-06 16:19:38 -07:00
  • 1a7a0c5e3d Some performance tweaks -- try to reduce register probe count with soem more hints. Also fix spill-weight caching. Chris Fallin 2021-05-06 16:09:39 -07:00
  • 80cdd0c5ac Properly handle multiple same-fixed-reg constraints to the same vreg in one inst. Chris Fallin 2021-05-06 01:01:27 -07:00
  • ab828b6c86 MachineEnv fields are public Chris Fallin 2021-05-05 23:14:04 -07:00
  • 48fbc235ea BitVec::get() takes immutable self Chris Fallin 2021-05-05 23:08:19 -07:00
  • 15ed2d6522 Allow multiple defs per vreg (i.e., accept non-SSA code). Chris Fallin 2021-05-05 22:36:41 -07:00