228 Commits

Author SHA1 Message Date
Chris Fallin
ccd6b4fc2c Remove DefAlloc -- no longer needed. 2022-01-19 23:57:31 -08:00
Chris Fallin
3b037f3c9e Rework checker to not require DefAlloc by tracking all vregs on each alloc.
The symbolic checker currently works by tracking a single symbolic vreg
label for each "alloc" (physical register or stack slot). On definition
of a vreg into an alloc, the label is updated to the new vreg's name.

This worked quite well when the regalloc was simpler, but in the
presence of redundant move elimination, it started to become apparent
that the analysis has a shortcoming: when multiple vregs have the same
value, and the regalloc has deduced this, it can take an alloc that is
labeled with one vreg and use it as another vreg, and the checker will
then fail to validate that use.

In other words, the regalloc became smart enough to avoid emitting
unnecessary moves, but the checker was relying on those moves to know
the most up-to-date symbolic name for a value in a physical location. In
a sense, a register or stackslot can contain *both* vreg1 and vreg2, and
the regalloc can use it as either.

The stopgap measure of emitting more DefAllocs as part of the redundant
move elimination never quite sat right with me. It works, but it's
asking too much of the regalloc to prove why its moves are correct. We
should rely less on the regalloc and on complex
built-just-for-the-checker plumbing; we should instead improve the
checker so that it can prove on its own that the result is correct.

This PR modifies the checker so that its basic abstraction for an
alloc's value is a *set* of virtual register labels, rather than just
one. The transfer function is a little more complex, but manageable: a
move keeps the old label(s) and adds a new one; redefining a vreg into
one alloc needs to remove that vreg label from all other allocs' sets.
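
A minimal sketch of the set-based transfer function described above, not
the checker's actual code. `VReg`, `Alloc`, `CheckerState`, and every
method name here are hypothetical stand-ins chosen for illustration.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins for the real vreg and allocation types.
type VReg = u32;
type Alloc = u32;

#[derive(Default)]
struct CheckerState {
    /// For each alloc, the set of vreg labels whose value it may hold.
    labels: HashMap<Alloc, BTreeSet<VReg>>,
}

impl CheckerState {
    /// Drop `vreg` from every alloc: its old value is no longer "the" vreg.
    fn forget(&mut self, vreg: VReg) {
        for set in self.labels.values_mut() {
            set.remove(&vreg);
        }
    }

    /// `vreg` is defined into `alloc`, overwriting whatever was there.
    fn def(&mut self, vreg: VReg, alloc: Alloc) {
        self.forget(vreg);
        self.labels.insert(alloc, BTreeSet::from([vreg]));
    }

    /// A program move defines `new_vreg` from the value in `from`:
    /// `to` keeps the source's label(s) and gains the new one.
    fn program_move(&mut self, new_vreg: VReg, from: Alloc, to: Alloc) {
        self.forget(new_vreg);
        let mut set = self.labels.get(&from).cloned().unwrap_or_default();
        set.insert(new_vreg);
        self.labels.insert(to, set);
    }

    /// An allocator-inserted copy: `to` now holds exactly what `from` holds.
    fn copy(&mut self, from: Alloc, to: Alloc) {
        let set = self.labels.get(&from).cloned().unwrap_or_default();
        self.labels.insert(to, set);
    }

    /// A use of `vreg` in `alloc` is valid if the label is in the set.
    fn holds(&self, alloc: Alloc, vreg: VReg) -> bool {
        self.labels.get(&alloc).map_or(false, |s| s.contains(&vreg))
    }
}
```

With this shape, an elided move (`from == to`) leaves one alloc labeled
with both vregs, which is the case the single-label checker could not
express.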

This completely removes the need for metadata from the regalloc (!); all
we need is the original program (pre-alloc, with vregs), the set of
allocations, and the set of inserted moves, and we can validate the
result. This should mean that we trust our checker-validated allocation
results more, and should result in less complexity and maintenance going
forward if we improve the allocator further.
2022-01-19 23:50:35 -08:00
Amanieu d'Antras
6b1a5e8b1b Address review feedback 2022-01-11 22:27:15 +00:00
Amanieu d'Antras
ee4de54240 Guard trace! behind cfg!(debug_assertions)
Even if the trace log level is disabled, the presence of the trace!
macro still has a significant impact on performance because it is
present in the inner loops of the allocator.

Removing the trace! calls at compile-time reduces instruction count by
~7%.
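
A rough sketch of the guarding pattern, assuming a wrapper macro around
the logging call. `guarded_trace!`, `trace_sink`, and `hot_loop` are
hypothetical names, and `trace_sink` stands in for `log::trace!` so the
example needs no external crate.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many trace messages were actually emitted.
static TRACE_CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for a real logging backend.
fn trace_sink(msg: String) {
    TRACE_CALLS.fetch_add(1, Ordering::Relaxed);
    eprintln!("TRACE: {}", msg);
}

// `cfg!(debug_assertions)` is a compile-time constant, so in release
// builds the branch is always false and the optimizer can drop the call.
macro_rules! guarded_trace {
    ($($arg:tt)*) => {
        if cfg!(debug_assertions) {
            trace_sink(format!($($arg)*));
        }
    };
}

// An inner loop with a trace call on every iteration.
fn hot_loop(n: u32) -> u32 {
    let mut sum = 0;
    for i in 0..n {
        guarded_trace!("visiting {}", i);
        sum += i;
    }
    sum
}
```

The arguments are still type-checked in release builds, so the trace
sites cannot silently rot.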
2022-01-11 13:30:13 +00:00
Amanieu d'Antras
2d9d5dd82b Rearrange some struct fields to work better with u64_key/u128_key
This allows the compiler to load the whole key with 1 or 2 64-bit
accesses, assuming little-endian ordering.
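
To illustrate why field order matters here, the sketch below uses a
hypothetical `Record` struct (not one of regalloc2's types). With the
least-significant half of the key stored first, the eight bytes of the
struct read as a little-endian `u64` are already the combined key.

```rust
// Hypothetical record; `repr(C)` pins the field order in memory.
#[repr(C)]
#[derive(Clone, Copy)]
struct Record {
    // Least-significant half of the key comes first in memory...
    minor: u32,
    // ...and the most-significant half second.
    major: u32,
}

impl Record {
    /// Key with `major` in the high bits, so key order is (major, minor).
    fn key(&self) -> u64 {
        (u64::from(self.major) << 32) | u64::from(self.minor)
    }

    /// The fields as the eight bytes they occupy on a little-endian target.
    fn le_bytes(&self) -> [u8; 8] {
        let mut bytes = [0u8; 8];
        bytes[..4].copy_from_slice(&self.minor.to_le_bytes());
        bytes[4..].copy_from_slice(&self.major.to_le_bytes());
        bytes
    }
}
```

Because `u64::from_le_bytes(r.le_bytes())` equals `r.key()`, a compiler
targeting a little-endian machine is free to build the key with a single
64-bit load instead of two loads, a shift, and an or.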

Improves instruction count by ~1%.
2022-01-11 13:24:51 +00:00
Amanieu d'Antras
693fb6a975 Only emit DefAlloc edits when the "checker" feature is enabled.
This reduces instruction counts by ~2% when disabled.
2022-01-11 13:03:24 +00:00
Amanieu d'Antras
d95a9d9399 Combine sort keys into u64/u128
This allows the compiler to perform branch-less comparisons, which are
more efficient.

This results in ~5% fewer instructions executed.
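
A small sketch of the idea, assuming a two-field sort key. `Entry` and
its fields are hypothetical; the point is only that one integer
comparison replaces a field-by-field comparison.

```rust
// Hypothetical item sorted by (block, pos).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Entry {
    block: u32,
    pos: u32,
}

impl Entry {
    /// Pack both fields into one integer: `block` in the high 32 bits,
    /// `pos` in the low 32 bits. Comparing keys compares (block, pos).
    #[inline]
    fn key(&self) -> u64 {
        (u64::from(self.block) << 32) | u64::from(self.pos)
    }
}

/// Sort using the packed key, so each comparison is a single u64 compare.
fn sort_entries(entries: &mut [Entry]) {
    entries.sort_unstable_by_key(|e| e.key());
}
```

A tuple key `(block, pos)` gives the same order, but its derived
comparison has to branch on whether the first fields are equal.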
2022-01-11 13:03:21 +00:00
Amanieu d'Antras
053375f049 Remove PRegData::reg and use PReg::from_index instead
Performance impact is negligible but this is a good cleanup.
2022-01-11 13:02:08 +00:00
Amanieu d'Antras
74928b83fa Replace all assert! with debug_assert!
This results in a ~6% reduction in instruction count.
2022-01-11 03:54:08 +00:00
Amanieu d'Antras
6f59cd407b Use block_insts_and_edits in the checker 2021-12-27 22:09:07 +01:00
Amanieu d'Antras
8ab44c383e Add a helper to iterate over insts and edits of a block in order 2021-12-27 22:08:36 +01:00
Amanieu d'Antras
51493ab03a Apply review feedback 2021-12-12 00:33:30 +00:00
Amanieu d'Antras
38ffc479c2 Simplify the internal representation of PReg 2021-12-11 22:39:19 +00:00
Amanieu d'Antras
870e4729e1 Add fixed stack slots to the fuzzer 2021-12-11 22:39:19 +00:00
Amanieu d'Antras
8f435243e0 Properly handle fixed stack slots during multi-fixed-reg fixup 2021-12-11 22:39:14 +00:00
Amanieu d'Antras
707aacd818 Split up functions in liverange.rs
This helps with profiling even if they are inlined since perf with DWARF
callgraph profiling can attribute execution time to inlined functions.
2021-12-11 22:31:58 +00:00
Amanieu d'Antras
4f8e115115 Refactor requirement computation 2021-12-11 22:31:58 +00:00
Amanieu d'Antras
77e6a9e0d7 Add support for fixed stack slots
This works by allowing a PReg to be marked as being a stack location
instead of a physical register.
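
One way to picture this, as a sketch under assumptions: the environment
carries a set of `PReg`s that are really stack locations. `Env`,
`Location`, and `locate` are hypothetical names used for illustration,
not the crate's API.

```rust
use std::collections::HashSet;

// Hypothetical physical-register index.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct PReg(u8);

// Where a PReg index actually lives.
#[derive(Debug, PartialEq, Eq)]
enum Location {
    Register(PReg),
    FixedStack(PReg),
}

// Hypothetical environment: some PReg indices denote fixed stack slots.
struct Env {
    fixed_stack_slots: HashSet<PReg>,
}

impl Env {
    /// Classify a PReg as a true register or a fixed stack location.
    fn locate(&self, preg: PReg) -> Location {
        if self.fixed_stack_slots.contains(&preg) {
            Location::FixedStack(preg)
        } else {
            Location::Register(preg)
        }
    }
}
```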
2021-12-11 22:31:58 +00:00
Chris Fallin
ef6c8f3226 Fix fuzzbug: add checker metadata for new vreg on multi-fixed-reg fixup move.
When an instruction uses the same vreg constrained to multiple different
fixed registers, the allocator converts all but one of the fixed
constraints to `Any` and then records a special fixup move that copies
the value to the other fixed registers just before the instruction. This
allows the allocator to maintain the invariant that a value lives in
only one place at a time throughout most of its logic, and constrains
the complexity-fallout of this corner case to just a special last-minute
edit.
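
The rewrite described above can be sketched roughly as follows. This is
an illustration under assumptions, not the allocator's code: `Operand`,
`Constraint`, `FixupMove`, and `rewrite_multi_fixed` are hypothetical,
and the sketch keeps the first fixed constraint it sees for each vreg.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Constraint {
    Any,
    Fixed(u8), // fixed to a physical register index
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Operand {
    vreg: u32,
    constraint: Constraint,
}

/// A copy to insert just before the instruction.
#[derive(Debug, PartialEq, Eq)]
struct FixupMove {
    vreg: u32,
    from_preg: u8,
    to_preg: u8,
}

/// Keep one fixed constraint per vreg; relax the rest to `Any` and
/// record a fixup move from the kept register to each other one.
fn rewrite_multi_fixed(operands: &mut [Operand]) -> Vec<FixupMove> {
    let mut primary: HashMap<u32, u8> = HashMap::new();
    let mut fixups = Vec::new();
    for op in operands.iter_mut() {
        if let Constraint::Fixed(preg) = op.constraint {
            match primary.get(&op.vreg).copied() {
                // First fixed use of this vreg: keep the constraint.
                None => {
                    primary.insert(op.vreg, preg);
                }
                // Same vreg fixed to a different register: relax it.
                Some(first) if first != preg => {
                    op.constraint = Constraint::Any;
                    fixups.push(FixupMove {
                        vreg: op.vreg,
                        from_preg: first,
                        to_preg: preg,
                    });
                }
                // Same vreg, same register: nothing to do.
                Some(_) => {}
            }
        }
    }
    fixups
}
```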

Unfortunately some recent CPU time thrown at the fuzzer has uncovered
a subtle interaction with the redundant move eliminator that confuses
the checker.

Specifically, when the correct value is *already* in the second
constrained fixed reg, because of an unrelated other move (e.g. because
of a blockparam or other vreg moved from the original), the redundant
move eliminator can delete the fixup move without telling the checker
that it has done so.

Such an optimization is perfectly valid, and the generated code is
correct; but the checker thinks that some other vreg (the one that was
copied from the original) is in the second preg, and panics.

The fix is to use the mechanism that indicates "this move defines a new
vreg" (emitting a `defalloc` checker-instruction) to force the checker
to understand that after the fixup move, the given preg actually
contains the appropriate vreg.
2021-12-04 23:30:30 -08:00
Amanieu d'Antras
6621a57cb7 Fix liveranges for branch parameters 2021-12-01 01:43:20 +00:00
Amanieu d'Antras
0cb3a8019f Rework the API for outgoing blockparams 2021-12-01 01:43:20 +00:00
Chris Fallin
c53fbb4a5c Fix fuzzbug related to bundle priority ordering.
Changes in computation of bundle priorities during review of the initial
PR introduced a possible mis-ordering of priorities: inner-loop bundle
use weights could exceed the weights of 1_000_000 and 2_000_000 used for
minimal bundles without and with fixed uses (respectively). These two
kinds of minimal bundle are meant to be the highest-priority bundles,
evicting any other bundle they need to, because they can't be split
further. This PR introduces two special bundle weights for these two
kinds of bundles, and clamps all other bundle weights to just below
them.
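
A sketch of the clamping scheme, with made-up constants: the three
values below are hypothetical and chosen only so that the ordering is
obvious. The actual weights in the fix may differ.

```rust
// Hypothetical reserved weights; only their relative order matters here.
const MINIMAL_FIXED_WEIGHT: u32 = u32::MAX;
const MINIMAL_WEIGHT: u32 = u32::MAX - 1;
const MAX_ORDINARY_WEIGHT: u32 = u32::MAX - 2;

/// Minimal bundles get a reserved weight; every other bundle's
/// use-derived weight is clamped to stay strictly below them.
fn bundle_weight(minimal: bool, has_fixed_use: bool, use_weight: u32) -> u32 {
    if minimal {
        if has_fixed_use {
            MINIMAL_FIXED_WEIGHT
        } else {
            MINIMAL_WEIGHT
        }
    } else {
        use_weight.min(MAX_ORDINARY_WEIGHT)
    }
}
```

However hot the loop an ordinary bundle sits in, its weight can no
longer reach the values reserved for bundles that cannot be split.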

Thanks to @Amanieu for reporting the issue! Fixes #19.
2021-11-30 15:36:12 -08:00
Chris Fallin
c7bc6c941c Merge pull request #15 from cfallin/relicensing
Relicense fully to Apache-2.0 WITH LLVM-exception.
2021-11-18 12:40:54 -08:00
Amanieu d'Antras
a516e6d6f3 Return safepoint_slots as Allocations instead of SpillSlots
This enables us to support reftype vregs in register locations in the
future.
2021-11-16 00:47:43 +00:00
Amanieu d'Antras
a527a6d25a Remove unused clobbers vector 2021-11-16 00:46:05 +00:00
Chris Fallin
cf0d515709 Relicense fully to Apache-2.0 WITH LLVM-exception.
Large parts of the code in regalloc2 are currently licensed under the
Mozilla Public License (MPL) 2.0, because they derive in meaningful
ways from the register allocator in IonMonkey, which is part of
Firefox. The relevant source files are marked as such, with references
to the files in the Firefox source tree.

The intent of the regalloc2 project was to port the register allocator
from Firefox to use in Cranelift, borrowing good technology and
improving on it in the spirit of open source.

However, several use-cases of Cranelift require, or at least strongly
prefer, the Apache-2.0 license with the LLVM exception (matching the
license of Cranelift itself, and Bytecode Alliance projects
generally). While using this license is not strictly necessary for
regalloc2 to be usable (the MPL is an excellent open-source license!),
relicensing fully under this license to harmonize with the rest of
Cranelift and Bytecode Alliance codebases significantly widens
possibilities and reduces friction; then regalloc2 is "just another
part of Cranelift" and doesn't have to be treated specially.

The source in `src/ion/` specifically began as a fairly direct port of
the algorithms in the following files in the `mozilla-central`
repository (Firefox codebase):

* The bulk of the "backtracking allocator" algorithm:
  * `js/src/jit/BacktrackingAllocator.{cpp,h}`
* Helpers and definitions in the surrounding infrastructure:
  * `js/src/jit/RegisterAllocator.h`
  * `js/src/jit/RegisterAllocator.cpp`
  * `js/src/jit/StackSlotAllocator.h`
  * `js/src/jit/LIR.h`
* A few data structure implementations:
  * `js/src/ds/SplayTree.h`
  * `js/src/ds/PriorityQueue.h`

Subsequent work in improving regalloc2 has caused it to drift from the
direct port -- for example, it no longer uses splay trees or the
direct port of the priority queue above -- but it is of course very
clearly still a derivative work.

Analysis of the contributors to these files indicates that we need
signoff from the following folks:

* Mozilla Corp, for contributions made by Mozilla employees (the
  majority of the code). Communications with Mozilla (thanks
  @tschneidereit and @bholley for doing the work here!) indicate that
  @ekr is able to sign off when ready here.

* Andy Wingo, specifically for the work done in [Bug
  1620197](https://bugzilla.mozilla.org/show_bug.cgi?id=1620197) and
  [Bug 1609057](https://bugzilla.mozilla.org/show_bug.cgi?id=1609057) to
  generalize the stack allocator for a Wasm feature (multiple returns).

Additionally, since the initial port, we have had three contributions
from @Amanieu:
[#9](https://github.com/bytecodealliance/regalloc2/pull/9),
[#11](https://github.com/bytecodealliance/regalloc2/pull/11),
[#13](https://github.com/bytecodealliance/regalloc2/pull/13).

So, if everyone applicable is happy with this relicensing, this PR
removes the MPL-2.0 license in `src/ion/` and marks all files as
covered under `Apache-2.0 WITH LLVM-exception`. Please let us know if
this is OK!

Signoffs:

- [ ] @ekr, for Mozilla's contributions
- [ ] @wingo, for contributions to original code in `mozilla-central`
- [ ] @Amanieu, for the three PRs linked above

Thanks!
2021-11-10 10:54:28 -08:00
Amanieu d'Antras
358c831b31 Remove regs from MachineEnv
It isn't exactly clear what purpose it serves.
2021-09-19 16:40:27 +01:00
Amanieu d'Antras
af527aca88 Fix PReg indexing with >32 pregs 2021-09-19 16:39:56 +01:00
Amanieu d'Antras
9e2ab3d5f7 Address review feedback 2021-09-14 13:12:52 +01:00
Amanieu d'Antras
35ed2109b1 Adjust Operand encoding
The encoding for OperandConstraint is adjusted to free up 2 bits which
allows for 2^21 vregs and 2^6 pregs.
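
For a sense of what those limits mean, here is a bit-packing sketch with
a 21-bit vreg field and a 6-bit preg field. The field positions and the
`pack`/`unpack` helpers are assumptions for illustration; the real
`Operand` layout carries more fields and may order them differently.

```rust
const VREG_BITS: u32 = 21; // up to 2^21 vregs
const PREG_BITS: u32 = 6; // up to 2^6 pregs
const VREG_MASK: u32 = (1 << VREG_BITS) - 1;
const PREG_MASK: u32 = (1 << PREG_BITS) - 1;

/// Pack a vreg index and a preg index into one word (hypothetical
/// layout: vreg in the low 21 bits, preg in the next 6 bits).
/// Returns `None` if either index does not fit its field.
fn pack(vreg: u32, preg: u32) -> Option<u32> {
    if vreg > VREG_MASK || preg > PREG_MASK {
        return None;
    }
    Some(vreg | (preg << VREG_BITS))
}

/// Recover `(vreg, preg)` from a packed word.
fn unpack(bits: u32) -> (u32, u32) {
    (bits & VREG_MASK, (bits >> VREG_BITS) & PREG_MASK)
}
```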
2021-09-13 08:33:17 +01:00
Chris Fallin
ef2c9b3f26 Merge pull request #11 from Amanieu/requirement
Simplify Requirement by removing register classes
2021-09-09 10:37:17 -07:00
Amanieu d'Antras
448f210e32 Simplify Requirement by removing register classes
We never merge bundles from vregs of different classes, so we don't
need to check for register class conflicts.
2021-09-09 11:16:19 +01:00
Amanieu d'Antras
a243c4e575 Remove Function::is_call
The documentation says that this is only used for heuristics, but it
is never actually called. This should be removed for now and perhaps
added back later if we find an actual use for it.
2021-09-09 11:16:11 +01:00
Chris Fallin
6f0893d69d Address review comments. 2021-08-31 17:56:06 -07:00
Chris Fallin
6389071e09 Address review comments. 2021-08-31 17:42:50 -07:00
Chris Fallin
b19fa4857f Rename operand positions to Early and Late, and make weights f16/f32 values. 2021-08-31 17:31:23 -07:00
Chris Fallin
3a18564e98 Addressed more review comments. 2021-08-30 17:51:55 -07:00
Chris Fallin
6d313f2b56 Address review comments: more doc comments and some minor refactorings. 2021-08-30 17:15:37 -07:00
Chris Fallin
e10bffbca8 Fix bug in refactored BitVec (found by @Amanieu). 2021-08-14 13:40:43 -07:00
Chris Fallin
69ad31f013 Replace remaining instances of use of debug feature with debug_assertions.
Also fix some code that did not build in debug mode anymore (d'oh!) in
`src/ion/merges.rs`, as exposed by this change.
2021-08-12 17:35:55 -07:00
Chris Fallin
8ed83e3a57 Fix BitVec::get_or_insert to scan only once. 2021-08-12 15:40:34 -07:00
Chris Fallin
ffc06b2099 Debug output for Operands: omit default/most common positions. 2021-08-12 14:49:42 -07:00
Chris Fallin
c071e44fc0 Derive PartialOrd/Ord/Hash for Operand. 2021-08-12 14:43:13 -07:00
Chris Fallin
eaf8647fdf BitVec: remove zero words to avoid expanding when unnecessary. 2021-08-12 14:40:18 -07:00
Chris Fallin
82b7e6ba7b Review feedback: bitvec: struct-like enum variants, and factor out one-item cache. 2021-08-12 14:33:35 -07:00
Chris Fallin
7652b4b109 Review feedback. 2021-08-12 14:27:20 -07:00
Chris Fallin
2f856435f4 Review feedback. 2021-08-12 14:08:10 -07:00
Chris Fallin
b76b7747d0 Fix comment in postorder.rs. 2021-08-12 14:00:20 -07:00
Chris Fallin
1f30958b5a Improve domtree as per @Amanieu's feedback. 2021-08-12 12:13:56 -07:00
Chris Fallin
3e1e0f39b6 Convert all log::debug to log::trace. 2021-08-12 12:05:19 -07:00