This makes it more consistent with how all the rest of the content of
a wasm module is handled. And, now TranslationResult just has a Vec
of translated functions, which will make it easier to refactor further.
The register allocator doesn't even try to compile unreachable EBBs, so
any values defined in such blocks won't be assigned registers.
Since the dominator tree already has determined which EBBs are
reachable, we should just eliminate any unreachable blocks instead o
trying to do something with the dead code.
Not that this is not a "dead code elimination" pass which would also
remove individual instructions whose results are not used.
During iterate_solution(), live-through values may be converted to
solver variables so they can be moved out of the way in order to satisfy
all constraints. Make sure that the instruction's operand constraints
are also considered for these new variables.
Add a program_complete_input_constraints() which turns all the
instruction's input operands into variables with the proper constraints.
That makes it safe for try_add_var() to re-add these values as variables
with looser generic constraints.
The solver's add_var() function is split into three functions: add_var
for use before inputs_done(), and add_killed_var/add_through_var for use
after.
Most recipes with an ABCD constraint can handle the full GPR register
class when a REX prefix is applied, but not all. The "icscc" macro
recipe always generates a setCC instruction with no REX prefix, so it
can only write the ABCD registers, even in its REX form.
Don't automatically rewrite ABCD constraints to GPR constraints when
applying a REX prefix to a tail recipe. Instead, allow individual ABCD
recipes to specify a "when_prefixed" alternative recipe to use. This
also eliminates the spurious Rex*abcd recipe names which didn't have an
ABCD constraint.
Also allow recipes to specify that a REX prefix is required by setting
the prefix_required flag. This is used by recipes like t8jccb which
explicitly accesses an 8-bit register with a GPR constraint which is
only valid with a prefix.
When solver variables represent operands on the current instruction,
they need to be constrained as required by the instructions, but
variables that are simply moved out of the way should only be
constrained to their top-level register class. The live range affinity
is just a hint, not a requirement.
When try_add_var is looking for values that can be moved out of the way
in order to satisfy constraints for the current instruction, avoid
values that are live on a CFG edge originating at the current (branch)
instruction.
These values must be in their globally assigned location when entering
the branch destination EBB.
This is covered by the existing regalloc/iterate.cton test case which
fails with an upcoming commit.
- Create a new kind of stack slot: emergency_slot.
- Add a get_emergency_slot() method which finds a suitable emergency
slot given a list of slots already in use.
- Use emergency spill slots when schedule_moves needs them.
This method was important back when result values couldn't be moved
between instructions. Now that results can be moved, value aliases do
everything we need.
Copy instructions are still used to break interferences in the register
allocator's coalescing phase, but there isn't really any reason to use a
copy instruction over a value alias anywhere else.
After and during register allocation, copy instructions are significant,
so we never want to "see through" them like the resolve_copies()
function did.
This is related to #166, but probably doesn't fix the underlying
problem.
This is a verification pass that can be run after register allocation.
It verifies that value locations are consistent with constraints on
their uses, and that the register diversions are consistent.
Make it clear that register diversions are local to an EBB only. This
affects what branch relaxation is allowed to do.
The verify_locations() takes an optional Liveness parameter which is
used to check that no diverted values are live across CFG edges.
When "binemit" tests encode instructions, keep track of the current set
of register diversions, and use the diverted locations to check operand
constraints.
This matches how constraints are applied during a real binemit phase.
These are parallels to the existing regmove instruction, but the divert
the value to and from a stack slot.
Like regmove diversions, this is a temporary diversion that must be
local to the EBB.
The register constraint solver schedules a set of move instructions to
execute before the instruction getting colored. In extreme cases, this
is not possible because there are no available registers to break cycles
in the register assignments that must be scheduled.
When that happens, we spill one register to an emergency slot so it
becomes available for implementing the assignment cycle. Then the
original register is restored.
The coloring pass can't yet understand the spill and fill move types.
This will be implemented next.
This makes it possible to materialize new RegClass references without
requiring a RegInfo reference to be passed around.
- Move the RegInfo::toprc() method to RegClassData.
- Rename RegClassData::intersect() to intersect_index() and provide a
new intersect() which returns a register class.
- Remove some &RegInfo parameters that are no longer needed.
Fixes#165.
The constraint solver's schedule_move() function sometimes need to use
an extra available register when the moves to be scheduled contains
cycles.
The pending moves have associated register classes that come from the
constraint programming. Since the moves have hard-coded to and from
registers, these register classes are only meant to indicate the
register sizes. In particular, we can use the whole top-level register
class when scavenging for a spare register to break a cycle.
The controlling type variable passed to the format constructor in the
InstBuilder trait is not just used to generate the result values. In an
EncCursor, it is also used to encode the instruction, so VOID doesn't
work.
When we detect interference between the values that have already been
merged into the candidate virtual register and an EBB argument, we first
try to resolve the conflict by splitting. We also check if the existing
interfering value is fundamentally incompatible with the branch
instruction so it needs to be removed from the virtual register,
restarting the merge operation.
However, this existing interfering value is not necessarily the only
interference, so the split is not guaranteed to resolve the conflict. If
it turns out that splitting didn't resolve the conflict, restart the
merge after removing this second conflicting value.
A value passed as an argument to a function call may live in an incoming
stack slot initially. Fix the call legalizer so it copies such an
argument into the expected outgoing stack slot for the call.
This removes the `optimize` option, as one can do that with
`--set`, eg. `--set opt_level=best`. And it adds an option to
print the compilation output.
And, it enables simple_gvn and licm for opt_level=best.
This removes the `optimize` option, as one can do that with
`--set`, eg. `--set opt_level=best`. And it adds an option to
print the compilation output.
Some REX-less encodings require an ABCD input because they are looking
at 8-bit registers. This constraint doesn't apply with a REX prefix
where the low 8 bits of all registers are addressable.
It can happen that the currently live registers are blocking a smaller
register class completely, so the only way of solving the allocation
problem is to turn some of the live-through registers into solver
variables.
When the quick_solve attempt fails, try to free up registers in the
critical register class by turning live-through values into solver
variables.
The brz and brnz instructions get support for 32-bit jump displacements
for long range branches.
Also change the way branch ranges are specified on tail recipes for the
Intel instructions. All branch displacements are relative to the end of
the instruction, so just compute the branch range origin as the
instruction size instead of trying to specify it in the tail recipe
definitions.
When the return value from a call has been spilled, the reload pass
needs to insert a spill instruction right after the call instruction
which returns its results in registers.
The x86_divmodx traps on integer overflow, but the srem instruction is
not supposed to trap with a -1 divisor.
Generate a legalization expansion for srem that special-cases the -1
divisor to simply return 0.