Commit Graph

705 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
994af598f5 Avoid interference on CFG edges.
Track allocatable registers both locally and globally: Add a second
AllocatableSet which tracks registers allocated to global values without
accounting for register diversions. Since diversions are only local to
an EBB, global values must be assigned un-diverted locations that don't
interfere.

Handle the third "global" interference domain in the constraint solver in
addition to the existing "input" and "output" domains.

Extend the solver error code to indicate when a global define just can't
be allocated because there are not enough available global registers.
Resolve this problem by replacing the instruction's global defines with
local defines that are copied into their global destinations
afterwards.
2017-10-11 15:38:30 -07:00
Jakob Stoklund Olesen
ba52a38597 Add a t8jccd_long encoding recipe for brz.b1 and brnz.b1 in 32-bit mode.
The register allocator can't handle branches with constrained register
operands, and the brz.b1/brnz.b1 instructions only have the t8jccd_abcd
in 32-bit mode where no REX prefixes are possible.

This adds a worst case encoding for those cases where a b1 value lives
in a non-ABCD register.
2017-10-11 14:20:43 -07:00
Jakob Stoklund Olesen
90ed698e83 Add an unreachable code elimination pass.
The register allocator doesn't even try to compile unreachable EBBs, so
any values defined in such blocks won't be assigned registers.

Since the dominator tree already has determined which EBBs are
reachable, we should just eliminate any unreachable blocks instead o
trying to do something with the dead code.

Not that this is not a "dead code elimination" pass which would also
remove individual instructions whose results are not used.
2017-10-09 15:26:27 -07:00
Dan Gohman
6aeeaebbd3 Disallow branching to the entry block.
Functions that would otherwise start with a loop should start with a
separate ebb which just branches to the header of the loop.
2017-10-09 15:02:17 -07:00
Jakob Stoklund Olesen
b3fa47cacc Add support for emergency spill slots.
- Create a new kind of stack slot: emergency_slot.
- Add a get_emergency_slot() method which finds a suitable emergency
  slot given a list of slots already in use.
- Use emergency spill slots when schedule_moves needs them.
2017-10-06 10:45:13 -07:00
Jakob Stoklund Olesen
73d4bb47c0 Intel encodings for regspill and regfill.
These are always SP-based.
2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen
dda3efcbdd Add regspill and regfill instructions.
These are parallels to the existing regmove instruction, but the divert
the value to and from a stack slot.

Like regmove diversions, this is a temporary diversion that must be
local to the EBB.
2017-10-04 17:02:09 -07:00
Jakob Stoklund Olesen
fb0999ce33 Check the top-level register class for available registers.
Fixes #165.

The constraint solver's schedule_move() function sometimes need to use
an extra available register when the moves to be scheduled contains
cycles.

The pending moves have associated register classes that come from the
constraint programming. Since the moves have hard-coded to and from
registers, these register classes are only meant to indicate the
register sizes. In particular, we can use the whole top-level register
class when scavenging for a spare register to break a cycle.
2017-10-03 14:12:18 -07:00
Jakob Stoklund Olesen
c091a695e6 Fix coalescer bug exposed by the gvn-unremovable-phi test.
When we detect interference between the values that have already been
merged into the candidate virtual register and an EBB argument, we first
try to resolve the conflict by splitting. We also check if the existing
interfering value is fundamentally incompatible with the branch
instruction so it needs to be removed from the virtual register,
restarting the merge operation.

However, this existing interfering value is not necessarily the only
interference, so the split is not guaranteed to resolve the conflict. If
it turns out that splitting didn't resolve the conflict, restart the
merge after removing this second conflicting value.
2017-10-03 11:13:46 -07:00
Jakob Stoklund Olesen
ef048b8899 Allow for call args in incoming stack slots.
A value passed as an argument to a function call may live in an incoming
stack slot initially. Fix the call legalizer so it copies such an
argument into the expected outgoing stack slot for the call.
2017-10-03 11:13:46 -07:00
Jakob Stoklund Olesen
5f56f81251 Resolve all value aliases when computing live ranges.
Value aliases are only in the way during register allocation, so make
sure they are all dead as we enter the register allocation passes.
2017-09-29 15:54:06 -07:00
Jakob Stoklund Olesen
c82e68efea Eliminate the ABCD register class constaint in REX encodings.
Some REX-less encodings require an ABCD input because they are looking
at 8-bit registers. This constraint doesn't apply with a REX prefix
where the low 8 bits of all registers are addressable.
2017-09-29 15:29:25 -07:00
Jakob Stoklund Olesen
51a6901a7f Implement coloring::iterate_solution().
It can happen that the currently live registers are blocking a smaller
register class completely, so the only way of solving the allocation
problem is to turn some of the live-through registers into solver
variables.

When the quick_solve attempt fails, try to free up registers in the
critical register class by turning live-through values into solver
variables.
2017-09-29 14:55:35 -07:00
Jakob Stoklund Olesen
45888ab84e Reload for spilled call return values.
When the return value from a call has been spilled, the reload pass
needs to insert a spill instruction right after the call instruction
which returns its results in registers.
2017-09-29 11:25:38 -07:00
Jakob Stoklund Olesen
53404a9387 Check for invalid special type constraints.
The extend and reduce instructions have additional type constraints.

Stop inserting sextend instructions after ctz, clz, and popcnt when
translating from WebAssembly. The Cretonne instructions have the same
signature as the WebAssembly equivalents.
2017-09-28 16:30:19 -07:00
Jakob Stoklund Olesen
8abcdac5a1 Legalize fcvt_to_sint and fcvt_to_uint for Intel64.
We need to generate traps on NaN and overflow.
2017-09-28 12:00:38 -07:00
Jakob Stoklund Olesen
34146435e5 Legalize unsigned-to-float conversions for Intel 64.
Also make sure we generate type checks for the controlling type variable
in legalization patterns. This is not needed for encodings since the
encoding tables are already keyed on the controlling type variable.
2017-09-28 11:39:19 -07:00
Jakob Stoklund Olesen
a274cdf275 Fix the Intel encoding of band_not.
The andnps instruction inverts its first argument while band_not inverts
is second argument.

Use a swapped-operands "fax" encoding recipe.
2017-09-27 18:14:13 -07:00
Jakob Stoklund Olesen
84471a8431 Add some very basic support for the Intel32 ABI.
In 32-bit mode, all function arguments are passed on the stack, not in
registers.

This ABI support is not complete or properly tested, but at least it
doesn't try to pass arguments in r8.
2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen
b6b474a8c9 Add Intel legalization for fmin and fmax.
The native x86_fmin and x86_fmax instructions don't behave correctly for
NaN inputs and when comparing +0.0 to -0.0, so we need separate branches
for those cases.
2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen
44eab3e158 Add Intel regmove encodings for floating point types. 2017-09-27 12:49:54 -07:00
Jakob Stoklund Olesen
1fe7890700 Add x86_fmin and x86_fmax instructions.
These Intel-specific instructions represent the semantics of the minss /
maxss Intel instructions which behave more like a C ternary operator
than the WebAssembly fmin and fmax instructions.

They will be used as building blocks for implementing the WebAssembly
semantics.
2017-09-27 09:17:09 -07:00
Jakob Stoklund Olesen
ac69f3bfdf Add an Intel-specific x86_cvtt2si instruction.
This is used to represent the non-trapping semantics of the cvttss2si and
cvttsd2si instructions (and their vectorized counterparts).

The overflow behavior of this instruction is specific to the Intel ISAs.

There is no float-to-i64 instruction on the 32-bit Intel ISA.
2017-09-26 15:44:41 -07:00
Jakob Stoklund Olesen
6ff681a90d Add general legalization for the select instruction. 2017-09-26 14:16:35 -07:00
Jakob Stoklund Olesen
ce767be703 Intel encodings for floating point copies. 2017-09-26 13:54:38 -07:00
Jakob Stoklund Olesen
7fb6159a85 Add Intel encodings for the fcmp instruction.
Not all floating point condition codes are directly supported by the
ucimiss/ucomisd instructions. Some inequalities need to be reversed and
eq+ne require two separate tests.
2017-09-26 11:17:32 -07:00
Jakob Stoklund Olesen
79968a2325 Add standard expansions for fcopysign.
This is also just a sign bit manipulation.
2017-09-25 15:17:32 -07:00
Jakob Stoklund Olesen
6bec5f8507 Intel encodings for nearest/floor/ceil/trunc.
These floating point rounding operations all use the roundss/roundsd
instructions that are available in SSE 4.1.
2017-09-25 15:08:04 -07:00
Jakob Stoklund Olesen
ac343ba92a Add encodings for square root instructions. 2017-09-25 13:15:09 -07:00
Jakob Stoklund Olesen
8deca67968 Add legalization patterns for fabs and fneg.
These sign bit manipulations need to use a -0.0 floating point constant
which we didn't have a way of materializing previously.

Add a ieee32.bits(0x...) syntax to the Python AST nodes that creates am
f32 immediate value with the exact requested bitwise representation.
2017-09-25 12:15:33 -07:00
Jakob Stoklund Olesen
ba1c50d6c1 Test WebAssembly floating point constants.
f64.const does not yet work on 32-bit Intel.
2017-09-25 11:06:18 -07:00
Jakob Stoklund Olesen
fdb97da21b Implement a poor man's jump table.
We will eventually support real jump tables, but for now just expand
br_table into a sequence of conditional branches.
2017-09-25 10:56:14 -07:00
Jakob Stoklund Olesen
29dfcf5dfb Add spill/fill encodings for Intel ISAs.
To begin with, these are catch-all encodings with a SIB byte and a
32-bit displacement, so they can access any stack slot via both the
stack pointer and the frame pointer.

In the future, we will add encodings for 8-bit displacements as well as
EBP-relative references without a SIB byte.
2017-09-22 16:05:26 -07:00
Angus Holder
b003605132 Adapt intel to be able to correctly choose compressed instruction encodings: create a register class to identify the lower 8 registers, omit unnecessary REX prefixes, and fix the tests 2017-09-22 07:54:26 -07:00
Angus Holder
3b66c0be40 Emit compressed instruction encodings for instructions where constraints allow 2017-09-22 07:54:26 -07:00
Jakob Stoklund Olesen
b2a314a229 Add per-instruction source locations to the Cretonne IR.
Source locations are opaque 32-bit entities that can be used to
represent WebAssembly byte-code positions or some other source
identifier.
2017-09-21 14:24:26 -07:00
Jakob Stoklund Olesen
e8723be33f Add trap codes to the Cretonne IL.
The trap and trapz/trapnz instructions now take a trap code immediate
operand which indicates the reason for trapping.
2017-09-20 15:50:02 -07:00
Dan Gohman
ce94a3fa39 Use ScopedHashMap in simple_gvn.
This avoids effectively ending up with most of a function body
stored in the hash map at once by removing elements promptly when
they go out of scope.
2017-09-20 14:21:44 -07:00
Jakob Stoklund Olesen
fb827a2d4b Add func_addr encodings for Intel. 2017-09-19 16:33:38 -07:00
Jakob Stoklund Olesen
d92686d1cd Add a func_addr instruction.
Get the callable address of a function. Use for long distance calls and
for creating arguments to call_indirect in general.
2017-09-19 15:54:02 -07:00
Jakob Stoklund Olesen
1fdeddd0d3 Add Intel encodings for floating point load/store instructions.
Include wasm/*-memory64.cton tests too.
2017-09-19 09:32:54 -07:00
Jakob Stoklund Olesen
88348368a8 Add custom legalization for floating point constants.
Use the simplest expansion which materializes the bits of the floating
point constant as an integer and then bit-casts to the floating point
type. In the future, we may want to use constant pools instead. Either
way, we need custom legalization.

Also add a legalize_monomorphic() function to the Python targetISA class
which permits the configuration of a default legalization action for
monomorphic instructions, just like legalize_type() does for polymorphic
instructions.
2017-09-18 13:33:34 -07:00
Angus Holder
d2273c73ea Make the verifier accept any of the legal encodings of an instruction 2017-09-18 13:00:19 -07:00
Jakob Stoklund Olesen
446fcdd7c5 Fix the REX bits for load/store instruction encodings.
The two registers were swapped in the REX encoding, and the tests didn't
have any high bit set registers.
2017-09-15 13:02:36 -07:00
Jakob Stoklund Olesen
cc3707706c Write and parse value locations for EBB arguments
Fixes #56.

We now have complete support for value location annotations in the
textual IL format. Values defined by instructions as well as EBB
arguments are covered.
2017-09-15 11:21:29 -07:00
Jakob Stoklund Olesen
1349a6bdbc Always require a Flags reference for verifying functions.
Add a settings::FlagsOrIsa struct which represents a flags reference and
optionally the ISA it belongs to. Use this for passing flags/isa
information to the verifier.

The verify_function() and verify_context() functions are now generic so
they accept either a &Flags or a &TargetISa argument.

Fix the return_at_end verifier tests which no longer require an ISA
specified. The signle "set return_at_end" flag setting now makes it to
the verifier even when no ISA is present to carry it.
2017-09-14 17:51:15 -07:00
Jakob Stoklund Olesen
5845f56cda Add x86-64 encodings for call instructions. 2017-09-13 09:34:48 -07:00
Jakob Stoklund Olesen
25af6d380b Add a return_at_end setting.
The flag guarantees that the generated function does not have any
internal return instructions. If the function returns at all, the return
must be the last instruction.

For now just implement a verifier check for this property. When we get
CFG simplifiers and block layout optimizations, they will need to heed
the flag.
2017-09-11 11:09:51 -07:00
Dan Gohman
3532c3533a Teach simple_gvn that iconst.i32 is not congruent to iconst.i64. 2017-08-30 14:33:54 -07:00
Jakob Stoklund Olesen
0deaa616a3 Record identity assignments in regalloc constraint solver.
Fixes #147.

The Solver::reassign_in() method would previously not record fixed
register assignments for values that are already in the correct
register. The register would simply be marked as unavailable for the
solver.

This did have the effect of tripping up the sanity checks in
Solver::add_var() when that method was called with such a "reassigned"
value. The function can be called for a value that already has a fixed
assignment, but the sanity checks want to make sure the variable
constraints are compatible with the existing fixed assignment. When no
such assignment could be found, the method panicked.

To fix this, make sure that even identity reassignments are recorded
in the assignments vector. Instead, filter the identity assignments out
before scheduling a move sequence for the assignments.

Also add some debug tracing to the regalloc solver.
2017-08-29 10:45:33 -07:00