Commit Graph

65 Commits

Author SHA1 Message Date
Dan Gohman
4e67e08efd Use the target-lexicon crate.
This switches from a custom list of architectures to use the
target-lexicon crate.

 - "set is_64bit=1; isa x86" is replaced with "target x86_64", and
   similar for other architectures, and the `is_64bit` flag is removed
   entirely.

 - The `is_compressed` flag is removed too; it's no longer being used to
   control REX prefixes on x86-64, ARM and Thumb are separate
   architectures in target-lexicon, and we can figure out how to
   select RISC-V compressed encodings when we're ready.
2018-05-30 06:13:35 -07:00
Dan Gohman
3b1d805758 Stack overflow checking with stack probes.
This adds a libcall name, a calling convention, and settings for
emitting stack probes, and implements them for x86 system_v ABIs.
2018-04-22 21:52:12 -07:00
Dan Gohman
c5b15c2396 Refactor calling convention settings. (#304)
Add a calling-convention setting to the `Flags` used as part of the
`TargetIsa`. This allows Cretonne code that generates calls to use the
correct convention, such as when emitting libcalls during legalization
or when the wasm frontend is decoding functions. This setting can be
overridden per-function.

This also adds "fast", "cold", and "fastcall" conventions, with "fast"
as the new default. Note that "fast" and "cold" are not intended to be
ABI-compatible across Cretonne versions.

This will also ensure Windows users will get an `unimplemented!` rather
than silent calling-convention mismatches, which reflects the fact that
Windows calling conventions are not yet implemented.

This also renames SpiderWASM, which isn't camel-case, to Baldrdash,
which is, and which is also a more relevant name.
2018-04-22 21:35:18 -07:00
Dan Gohman
f43b6aca1a Use lower-case letters for github URLs.
This makes it a little more consistent; now, "cretonne" is never capitalized
in identifier, path, or URL contexts. It is capitalized in natural
language contexts when referring to the project.
2018-04-17 09:47:11 -07:00
Dan Gohman
24fa169e1f Rename the 'cretonne' crate to 'cretonne-codegen'.
This fixes the next part of #287.
2018-04-17 09:46:56 -07:00
Dan Gohman
0e57f3d0ea Add a "colocated" flag to symbol references. (#298)
This adds a "colocated" flag to function and symbolic global variables which
indicates that they are defined along with the current function, so they can
use PC-relative addressing.

This also changes the function decl syntax; the name now always precedes the
signature, and the "function" keyword is no longer included.
2018-04-13 15:00:09 -07:00
Dan Gohman
1c760ab179 Rename intel to x86.
x86 is the more accurate name, as there are non-Intel x86 implementations.

Fixes #263.
2018-04-12 10:02:16 -07:00
Dan Gohman
9e4ab7dc86 Rename CallConv::Native to CallConv::SystemV. (#291)
To keep cross-compiling straightforward, Cretonne shouldn't have any
behavior that depends on the host. This renames the "Native" calling
convention to "SystemV", which has a defined meaning for each target,
so that it's clear that the calling convention doesn't change
depending on what host Cretonne is running on.
2018-03-30 12:32:14 -07:00
Dan Gohman
6606b88136 Optimize immediates and compare and branch sequences (#286)
* Add a pre-opt optimization to change constants into immediates.

This converts 'iadd' + 'iconst' into 'iadd_imm', and so on.

* Optimize away redundant `bint` instructions.

Cretonne has a concept of "Testable" values, which can be either boolean
or integer. When the an instruction needing a "Testable" value receives
the result of a `bint`, converting boolean to integer, eliminate the
`bint`, as it's redundant.

* Postopt: Optimize using CPU flags.

This introduces a post-legalization optimization pass which converts
compare+branch sequences to use flags values on CPUs which support it.

* Define a form of x86's `urm` that doesn't clobber FLAGS.

movzbl/movsbl/etc. don't clobber FLAGS; define a form of the `urm`
recipe that represents this.

* Implement a DCE pass.

This pass deletes instructions with no side effects and no results that
are used.

* Clarify ambiguity about "32-bit" and "64-bit" in comments.

* Add x86 encodings for icmp_imm.

* Add a testcase for postopt CPU flags optimization.

This covers the basic functionality of transforming compare+branch
sequences to use CPU flags.

* Pattern-match irsub_imm in preopt.
2018-03-30 12:30:07 -07:00
Dan Gohman
2a26b70854 Update URLs. 2018-02-23 16:16:44 -08:00
Dan Gohman
10dcfcacdb Remove support for entity variables in filecheck.
Now that the parser doesn't renumber indices, there's no need for entity
variables like $v0.
2018-02-20 17:27:46 -08:00
Jakob Stoklund Olesen
1bbc529ef9 Improve the variable ordering used by the coloring constraint solver.
The fuzzer bugs #219 and #227 are both cases where the register
allocator coloring pass "runs out of registers". What's really happening
is that the constraint solver failed to find a solution, even when one
existed.

Suppose we have three solver variables:

    v0(GPR, out, global)
    v1(GPR, in)
    v2(GPR, in, out)

And suppose registers %r0 and %r1 are available on both input and output
sides of the instruction, but only %r1 is available for global outputs.
A valid solution would be:

    v0 -> %r1
    v1 -> %r1
    v2 -> %r0

However, the solver would pick registers for the three values in
numerical order because v1 and v2 have the same domain size (=2). This
would assign v1 -> %r0 and then fail to find a free register for v2.

Fix this by prioritizing in+out variables over single-sided variables
even when their domains are equal. This means the v2 gets assigned a
register before v1, and it gets a chance to pick a register that is
still available on both in and out sides.

Also try to avoid depending on value numbers in the solver. These bugs
were hard to reproduce because a test case invariably would have
different value numbers, causing the solver to order its variables
differently and succeed. Throw in the previous solution and original
register assignments as tie breakers which are stable and not dependent
on value numbers.

This is still not a substitute for a proper solver search algorithm that
we will probably have to write eventually.

Fixes #219
Fixes #227
2018-01-19 13:31:26 -08:00
Jakob Stoklund Olesen
1e49431804 Add test case from #216.
The error exposed by this test case no longer happens after the
coalescer was rewritten to to follow the Budimlic paper. It's still a
good coalescer test.

Fixes #216 by including the test case.
2018-01-17 16:19:51 -08:00
Jakob Stoklund Olesen
dcad3fa339 Fix coloring bug with combined constraints and global values.
The Intel instruction "v1 = ushr v2, v2" will implicitly fix the output
register for v2 to %rcx because the output is tied to the first input
operand and the second input operand is fixed to %rcx.

Make sure we handle this transitive constraint when checking for
interference with the globally live registers.

Fixes #218
2018-01-17 15:51:08 -08:00
Jakob Stoklund Olesen
0a6500c99a Avoid making solver variables for fixed input constraints.
When the coloring pass sees an instruction with a fixed input register
constraint that is already satisfied, make sure to tell the solver
about it anyway.

There are situations where the solver wants to convert a value to a
solver variable, and we can't allow that if the same value is also used
for a fixed register operand.

Fixes #221.
2018-01-17 15:01:00 -08:00
Jakob Stoklund Olesen
13af22b46b Track register pressure for dead EBB parameters.
The spiller wasn't tracking register pressure correctly for dead EBB
parameters in visit_ebb_header(). Make sure we free any dead EBB
parameters.

Fixes #223
2018-01-17 13:19:08 -08:00
Jakob Stoklund Olesen
d1f236b00a Reimplement coalescer following the Budimlic paper.
The old coalescing algorithm had some algorithmic complexity issues when
dealing with large virtual registers. Reimplement to use a proper
union-find algorithm so we only need one pass through the dominator
forests for virtual registers that are interference free.

Virtual registers that do have interference are split and new registers
built.

This pass is about twice as fast as the old one when dealing with
complex virtual registers.
2018-01-16 12:32:04 -08:00
Jakob Stoklund Olesen
cacba1a58f Don't allow EBB parameters to be ghost values.
Ghost instructions and values are supposed to be stored as metadata
alongside the compiled program such that the ghost values can be
computed from the real register/stack values when the program is stopped
for debugging or de-optimization.

If we allow an EBB parameter to be a ghost value, we have no way of
computing its real value using ghost instructions. We would need to know
a complete execution trace of the stopped program to figure out which
values were passed to the ghost parameter.

Instead we require EBB parameters to be real values materialized in
registers or on the stack. We use the regclass_for_abi_type() TargetIsa
callback to determine the initial register class for these parameters.
They can then be spilled later if needed.

Fixes #215.
2018-01-11 16:48:02 -08:00
Jakob Stoklund Olesen
febe8e0e51 Allow spilling of EBB arguments.
When the spiller needs to make a register available for a conditional
branch instruction, it can be necessary to spill some of the EBB
arguments on the branch instruction. This is ok because EBB argument
values belong to the same virtual register as the corresponding EBB
parameter and we spill the whole virtreg to the same slot.

Also make sure free_regs() can handle values that are killed by the
current instruction *and* spilled.
2017-12-14 13:57:13 -06:00
Jakob Stoklund Olesen
d617d5e0f3 Use a domtree pre-order instead of a CFG RPO for coalescing.
The stack implementation if the Budimlic dominator forest doesn't work
correctly with a CFG RPO. It needs the domtree pre-order.

Also handle EBB pre-order vs inst-level preorder. Manage the stack
according to EBB dominance. Look for a dominating value by searching the
stack. This is different from the Budimlic algorithm because we're
computing the dominator tree pre-order with EBB granularity only.

Fixes #207.
2017-12-13 16:22:01 -06:00
Jakob Stoklund Olesen
a825427786 Avoid reloading spilled EBB arguments.
The coalescer makes sure that matching EBB arguments and parameters are
always in the same virtual registers, and therefore also in the same
stack slot if they are spilled.

This means that the reload pass should never rewrite an EBB argument if
the argument value is spilled. This comes up in cases where the branch
instruction needs the same value in a register:

    brnz v9, ebb3(v9)

If the virtual register containing v9 is spilled, the branch instruction
must be reloaded like:

    v52 = fill v9
    brnz v52, ebb3(v9)

The branch register argument must be rewritten, and the EBB argument
must be referring to the original stack value.

Fixes #208.
2017-12-13 15:22:05 -06:00
Pat Hickey
b5601d57c8 filetests: change hex function names to user function numbers 2017-11-23 14:08:47 -08:00
Jakob Stoklund Olesen
91b1566aca Use "test regalloc" for the register allocator tests.
These tests were only using "test compile" because it doesn't require
any filecheck directives to be present, so just stop requiring filecheck
directives for "test regalloc" and other filecheck-based test drivers.
2017-10-25 18:31:14 -07:00
Jakob Stoklund Olesen
d37126565e Also consider fixed outputs for replace_global_defines.
Fixes #178.

When an instruction with a fixed output operand defines a globally live
SSA value, we need to check if the fixed register is available in the
`regs.global` set of registers that can be used across EBB boundaries.

If the fixed output register is not available in regs.global, set the
replace_global_defines flag so the output operands are rewritten as
local values.
2017-10-25 14:28:30 -07:00
Jakob Stoklund Olesen
e8ecf1f809 Add a FixedTied constraint kind for operand constraints.
Fixes #175.

The Intel division instructions have fixed input operands that are
clobbered by fixed output operands, so the value passed as an input will
be clobbered just like a tied operand.

The FixedTied operand constraint is used to indicate a fixed input
operand that has a corresponding output operand with the same fixed
register.

Teach the spiller to teach a FixedTied operand the same as a Tied
operand constraint and make sure that the input value is killed by the
instruction.
2017-10-25 11:22:20 -07:00
Jakob Stoklund Olesen
994af598f5 Avoid interference on CFG edges.
Track allocatable registers both locally and globally: Add a second
AllocatableSet which tracks registers allocated to global values without
accounting for register diversions. Since diversions are only local to
an EBB, global values must be assigned un-diverted locations that don't
interfere.

Handle the third "global" interference domain in the constraint solver in
addition to the existing "input" and "output" domains.

Extend the solver error code to indicate when a global define just can't
be allocated because there are not enough available global registers.
Resolve this problem by replacing the instruction's global defines with
local defines that are copied into their global destinations
afterwards.
2017-10-11 15:38:30 -07:00
Jakob Stoklund Olesen
90ed698e83 Add an unreachable code elimination pass.
The register allocator doesn't even try to compile unreachable EBBs, so
any values defined in such blocks won't be assigned registers.

Since the dominator tree already has determined which EBBs are
reachable, we should just eliminate any unreachable blocks instead o
trying to do something with the dead code.

Not that this is not a "dead code elimination" pass which would also
remove individual instructions whose results are not used.
2017-10-09 15:26:27 -07:00
Jakob Stoklund Olesen
b3fa47cacc Add support for emergency spill slots.
- Create a new kind of stack slot: emergency_slot.
- Add a get_emergency_slot() method which finds a suitable emergency
  slot given a list of slots already in use.
- Use emergency spill slots when schedule_moves needs them.
2017-10-06 10:45:13 -07:00
Jakob Stoklund Olesen
fb0999ce33 Check the top-level register class for available registers.
Fixes #165.

The constraint solver's schedule_move() function sometimes need to use
an extra available register when the moves to be scheduled contains
cycles.

The pending moves have associated register classes that come from the
constraint programming. Since the moves have hard-coded to and from
registers, these register classes are only meant to indicate the
register sizes. In particular, we can use the whole top-level register
class when scavenging for a spare register to break a cycle.
2017-10-03 14:12:18 -07:00
Jakob Stoklund Olesen
c091a695e6 Fix coalescer bug exposed by the gvn-unremovable-phi test.
When we detect interference between the values that have already been
merged into the candidate virtual register and an EBB argument, we first
try to resolve the conflict by splitting. We also check if the existing
interfering value is fundamentally incompatible with the branch
instruction so it needs to be removed from the virtual register,
restarting the merge operation.

However, this existing interfering value is not necessarily the only
interference, so the split is not guaranteed to resolve the conflict. If
it turns out that splitting didn't resolve the conflict, restart the
merge after removing this second conflicting value.
2017-10-03 11:13:46 -07:00
Jakob Stoklund Olesen
5f56f81251 Resolve all value aliases when computing live ranges.
Value aliases are only in the way during register allocation, so make
sure they are all dead as we enter the register allocation passes.
2017-09-29 15:54:06 -07:00
Jakob Stoklund Olesen
c82e68efea Eliminate the ABCD register class constaint in REX encodings.
Some REX-less encodings require an ABCD input because they are looking
at 8-bit registers. This constraint doesn't apply with a REX prefix
where the low 8 bits of all registers are addressable.
2017-09-29 15:29:25 -07:00
Jakob Stoklund Olesen
51a6901a7f Implement coloring::iterate_solution().
It can happen that the currently live registers are blocking a smaller
register class completely, so the only way of solving the allocation
problem is to turn some of the live-through registers into solver
variables.

When the quick_solve attempt fails, try to free up registers in the
critical register class by turning live-through values into solver
variables.
2017-09-29 14:55:35 -07:00
Jakob Stoklund Olesen
45888ab84e Reload for spilled call return values.
When the return value from a call has been spilled, the reload pass
needs to insert a spill instruction right after the call instruction
which returns its results in registers.
2017-09-29 11:25:38 -07:00
Jakob Stoklund Olesen
84471a8431 Add some very basic support for the Intel32 ABI.
In 32-bit mode, all function arguments are passed on the stack, not in
registers.

This ABI support is not complete or properly tested, but at least it
doesn't try to pass arguments in r8.
2017-09-27 12:55:34 -07:00
Jakob Stoklund Olesen
cc3707706c Write and parse value locations for EBB arguments
Fixes #56.

We now have complete support for value location annotations in the
textual IL format. Values defined by instructions as well as EBB
arguments are covered.
2017-09-15 11:21:29 -07:00
Jakob Stoklund Olesen
0deaa616a3 Record identity assignments in regalloc constraint solver.
Fixes #147.

The Solver::reassign_in() method would previously not record fixed
register assignments for values that are already in the correct
register. The register would simply be marked as unavailable for the
solver.

This did have the effect of tripping up the sanity checks in
Solver::add_var() when that method was called with such a "reassigned"
value. The function can be called for a value that already has a fixed
assignment, but the sanity checks want to make sure the variable
constraints are compatible with the existing fixed assignment. When no
such assignment could be found, the method panicked.

To fix this, make sure that even identity reassignments are recorded
in the assignments vector. Instead, filter the identity assignments out
before scheduling a move sequence for the assignments.

Also add some debug tracing to the regalloc solver.
2017-08-29 10:45:33 -07:00
Jakob Stoklund Olesen
c96d4daa20 Add a calling convention to all function signatures.
A CallConv enum on every function signature makes it possible to
generate calls to functions with different calling conventions within
the same ISA / within a single function.

The calling conventions also serve as a way of customizing Cretonne's
behavior when embedded inside a VM. As an example, the SpiderWASM
calling convention is used to compile WebAssembly functions that run
inside the SpiderMonkey virtual machine.

All function signatures must have a calling convention at the end, so
this changes the textual IL syntax.

Before:

    sig1 = signature(i32, f64) -> f64

After

    sig1 = (i32, f64) -> f64 native
    sig2 = (i32) spiderwasm

When printing functions, the signature goes after the return types:

    function %r1() -> i32, f32 spiderwasm {
    ebb1:
        ...
    }

In the parser, this calling convention is optional and defaults to
"native". This is mostly to avoid updating all the existing test cases
under filetests/. When printing a function, the calling convention is
always included, including for "native" functions.
2017-08-03 11:40:24 -07:00
Jakob Stoklund Olesen
924c4649cc Enforce encodings for instructions with side effects.
We allow ghost instructions to exist if they have no side effects.
Instructions that affect control flow or that have other side effects
must be encoded.

Teach the IL verifier to enforce this. Once any instruction has an
encoding, all instructions with side effects must have an encoding.
2017-07-12 09:41:25 -07:00
Jakob Stoklund Olesen
f0abff3611 Handle tied operands that are not killed by their use.
Any tied register uses are interesting enough to be added to the reguses
list if their value is not killed.

A copy needs to be inserted in that case.
2017-07-05 15:48:06 -07:00
Jakob Stoklund Olesen
64f6a98abe Test a tied operand following a fixed register operand.
The redefined tied value lives in the diverted register.
2017-07-05 15:48:06 -07:00
Jakob Stoklund Olesen
b7917fe404 Test two consecutive fixed operands.
We need to move the previous value out of the way first.
2017-07-05 12:21:58 -07:00
Jakob Stoklund Olesen
e7db3f2b3a Add a test with a fixed register constraint.
Make sure we use the diverted register location for tied operands.
2017-07-05 12:08:53 -07:00
Jakob Stoklund Olesen
0d2d1ea8cf Add support for tied operands.
Include a very basic test using an Intel 'sub' instruction. More to
follow.
2017-06-30 13:36:41 -07:00
Jakob Stoklund Olesen
18dc420352 Repair constraint violations during spilling.
The following constraints may need to be resolved during spilling
because the resolution increases register pressure:

- A tied operand whose value is live through the instruction.
- A fixed register constraint for a value used more than once.
- A register use of a spilled value needs to account for the reload
  register.
2017-06-29 16:51:05 -07:00
Jakob Stoklund Olesen
138d3c75c6 Spill live-ins and EBB arguments if there are too many. 2017-06-29 14:07:19 -07:00
Jakob Stoklund Olesen
588ef0ad2f Propagate affinities for EBB arguments.
A priory, an EBB argument value only gets an affinity if it is used
directly by a non-ghost instruction. A use by a branch passing arguments
to an EBB doesn't count.

When an EBB argument value does have an affinity, the values passed by
all the predecessors must also have affinities. This can cause EBB
argument values to get affinities recursively.

- Add a second pass to the liveness computation for propagating EBB
  argument affinities, possibly recursively.
- Verify EBB argument affinities correctly: A value passed to a branch
  must have an affinity only if the corresponding EBB argument value in
  the destination has an affinity.
2017-06-29 10:30:26 -07:00
Jakob Stoklund Olesen
e7a543ea33 Make sure return values are assigned an affinity.
When an EBB argument value is used only as a return value, it still
needs to be given a register affinity. Otherwise it would appear as a
ghost value with no affinity.

Do the same to call arguments.
2017-06-29 09:24:05 -07:00
Jakob Stoklund Olesen
0574dcdeee Don't coalesce incoming stack arguments.
A function parameter in an incoming_arg stack slot should not be
coalesced into any virtual registers. We don't want to force the whole
virtual register to spill to the incoming_arg slot.
2017-06-28 15:37:38 -07:00
Jakob Stoklund Olesen
b2fda76c5f Assign stack slots to incoming function arguments.
Function arguments that don't fit in registers are passed on the stack.

Create "incoming_arg" stack slots representing the stack arguments, and
assign them to the value arguments during spilling.
2017-06-28 15:03:59 -07:00