Commit Graph

1293 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
1e2b7de141 Remove dead code. 2018-01-16 12:34:32 -08:00
Jakob Stoklund Olesen
d1f236b00a Reimplement coalescer following the Budimlic paper.
The old coalescing algorithm had some algorithmic complexity issues when
dealing with large virtual registers. Reimplement to use a proper
union-find algorithm so we only need one pass through the dominator
forests for virtual registers that are interference free.

Virtual registers that do have interference are split and new registers
built.

This pass is about twice as fast as the old one when dealing with
complex virtual registers.
2018-01-16 12:32:04 -08:00
Jakob Stoklund Olesen
16ac4f65b3 Add support for textbook union-find to VirtRegs.
The initial phase of computing virtual registers can now be implemented
with a textbook union-find algorithm using a disjoint set forest
complete with rank and path compression optimizations.

The disjoint set forest is converted to virtual register value lists in
a single linear scan implemented in finish_union_find().

This union-find algorithm will soon be used by the coalescer.
2018-01-16 12:32:04 -08:00
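For readers unfamiliar with the technique named in this commit, here is a minimal sketch of a disjoint set forest with union by rank and path compression, followed by a single linear scan that turns the forest into value lists. The `DisjointSets` type and its methods are hypothetical names chosen for illustration; this is not the code behind finish_union_find(), and plain `usize` indices stand in for value references.

```rust
/// Disjoint set forest over dense indices `0..n`.
/// Hypothetical illustration; not Cretonne's VirtRegs implementation.
pub struct DisjointSets {
    parent: Vec<usize>,
    rank: Vec<u8>,
}

impl DisjointSets {
    /// Start with every element in its own singleton set.
    pub fn new(n: usize) -> Self {
        DisjointSets {
            parent: (0..n).collect(),
            rank: vec![0; n],
        }
    }

    /// Find the representative of `x`, compressing the path behind it.
    pub fn find(&mut self, x: usize) -> usize {
        // First pass: walk up to the root.
        let mut root = x;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        // Second pass: point every node on the path directly at the root.
        let mut cur = x;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    /// Merge the sets containing `a` and `b`, attaching the lower-rank root
    /// below the higher-rank one.
    pub fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra == rb {
            return;
        }
        if self.rank[ra] < self.rank[rb] {
            self.parent[ra] = rb;
        } else {
            self.parent[rb] = ra;
            if self.rank[ra] == self.rank[rb] {
                self.rank[ra] += 1;
            }
        }
    }

    /// Convert the forest to member lists in one scan over the elements.
    /// Singleton sets are dropped in this sketch.
    pub fn finish(mut self) -> Vec<Vec<usize>> {
        let n = self.parent.len();
        let mut groups: Vec<Vec<usize>> = vec![Vec::new(); n];
        for x in 0..n {
            let r = self.find(x);
            groups[r].push(x);
        }
        groups.into_iter().filter(|g| g.len() > 1).collect()
    }
}
```

Because elements are visited in increasing index order, each resulting list is sorted by index. Dropping singletons is a choice made for this sketch only.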
Jakob Stoklund Olesen
ce4cc8ce12 Fix the handling of special types in type variables.
- Allow the syntax "specials=True" to indicate that a type variable can
  assume all special types. Use this for the unconstrained type variable
  created in ast.py.
- Fix TypeSet.copy() to avoid deepcopy() which doesn't do the right
  thing for the self.specials set.
- Fix TypeSet.typeset_key() to just use the name of special types
  instead of the full SpecialType objects.
2018-01-16 10:29:31 -08:00
Jakob Stoklund Olesen
85aab278dd Add RISC-V encodings for b1 copy/spill/fill.
We allow b1 values in general purpose registers, so we need to be able
to move them around.
2018-01-16 09:19:22 -08:00
Jakob Stoklund Olesen
cacba1a58f Don't allow EBB parameters to be ghost values.
Ghost instructions and values are supposed to be stored as metadata
alongside the compiled program such that the ghost values can be
computed from the real register/stack values when the program is stopped
for debugging or de-optimization.

If we allow an EBB parameter to be a ghost value, we have no way of
computing its real value using ghost instructions. We would need to know
a complete execution trace of the stopped program to figure out which
values were passed to the ghost parameter.

Instead we require EBB parameters to be real values materialized in
registers or on the stack. We use the regclass_for_abi_type() TargetIsa
callback to determine the initial register class for these parameters.
They can then be spilled later if needed.

Fixes #215.
2018-01-11 16:48:02 -08:00
Jakob Stoklund Olesen
5e094034d4 Fix verifier bug in unreachable code.
We want to disable dominance checks in unreachable code. The
is_reachable() check for EBB parameter values was checking if the
defining EBB was reachable, not the EBB using the value.

This bug showed up in fuzzing and in #213.
2018-01-09 10:47:49 -08:00
Jakob Stoklund Olesen
af89006b09 Fix some markdown issues.
Work around some cases where the old markdown parser differs from the
new Pulldown parser for the documentation.
2018-01-08 16:19:16 -08:00
Jakob Stoklund Olesen
4afa19ddff Fix some mypy errors.
It looks like mypy 0.560 doesn't like when a local variable changes its
type inside a function.

The fixes introduce a new variable instead of reusing an existing one.
2018-01-03 12:13:13 -08:00
Dan Gohman
4f53cc1dad Align IntelGOTPCRel4 with R_X86_64_GOTPCREL.
Add an addend field to reloc_external, and use it to move the
responsibility for accounting for the difference between the end of an
instruction (where the PC is considered to be in PC-relative addressing on Intel)
and the beginning of the immediate field into the encoding code.

Specifically, this makes IntelGOTPCRel4 directly correspond to
R_X86_64_GOTPCREL, instead of also carrying an implicit `- 4`.
2017-12-15 16:17:32 -06:00
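The arithmetic behind the `- 4` can be sketched as follows. The ELF x86-64 relocation R_X86_64_GOTPCREL stores G + GOT + A - P into a 32-bit field, where P is the address of the field itself and A is the addend. The processor, however, adds the displacement to the address of the end of the instruction. When the 4-byte field is the last part of the instruction, an addend of -4 bridges the two. The function name and the addresses in the usage note are made up for illustration.

```rust
/// Value stored in a 32-bit PC-relative field.
/// `got_entry` is the address of the GOT slot (G + GOT), `field_addr` is the
/// address of the 4-byte immediate being patched (P), and `addend` is A.
/// Hypothetical helper; not part of Cretonne's binemit module.
pub fn gotpcrel_field(got_entry: u64, field_addr: u64, addend: i64) -> i32 {
    (got_entry as i64 + addend - field_addr as i64) as i32
}

/// Address the processor computes for a RIP-relative operand: the address of
/// the next instruction plus the sign-extended displacement.
pub fn rip_relative_target(next_inst_addr: u64, disp: i32) -> u64 {
    (next_inst_addr as i64 + i64::from(disp)) as u64
}
```

For example, take a 7-byte instruction at 0x1000 whose immediate starts at 0x1003 and a GOT slot at 0x2000. With an addend of -4 the field holds 0xff9, and 0x1007 + 0xff9 lands back on 0x2000.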
Dan Gohman
76e31cc1ad Rename GotPCRel4 to GOTPCRel4.
This emphasizes that GOT is being used as an abbreviation rather than
the word "got".
2017-12-15 16:17:32 -06:00
Jakob Stoklund Olesen
fc857247e4 Fix overlaps_def for dead live ranges.
A dead live range ends at the same point it is defined, but it is still
considered to overlap a def at the same program point.
2017-12-14 17:16:19 -06:00
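A small model of the rule described in this commit, under stated assumptions: program points are plain integers, a live range covers the points from its def up to but not including its end, and a dead range has `end == def`. The `LocalRange` type is hypothetical and only mirrors the wording of the message, not the real LiveRange code.

```rust
/// Simplified live range inside one EBB. Hypothetical model.
pub struct LocalRange {
    pub def: u32,
    pub end: u32,
}

impl LocalRange {
    /// A dead range ends at the same point where it is defined.
    pub fn is_dead(&self) -> bool {
        self.end == self.def
    }

    /// Would a new definition at `point` conflict with this range?
    pub fn overlaps_def(&self, point: u32) -> bool {
        // A def at the range's own def point always conflicts, even when the
        // range is dead and therefore otherwise empty.
        // Assumption: a def at `end` does not conflict, because the last use
        // is taken to happen before the defs of the same instruction.
        point == self.def || (self.def < point && point < self.end)
    }
}
```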
Jakob Stoklund Olesen
66073eb26c Better verifier error for coinciding defs.
If a virtual register contains values that are defined at the same program
point, say so. Don't cryptically claim that one dominates the other.
2017-12-14 17:04:16 -06:00
Jakob Stoklund Olesen
febe8e0e51 Allow spilling of EBB arguments.
When the spiller needs to make a register available for a conditional
branch instruction, it can be necessary to spill some of the EBB
arguments on the branch instruction. This is ok because EBB argument
values belong to the same virtual register as the corresponding EBB
parameter and we spill the whole virtreg to the same slot.

Also make sure free_regs() can handle values that are killed by the
current instruction *and* spilled.
2017-12-14 13:57:13 -06:00
Jakob Stoklund Olesen
d617d5e0f3 Use a domtree pre-order instead of a CFG RPO for coalescing.
The stack implementation of the Budimlic dominator forest doesn't work
correctly with a CFG RPO. It needs the domtree pre-order.

Also handle EBB pre-order vs inst-level pre-order. Manage the stack
according to EBB dominance. Look for a dominating value by searching the
stack. This is different from the Budimlic algorithm because we're
computing the dominator tree pre-order with EBB granularity only.

Fixes #207.
2017-12-13 16:22:01 -06:00
Jakob Stoklund Olesen
2473661d49 Loosen the required order of values in a virtual register.
Instead of requiring the values in a virtual register to be sorted
according to the domtree.rpo_cmp() order, just require any topological
ordering w.r.t. dominance.

The coalescer will stop using the RPO shortly.
2017-12-13 15:25:21 -06:00
Jakob Stoklund Olesen
a825427786 Avoid reloading spilled EBB arguments.
The coalescer makes sure that matching EBB arguments and parameters are
always in the same virtual registers, and therefore also in the same
stack slot if they are spilled.

This means that the reload pass should never rewrite an EBB argument if
the argument value is spilled. This comes up in cases where the branch
instruction needs the same value in a register:

    brnz v9, ebb3(v9)

If the virtual register containing v9 is spilled, the branch instruction
must be reloaded like:

    v52 = fill v9
    brnz v52, ebb3(v9)

The branch register argument must be rewritten, and the EBB argument
must be referring to the original stack value.

Fixes #208.
2017-12-13 15:22:05 -06:00
Pat Hickey
d444044e9e intel isa: comments to explain rip-relative addressing encoding 2017-12-12 19:29:52 -08:00
Pat Hickey
6d44debc18 intel: add PIC variants to recipes and encodings 2017-12-12 19:29:52 -08:00
Pat Hickey
5834520bfe binemit: add PIC relocation types for Intel 2017-12-12 19:29:52 -08:00
Pat Hickey
90bc798e4f settings: add "is_pic" boolean setting to base 2017-12-12 19:29:52 -08:00
Pat Hickey
88b30ff386 refactor Reloc to an enum of every architecture's reloc types
https://github.com/stoklund/cretonne/pull/206#issuecomment-350905016
2017-12-12 13:57:10 -08:00
Jakob Stoklund Olesen
a888b2a6f1 Dominator tree pre-order.
Add a DominatorTreePreorder data structure which can be initialized for
a DominatorTree and used for queries involving a pre-order of the
dominator tree.

Print out the pre-order and send it through filecheck in "test domtree"
file tests.
2017-12-08 17:43:15 -08:00
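One common way to answer dominance queries from a pre-order is to number the nodes of the dominator tree in pre-order and record, for each node, the largest number found in its subtree. The sketch below does that for a tree given as an array of immediate dominators. The `Preorder` type, its fields, and the `idom` input format are assumptions made for this illustration; the commit does not describe DominatorTreePreorder at this level of detail.

```rust
/// Pre-order numbering of a dominator tree. Hypothetical illustration.
pub struct Preorder {
    /// Pre-order number of each node.
    num: Vec<u32>,
    /// Largest pre-order number in each node's subtree.
    last: Vec<u32>,
}

impl Preorder {
    /// `idom[n]` is the immediate dominator of node `n`, or `None` for a root.
    pub fn compute(idom: &[Option<usize>]) -> Self {
        let n = idom.len();
        let mut children: Vec<Vec<usize>> = vec![Vec::new(); n];
        let mut roots = Vec::new();
        for (node, parent) in idom.iter().enumerate() {
            match *parent {
                Some(p) => children[p].push(node),
                None => roots.push(node),
            }
        }

        let mut num = vec![0u32; n];
        let mut last = vec![0u32; n];
        let mut next = 0u32;
        // Iterative depth-first walk; each stack entry is (node, next child).
        let mut stack: Vec<(usize, usize)> = Vec::new();
        for root in roots {
            num[root] = next;
            next += 1;
            stack.push((root, 0));
            while let Some(top) = stack.last_mut() {
                let (node, i) = *top;
                if i < children[node].len() {
                    top.1 += 1;
                    let child = children[node][i];
                    num[child] = next;
                    next += 1;
                    stack.push((child, 0));
                } else {
                    // All children numbered: close this node's interval.
                    last[node] = next - 1;
                    stack.pop();
                }
            }
        }
        Preorder { num, last }
    }

    /// Pre-order number of `node`.
    pub fn number(&self, node: usize) -> u32 {
        self.num[node]
    }

    /// `a` dominates `b` when `b` is numbered inside `a`'s subtree interval.
    pub fn dominates(&self, a: usize, b: usize) -> bool {
        self.num[a] <= self.num[b] && self.num[b] <= self.last[a]
    }
}
```

Sorting values by `number()` gives a topological order with respect to dominance, since a dominator is always numbered before the nodes it dominates.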
Jakob Stoklund Olesen
a7eb13a151 Expand unknown instructions to runtime library calls. 2017-12-08 10:37:50 -08:00
Jakob Stoklund Olesen
f03729d742 Fix generated code for ISA predicates on encoding recipes.
The generated code had syntax errors and inverted logic.

Add an SSE 4.1 requirement to the floating point rounding instructions.
2017-12-08 10:37:50 -08:00
Jakob Stoklund Olesen
362a4bdc4c Add well-known names for runtime library functions.
Add a LibCall type which represents runtime library functions that may
be synthesized by Cretonne from pure instructions.

Add a LibCall variant to ExternalName to represent one of these runtime
functions.
2017-12-07 17:50:22 -08:00
Jakob Stoklund Olesen
60c456c1ec Add a compilation pass timing facility.
Individual compilation passes call the corresponding timing::*()
function and hold on to their timing token while they run. This causes
nested per-pass timing information to be recorded in thread-local
storage.

The --time-passes command line option prints a pass timing report to
stdout.
2017-12-06 17:04:23 -08:00
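The token pattern described here can be sketched with a guard value whose `Drop` implementation records the elapsed time in thread-local storage. Everything named below (`start_pass`, `TimingToken`, `take_report`, the report layout) is hypothetical; the real timing::*() functions and the --time-passes report format are not shown in this log.

```rust
use std::cell::{Cell, RefCell};
use std::time::{Duration, Instant};

thread_local! {
    /// Finished passes as (name, nesting depth, elapsed time).
    static LOG: RefCell<Vec<(&'static str, usize, Duration)>> = RefCell::new(Vec::new());
    /// Number of passes currently running on this thread.
    static DEPTH: Cell<usize> = Cell::new(0);
}

/// Held by a pass while it runs; records the timing when dropped.
pub struct TimingToken {
    name: &'static str,
    depth: usize,
    start: Instant,
}

/// Start timing a pass. Hypothetical stand-in for a per-pass timing function.
pub fn start_pass(name: &'static str) -> TimingToken {
    let depth = DEPTH.with(|d| {
        let current = d.get();
        d.set(current + 1);
        current
    });
    TimingToken {
        name,
        depth,
        start: Instant::now(),
    }
}

impl Drop for TimingToken {
    fn drop(&mut self) {
        let elapsed = self.start.elapsed();
        // Restore the nesting depth that was current when this pass started.
        DEPTH.with(|d| d.set(self.depth));
        LOG.with(|l| l.borrow_mut().push((self.name, self.depth, elapsed)));
    }
}

/// Remove and return everything recorded on this thread so far.
pub fn take_report() -> Vec<(&'static str, usize, Duration)> {
    LOG.with(|l| std::mem::take(&mut *l.borrow_mut()))
}
```

A pass would write `let _tt = start_pass("coalescing");` as its first statement. Inner passes finish first, so they appear before their enclosing pass in the recorded list.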
Jakob Stoklund Olesen
feaea238bc Use bforest::Map for representing live ranges.
Get rid of the per-value Vec in the LiveRange data type and use a
bforest::Map instead to represent the live-in intervals for non-local
live ranges.

This has some advantages:

- The memory footprint of a local live range is reduced from 40 to 20
  bytes, and
- Clearing the Liveness data structure is now a constant time operation
  which doesn't call free().
- The potentially quadratic behavior when computing large live ranges is
  controlled by the logarithmic B-tree operations.
2017-12-06 14:14:21 -08:00
Jakob Stoklund Olesen
27d5543adc Make LiveRange a type alias for GenLiveRange<Layout>.
This makes the whole LiveRange generic over the program order instead of
having a number of methods that are individually program order-generic.
This makes it possible to have data members that depend on the program
order, as we will shortly.

This also gives us stronger type checking on the public LiveRange
methods which now require a Layout argument, not just any program order.
2017-12-06 13:53:24 -08:00
Jakob Stoklund Olesen
f106e4266a Enable the IL verifier by default.
Change the default value for the "enable_verifier" setting so the
verifier runs unless it is explicitly disabled.

Most projects using Cretonne are best off running the verifier always
until they start caring about compile time performance. Then they can
easily disable the verifier.
2017-12-06 08:30:48 -08:00
Jakob Stoklund Olesen
c09ad06f96 Stop generating reserved_reg heaps in DummyEnvironment.
The reserved register heaps are not implemented in the Cretonne
legalizer, so IR generated by the dummy environment would trip
assertions when compiled.

Use a heap with a vmctx base address instead, and also demonstrate how
vmctx arguments are added to all signatures to achieve this.
2017-12-05 16:31:10 -08:00
Jakob Stoklund Olesen
b8fe6bf0f5 Add a MapCursor::value_mut() method.
It's ok to alter a value stored in a map, but not the keys.
2017-12-05 15:07:28 -08:00
Jakob Stoklund Olesen
c64428b698 Add a Map::get_or_less() method.
Find the largest (k,v) pair with k <= key.
2017-12-05 15:07:28 -08:00
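The query itself is easy to illustrate with the standard library. The sketch below uses `std::collections::BTreeMap` as a stand-in for bforest::Map, which has a different API; the free function and the `u32` key type are assumptions for this example.

```rust
use std::collections::BTreeMap;

/// Find the largest (k, v) pair with k <= key, if there is one.
/// Stand-in built on std's BTreeMap; not the bforest::Map method itself.
pub fn get_or_less<V>(map: &BTreeMap<u32, V>, key: u32) -> Option<(u32, &V)> {
    // Restrict the view to keys <= key, then take the last entry.
    map.range(..=key).next_back().map(|(&k, v)| (k, v))
}
```

This kind of lookup is useful for interval maps keyed by start point: the entry with the largest start not exceeding a program point is the only interval that can contain it.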
Tyler McMullen
7988d0c54c Add 8-bit variation of adjust_sp_imm for 32-bit and 64-bit Intel. 2017-12-05 11:49:12 -08:00
Tyler McMullen
3b937f5917 Add separate spiderwasm prologue/epilogue to intel's abi.rs 2017-12-05 11:49:12 -08:00
Tyler McMullen
5783ea2c9a Account for return address when reserving stack space for CSRs. 2017-12-05 11:49:12 -08:00
Tyler McMullen
a75248d2cf Move the initial stack pointer adjustment to after the CSR pushes. 2017-12-05 11:49:12 -08:00
Tyler McMullen
ebcbd54f61 Add 'compile' test and confirm the pro/epilogue is added. Fix regression this revealed. 2017-12-05 11:49:12 -08:00
Tyler McMullen
694658b949 Move entirety of prologue_epilogue logic to abi module. 2017-12-05 11:49:12 -08:00
Tyler McMullen
0fb59dc589 Fix the ordering of return values. 2017-12-05 11:49:12 -08:00
Tyler McMullen
c156eb9ff7 Refactor prologue_epilogue. Break out into functions. Remove Vecs. 2017-12-05 11:49:12 -08:00
Tyler McMullen
c78a191294 Use layout.last_inst to find 'return' opcodes, rather than iterating. 2017-12-05 11:49:12 -08:00
Tyler McMullen
66eccb7859 Use opcode's is_return() rather than pattern-matching. 2017-12-05 11:49:12 -08:00
Tyler McMullen
a26d438b30 Use returned Value from append_ebb_param in prologue_epilogue. 2017-12-05 11:49:12 -08:00
Tyler McMullen
ced39f5186 Fix up adjust_sp_imm instruction.
* Use imm64 rather than offset32
* Add predicate to enforce signed 32-bit limit to imm
* Remove AdjustSpImm format
* Add encoding tests for adjust_sp_imm
* Adjust use of adjust_sp_imm in Intel prologue_epilogue to match
2017-12-05 11:49:12 -08:00
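The "signed 32-bit limit" on a 64-bit immediate amounts to a range check like the one below. The function name is hypothetical, and the way Cretonne expresses instruction predicates in its meta language is not shown here.

```rust
/// True when a 64-bit immediate can be encoded as a signed 32-bit value.
/// Hypothetical helper illustrating the range check only.
pub fn fits_in_i32(imm: i64) -> bool {
    imm >= i64::from(i32::MIN) && imm <= i64::from(i32::MAX)
}
```

A stack adjustment of -8 passes the check, while 1 << 31 does not, because the largest signed 32-bit value is (1 << 31) - 1.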
Tyler McMullen
1a11c351b5 Add tests and documentation for x86_(push|pop). Fix up encoding issues revealed by tests. 2017-12-05 11:49:12 -08:00
Tyler McMullen
3b1b33e0ac Add docs and tests for copy_special instruction. Fixes encoding issue that tests revealed. 2017-12-05 11:49:12 -08:00
Tyler McMullen
4eb9a54096 Convert x86_(push|pop) operations to be explicitly limited to 32-bit and 64-bit values. 2017-12-05 11:49:12 -08:00
Tyler McMullen
2f3edc1bc6 Fix issue in which CSR returns were incorrectly ordered. 2017-12-05 11:49:12 -08:00
Tyler McMullen
6ec4bfc4ca Fix up the encodings for new instructions, both expected and actual. Make the test more accurate. 2017-12-05 11:49:12 -08:00