715 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
3eeef1c752 Add some missing instructions to the language reference. 2018-02-06 09:55:53 -08:00
Julian Seward
6f8a54b6a5 Adds support for legalizing CLZ, CTZ and POPCOUNT on baseline x86_64 targets.
Changes:

* Adds a new generic instruction, SELECTIF, that does value selection (a la
  conditional move) similarly to existing SELECT, except that it is
  controlled by condition code input and flags-register inputs.

* Adds a new Intel x86_64 variant, 'baseline', that supports SSE2 and
  nothing else.

* Adds new Intel x86_64 instructions BSR and BSF.

* Implements generic CLZ, CTZ and POPCOUNT on x86_64 'baseline' targets
  using the new BSR, BSF and SELECTIF instructions.

* Implements SELECTIF on x86_64 targets using conditional-moves.

* new test filetests/isa/intel/baseline_clz_ctz_popcount.cton
  (for legalization)

* new test filetests/isa/intel/baseline_clz_ctz_popcount_encoding.cton
  (for encoding)

* Allow lib/cretonne/meta/gen_legalizer.py to generate non-snake-caseified
  Rust without rustc complaining.

Fixes #238.
2018-02-06 09:43:00 -08:00
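As a rough illustration of the legalization described in this commit, the sketch below models how CLZ and CTZ can be built from a bit scan plus a flag-controlled select. The function names are hypothetical, and the exact expansion Cretonne emits may differ; the point is only that BSR/BSF leave the result undefined for a zero input, so a SELECTIF-style select has to supply that case.

```rust
/// Models x86 BSR: index of the highest set bit. `None` stands for the
/// zero-input case, where the real instruction sets ZF and leaves the
/// destination undefined.
fn bsr(x: u64) -> Option<u32> {
    (0..64).rev().find(|&i| (x >> i) & 1 == 1)
}

/// Models x86 BSF: index of the lowest set bit, `None` for a zero input.
fn bsf(x: u64) -> Option<u32> {
    (0..64).find(|&i| (x >> i) & 1 == 1)
}

/// CLZ as "63 - BSR", with the zero case chosen by a select on the flag
/// (the role SELECTIF plays in the commit).
fn clz64(x: u64) -> u32 {
    match bsr(x) {
        Some(index) => 63 - index,
        None => 64,
    }
}

/// CTZ as BSF, with the zero case chosen by a select on the flag.
fn ctz64(x: u64) -> u32 {
    match bsf(x) {
        Some(index) => index,
        None => 64,
    }
}
```

For example, `clz64(1)` is 63 and `ctz64(8)` is 3, while both functions return 64 for a zero input.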
Jakob Stoklund Olesen
e3714ddd10 Add a func.inst_offsets() iterator.
This Function method can be used after the final code layout has been
computed. It returns all the instructions in an EBB along with their
encoded size and offset from the beginning of the function.

This is useful for extracting additional metadata about trapping
instructions and other things that may be needed by a VM.
2018-02-01 17:19:05 -08:00
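A minimal sketch of the idea behind such an iterator, assuming the encoded size of each instruction is already known. The free function and the tuple layout here are hypothetical stand-ins, not the signature of the real func.inst_offsets() method.

```rust
/// Yields (instruction index, offset from the start, encoded size) for a
/// sequence of encoded instruction sizes. Hypothetical stand-in for the
/// method described in the commit message.
fn inst_offsets<'a>(sizes: &'a [u32]) -> impl Iterator<Item = (usize, u32, u32)> + 'a {
    let mut offset = 0u32;
    sizes.iter().enumerate().map(move |(inst, &size)| {
        let start = offset;
        // The next instruction begins where this one ends.
        offset += size;
        (inst, start, size)
    })
}
```

A VM could walk this sequence to record the code offsets of trapping instructions.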
Jakob Stoklund Olesen
584a33bca7 Give better error messages in "test binemit".
When an instruction can't be encoded, provide a bit more help:

- Detect missing register assignments for input and output operands.
- List encodings that were considered and rejected.
2018-01-29 09:16:33 -08:00
Jakob Stoklund Olesen
ef2640d8a4 Add information about SpiderMonkey and rustc plans. 2018-01-25 15:48:28 -08:00
Jakob Stoklund Olesen
1bbc529ef9 Improve the variable ordering used by the coloring constraint solver.
The fuzzer bugs #219 and #227 are both cases where the register
allocator coloring pass "runs out of registers". What's really happening
is that the constraint solver failed to find a solution, even when one
existed.

Suppose we have three solver variables:

    v0(GPR, out, global)
    v1(GPR, in)
    v2(GPR, in, out)

And suppose registers %r0 and %r1 are available on both input and output
sides of the instruction, but only %r1 is available for global outputs.
A valid solution would be:

    v0 -> %r1
    v1 -> %r1
    v2 -> %r0

However, the solver would pick registers for the three values in
numerical order because v1 and v2 have the same domain size (=2). This
would assign v1 -> %r0 and then fail to find a free register for v2.

Fix this by prioritizing in+out variables over single-sided variables
even when their domains are equal. This means that v2 gets assigned a
register before v1, and it gets a chance to pick a register that is
still available on both in and out sides.

Also try to avoid depending on value numbers in the solver. These bugs
were hard to reproduce because a test case invariably would have
different value numbers, causing the solver to order its variables
differently and succeed. Throw in the previous solution and original
register assignments as tie breakers which are stable and not dependent
on value numbers.

This is still not a substitute for a proper solver search algorithm that
we will probably have to write eventually.

Fixes #219
Fixes #227
2018-01-19 13:31:26 -08:00
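The three-variable example in this commit message can be replayed with a small greedy model. This is a sketch under stated assumptions, not the real solver: register sets are bitmasks (bit n is %rn), each variable takes the lowest-numbered register still free on every side it uses, and all type and function names are hypothetical.

```rust
#[derive(Clone, Copy, Debug)]
struct Var {
    name: &'static str,
    is_input: bool,
    is_output: bool,
    is_global: bool,
}

/// Registers available on each side of the instruction, as bitmasks.
struct Avail {
    input: u8,
    output: u8,
    global: u8,
}

/// Registers a variable may use before any assignments are made.
fn domain(v: &Var, a: &Avail) -> u8 {
    let mut d = 0xffu8;
    if v.is_input { d &= a.input; }
    if v.is_output { d &= a.output; }
    if v.is_global { d &= a.global; }
    d
}

/// Greedy assignment. Variables are ordered by domain size; when
/// `prefer_two_sided` is set, in+out variables win ties.
fn solve(vars: &[Var], avail: &Avail, prefer_two_sided: bool) -> Option<Vec<(&'static str, u32)>> {
    let mut order: Vec<Var> = vars.to_vec();
    // The sort is stable, so remaining ties keep the original (numerical) order.
    order.sort_by_key(|v| {
        let size = domain(v, avail).count_ones();
        let one_sided = !(v.is_input && v.is_output);
        (size, prefer_two_sided && one_sided)
    });
    let mut free_in = avail.input;
    let mut free_out = avail.output;
    let mut result = Vec::new();
    for v in &order {
        let mut candidates = domain(v, avail);
        if v.is_input { candidates &= free_in; }
        if v.is_output { candidates &= free_out; }
        if candidates == 0 {
            return None; // "ran out of registers"
        }
        let reg = candidates.trailing_zeros();
        if v.is_input { free_in &= !(1u8 << reg); }
        if v.is_output { free_out &= !(1u8 << reg); }
        result.push((v.name, reg));
    }
    Some(result)
}
```

With v0, v1 and v2 as described above, `solve(.., false)` fails on v2, while `solve(.., true)` assigns v0 -> %r1, v2 -> %r0 and v1 -> %r1, which is the valid solution given in the message.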
Tyler McMullen
14e39db428 Add filetest for statically out-of-bound heap addresses. 2018-01-18 15:49:10 -08:00
Tyler McMullen
df210bfdea Fix the Intel x64 PIC 'call' test, adding correct addend. 2018-01-18 14:23:00 -08:00
Jakob Stoklund Olesen
1e49431804 Add test case from #216.
The error exposed by this test case no longer happens after the
coalescer was rewritten to follow the Budimlic paper. It's still a
good coalescer test.

Fixes #216 by including the test case.
2018-01-17 16:19:51 -08:00
Jakob Stoklund Olesen
dcad3fa339 Fix coloring bug with combined constraints and global values.
The Intel instruction "v1 = ushr v2, v2" will implicitly fix the output
register for v2 to %rcx because the output is tied to the first input
operand and the second input operand is fixed to %rcx.

Make sure we handle this transitive constraint when checking for
interference with the globally live registers.

Fixes #218
2018-01-17 15:51:08 -08:00
Jakob Stoklund Olesen
0a6500c99a Avoid making solver variables for fixed input constraints.
When the coloring pass sees an instruction with a fixed input register
constraint that is already satisfied, make sure to tell the solver
about it anyway.

There are situations where the solver wants to convert a value to a
solver variable, and we can't allow that if the same value is also used
for a fixed register operand.

Fixes #221.
2018-01-17 15:01:00 -08:00
Jakob Stoklund Olesen
13af22b46b Track register pressure for dead EBB parameters.
The spiller wasn't tracking register pressure correctly for dead EBB
parameters in visit_ebb_header(). Make sure we free any dead EBB
parameters.

Fixes #223
2018-01-17 13:19:08 -08:00
Jakob Stoklund Olesen
d1f236b00a Reimplement coalescer following the Budimlic paper.
The old coalescing algorithm had some algorithmic complexity issues when
dealing with large virtual registers. Reimplement to use a proper
union-find algorithm so we only need one pass through the dominator
forests for virtual registers that are interference free.

Virtual registers that do have interference are split and new registers
built.

This pass is about twice as fast as the old one when dealing with
complex virtual registers.
2018-01-16 12:32:04 -08:00
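For readers unfamiliar with the data structure named in this commit, here is a generic union-find (disjoint-set) sketch with path halving and union by rank. It illustrates the technique only; the type and its methods are hypothetical and do not reflect how the coalescer stores virtual registers.

```rust
/// Disjoint sets over the elements 0..n.
struct UnionFind {
    parent: Vec<usize>,
    rank: Vec<u8>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        UnionFind { parent: (0..n).collect(), rank: vec![0; n] }
    }

    /// Returns the representative of the set containing `x`, shortening
    /// the path to the root along the way (path halving).
    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent;
            x = grandparent;
        }
        x
    }

    /// Merges the sets containing `a` and `b` and returns the new root.
    fn merge(&mut self, a: usize, b: usize) -> usize {
        let ra = self.find(a);
        let rb = self.find(b);
        if ra == rb {
            return ra;
        }
        // Attach the shallower tree below the deeper one.
        let (root, child) = if self.rank[ra] >= self.rank[rb] { (ra, rb) } else { (rb, ra) };
        self.parent[child] = root;
        if self.rank[ra] == self.rank[rb] {
            self.rank[root] += 1;
        }
        root
    }
}
```

In a coalescer, the elements would be values and each set a candidate virtual register.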
Yury Delendik
567e570c02 Allow to print translated wasm file. 2018-01-12 13:12:50 -08:00
Jakob Stoklund Olesen
cacba1a58f Don't allow EBB parameters to be ghost values.
Ghost instructions and values are supposed to be stored as metadata
alongside the compiled program such that the ghost values can be
computed from the real register/stack values when the program is stopped
for debugging or de-optimization.

If we allow an EBB parameter to be a ghost value, we have no way of
computing its real value using ghost instructions. We would need to know
a complete execution trace of the stopped program to figure out which
values were passed to the ghost parameter.

Instead we require EBB parameters to be real values materialized in
registers or on the stack. We use the regclass_for_abi_type() TargetIsa
callback to determine the initial register class for these parameters.
They can then be spilled later if needed.

Fixes #215.
2018-01-11 16:48:02 -08:00
Jakob Stoklund Olesen
5e094034d4 Fix verifier bug in unreachable code.
We want to disable dominance checks in unreachable code. The
is_reachable() check for EBB parameter values was checking if the
defining EBB was reachable, not the EBB using the value.

This bug showed up in fuzzing and in #213.
2018-01-09 10:47:49 -08:00
Dan Gohman
4f53cc1dad Align IntelGOTPCRel4 with R_X86_64_GOTPCREL.
Add an addend field to reloc_external, and use it to move the
responsibility for accounting for the difference between the end of an
instruction (where the PC is considered to be for PC-relative addressing on Intel)
and the beginning of the immediate field into the encoding code.

Specifically, this makes IntelGOTPCRel4 directly correspond to
R_X86_64_GOTPCREL, instead of also carrying an implicit `- 4`.
2017-12-15 16:17:32 -06:00
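The arithmetic behind the explicit addend can be sketched as follows. The function names are hypothetical, and the sketch assumes the 4-byte field is the last part of the instruction, so the PC used by the CPU is the field address plus 4.

```rust
/// Value stored in a 4-byte PC-relative field: target + addend - field
/// address, following the usual "S + A - P" relocation formula.
fn pcrel4(target: u64, addend: i64, field_addr: u64) -> i32 {
    (target as i64 + addend - field_addr as i64) as i32
}

/// Address the CPU computes: the PC at the end of the instruction plus
/// the stored displacement.
fn resolve(next_ip: u64, disp: i32) -> u64 {
    (next_ip as i64 + i64::from(disp)) as u64
}
```

With the field at 0x1000, the instruction ending at 0x1004 and the target at 0x2000, an addend of -4 gives a displacement of 0xffc, and 0x1004 + 0xffc is 0x2000 again. Making the `- 4` an explicit addend rather than an implicit part of the relocation type is what the commit describes.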
Dan Gohman
76e31cc1ad Rename GotPCRel4 to GOTPCRel4.
This emphasizes that GOT is being used as an abbreviation rather than
the word "got".
2017-12-15 16:17:32 -06:00
Jakob Stoklund Olesen
febe8e0e51 Allow spilling of EBB arguments.
When the spiller needs to make a register available for a conditional
branch instruction, it can be necessary to spill some of the EBB
arguments on the branch instruction. This is ok because EBB argument
values belong to the same virtual register as the corresponding EBB
parameter and we spill the whole virtreg to the same slot.

Also make sure free_regs() can handle values that are killed by the
current instruction *and* spilled.
2017-12-14 13:57:13 -06:00
Jakob Stoklund Olesen
d617d5e0f3 Use a domtree pre-order instead of a CFG RPO for coalescing.
The stack implementation of the Budimlic dominator forest doesn't work
correctly with a CFG RPO. It needs the domtree pre-order.

Also handle EBB pre-order vs inst-level preorder. Manage the stack
according to EBB dominance. Look for a dominating value by searching the
stack. This is different from the Budimlic algorithm because we're
computing the dominator tree pre-order with EBB granularity only.

Fixes #207.
2017-12-13 16:22:01 -06:00
Jakob Stoklund Olesen
a825427786 Avoid reloading spilled EBB arguments.
The coalescer makes sure that matching EBB arguments and parameters are
always in the same virtual registers, and therefore also in the same
stack slot if they are spilled.

This means that the reload pass should never rewrite an EBB argument if
the argument value is spilled. This comes up in cases where the branch
instruction needs the same value in a register:

    brnz v9, ebb3(v9)

If the virtual register containing v9 is spilled, the branch instruction
must be reloaded like:

    v52 = fill v9
    brnz v52, ebb3(v9)

The branch register argument must be rewritten, and the EBB argument
must be referring to the original stack value.

Fixes #208.
2017-12-13 15:22:05 -06:00
Pat Hickey
ed81bc21be filetests: add filetests for intel PIC encodings 2017-12-12 19:29:52 -08:00
Pat Hickey
88b30ff386 refactor Reloc to an enum of every architecture's reloc types
https://github.com/stoklund/cretonne/pull/206#issuecomment-350905016
2017-12-12 13:57:10 -08:00
Jakob Stoklund Olesen
a888b2a6f1 Dominator tree pre-order.
Add a DominatorTreePreorder data structure which can be initialized for
a DominatorTree and used for queries involving a pre-order of the
dominator tree.

Print out the pre-order and send it through filecheck in "test domtree"
file tests.
2017-12-08 17:43:15 -08:00
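One common way to use a dominator tree pre-order is sketched below: number the nodes in pre-order and remember the highest number in each subtree, which turns dominance queries into two comparisons. This is a generic illustration with hypothetical names, working on a plain array of immediate dominators; it is not the layout of the real DominatorTreePreorder.

```rust
struct DomTreePreorder {
    /// Pre-order number of each block.
    pre: Vec<u32>,
    /// Largest pre-order number found in each block's subtree.
    last: Vec<u32>,
}

impl DomTreePreorder {
    /// `idom[b]` is the immediate dominator of block `b`, `None` for the entry.
    fn new(idom: &[Option<usize>]) -> Self {
        let n = idom.len();
        let mut children = vec![Vec::new(); n];
        let mut roots = Vec::new();
        for (b, d) in idom.iter().enumerate() {
            match *d {
                Some(p) => children[p].push(b),
                None => roots.push(b),
            }
        }
        let mut pre = vec![0u32; n];
        let mut last = vec![0u32; n];
        let mut next = 0u32;
        // Explicit DFS stack of (block, index of the next child to visit).
        let mut stack: Vec<(usize, usize)> = Vec::new();
        for &root in &roots {
            pre[root] = next;
            next += 1;
            stack.push((root, 0));
            while let Some(&(node, i)) = stack.last() {
                if i < children[node].len() {
                    stack.last_mut().unwrap().1 += 1;
                    let child = children[node][i];
                    pre[child] = next;
                    next += 1;
                    stack.push((child, 0));
                } else {
                    // All children visited: the subtree ends at the last number handed out.
                    last[node] = next - 1;
                    stack.pop();
                }
            }
        }
        DomTreePreorder { pre: pre, last: last }
    }

    /// True if `a` dominates `b`, i.e. `b` lies in the subtree rooted at `a`.
    fn dominates(&self, a: usize, b: usize) -> bool {
        self.pre[a] <= self.pre[b] && self.pre[b] <= self.last[a]
    }
}
```

For a tree where block 0 is the entry, blocks 1 and 2 are its children and block 3 is a child of block 1, the pre-order is 0, 1, 3, 2.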
Jakob Stoklund Olesen
7d5f2f0404 Convert the CFG traversal tests to file tests.
Add a "cfg_postorder:" printout to the "test domtree" file tests and use
that to check the computed CFG post-order instead of doing it manually
with Rust code.
2017-12-08 13:58:18 -08:00
Jakob Stoklund Olesen
a7eb13a151 Expand unknown instructions to runtime library calls. 2017-12-08 10:37:50 -08:00
Jakob Stoklund Olesen
f03729d742 Fix generated code for ISA predicates on encoding recipes.
The generated code had syntax errors and inverted logic.

Add an SSE 4.1 requirement to the floating point rounding instructions.
2017-12-08 10:37:50 -08:00
Jakob Stoklund Olesen
60c456c1ec Add a compilation pass timing facility.
Individual compilation passes call the corresponding timing::*()
function and hold on to their timing token while they run. This causes
nested per-pass timing information to be recorded in thread-local
storage.

The --time-passes command line option prints a pass timing report to
stdout.
2017-12-06 17:04:23 -08:00
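The token pattern described in this commit can be sketched with a guard type whose destructor records the elapsed time in thread-local storage. All names here are hypothetical, and the sketch keeps a flat list of finished passes rather than the nested accounting the real facility performs.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

thread_local! {
    /// Finished passes and their durations, in completion order.
    static PASS_TIMES: RefCell<Vec<(&'static str, Duration)>> = RefCell::new(Vec::new());
}

/// Held by a pass while it runs; records the elapsed time when dropped.
struct TimingToken {
    pass: &'static str,
    start: Instant,
}

/// Starts timing a pass. Hypothetical stand-in for a timing::*() function.
fn start_pass(pass: &'static str) -> TimingToken {
    TimingToken { pass: pass, start: Instant::now() }
}

impl Drop for TimingToken {
    fn drop(&mut self) {
        let elapsed = self.start.elapsed();
        let pass = self.pass;
        PASS_TIMES.with(|times| times.borrow_mut().push((pass, elapsed)));
    }
}

/// Names of the passes recorded so far on this thread.
fn recorded_passes() -> Vec<&'static str> {
    PASS_TIMES.with(|times| times.borrow().iter().map(|&(name, _)| name).collect())
}
```

Because the token is an ordinary value, a nested pass simply creates its own token and finishes before the enclosing one does.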
Jakob Stoklund Olesen
f106e4266a Enable the IL verifier by default.
Change the default value for the "enable_verifier" setting so the
verifier runs unless it is explicitly disabled.

Most projects using Cretonne are best off running the verifier always
until they start caring about compile time performance. Then they can
easily disable the verifier.
2017-12-06 08:30:48 -08:00
Tyler McMullen
7988d0c54c Add 8-bit variation of adjust_sp_imm for 32-bit and 64-bit Intel. 2017-12-05 11:49:12 -08:00
Tyler McMullen
5783ea2c9a Account for return address when reserving stack space for CSRs. 2017-12-05 11:49:12 -08:00
Tyler McMullen
a75248d2cf Move the initial stack pointer adjustment to after the CSR pushes. 2017-12-05 11:49:12 -08:00
Tyler McMullen
ebcbd54f61 Add 'compile' test and confirm the pro/epilogue is added. Fix regression this revealed. 2017-12-05 11:49:12 -08:00
Tyler McMullen
ced39f5186 Fix up adjust_sp_imm instruction.
* Use imm64 rather than offset32
* Add predicate to enforce signed 32-bit limit to imm
* Remove AdjustSpImm format
* Add encoding tests for adjust_sp_imm
* Adjust use of adjust_sp_imm in Intel prologue_epilogue to match
2017-12-05 11:49:12 -08:00
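The range predicate mentioned in the list above amounts to checking that a 64-bit immediate survives a round trip through the narrower type. The helper names below are hypothetical; the 1-byte case corresponds to the 8-bit variation of adjust_sp_imm that a later commit in this log adds.

```rust
/// True if `imm` is representable as a signed 8-bit immediate.
fn fits_i8(imm: i64) -> bool {
    i64::from(imm as i8) == imm
}

/// True if `imm` is representable as a signed 32-bit immediate.
fn fits_i32(imm: i64) -> bool {
    i64::from(imm as i32) == imm
}

/// Number of immediate bytes an encoding would need, or `None` if the
/// value is outside the signed 32-bit limit and must be rejected.
fn adjust_sp_imm_bytes(imm: i64) -> Option<u32> {
    if fits_i8(imm) {
        Some(1)
    } else if fits_i32(imm) {
        Some(4)
    } else {
        None
    }
}
```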
Tyler McMullen
1a11c351b5 Add tests and documentation for x86_(push|pop). Fix up encoding issues revealed by tests. 2017-12-05 11:49:12 -08:00
Tyler McMullen
3b1b33e0ac Add docs and tests for copy_special instruction. Fixes encoding issue that tests revealed. 2017-12-05 11:49:12 -08:00
Tyler McMullen
6ec4bfc4ca Fix up the encodings for new instructions, both expected and actual. Make the test more accurate. 2017-12-05 11:49:12 -08:00
Tyler McMullen
fdfe24760a Add missing newline to prologue epilogue test 2017-12-05 11:49:12 -08:00
Tyler McMullen
d4311d2b1d Add prologue-epilogue test that exercises new instructions and binary emission. 2017-12-05 11:49:12 -08:00
Jakob Stoklund Olesen
04f6ccabe5 Allow filecheck directives with "test compile".
Things like inserted prologues and epilogues in #201 can be tested this
way.
2017-12-04 09:44:06 -08:00
Pat Hickey
cced2c8b0c Fix wat syntax so wasm tests pass (#199)
* wasm testsuite: ignore hidden files in test dir

and report a rejected file. It was picking up vim .swp files.

* wasmtests: correct wat syntax in icall.wat
2017-11-27 12:46:53 -08:00
Pat Hickey
b5601d57c8 filetests: change hex function names to user function numbers 2017-11-23 14:08:47 -08:00
Jakob Stoklund Olesen
c810b21488 Be more forgiving about what's a "slow" test.
This was supposed to be Q3 + 1.5 IQR, but a braino meant we actually used
Q3 + 2/3 IQR.

Since the distribution of test case times is far from gaussian, bump the
"slow" limit up even further to Q3 + 3 IQR.
2017-11-22 11:48:47 -08:00
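The threshold in this commit is an outlier fence of the form Q3 + k * IQR. A minimal sketch, assuming a non-empty list of times and quartiles computed by linear interpolation (the test runner may use a different quantile definition; the function names are hypothetical):

```rust
/// Quantile of an ascending, non-empty slice using linear interpolation.
fn quantile(sorted: &[f64], q: f64) -> f64 {
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo as f64)
}

/// Tests slower than Q3 + k * IQR count as "slow".
fn slow_threshold(times: &[f64], k: f64) -> f64 {
    let mut sorted = times.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let q1 = quantile(&sorted, 0.25);
    let q3 = quantile(&sorted, 0.75);
    q3 + k * (q3 - q1)
}
```

For times of 1, 2, 3, 4 and 5 seconds, Q1 is 2 and Q3 is 4, so the intended k = 1.5 gives a limit of 7 seconds and the new k = 3 gives 10 seconds.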
Jakob Stoklund Olesen
8e2ce6ded2 Revert "Enable pager in cton-util."
This reverts commit 0538615ccc0b600d4f534dae2ee966d5ed0df9b7.

Fixes #196. The pager functionality wasn't working as intended since
long error messages appear on stdout which isn't captured by the pager.
2017-11-22 11:35:03 -08:00
Jakob Stoklund Olesen
92f378de76 Expose CFG predecessors only as an iterator.
Define two public iterator types in the flowgraph module, PredIter and
SuccIter, which are by-value iterators over an EBB's predecessors and
successors respectively.

Provide matching pred_iter() and succ_iter() methods for inspecting the
CFG. Remove the get_predecessors() method which returned a slice.

Update the uses of get_predecessors(), none of which depended on it
being a slice.

This abstraction makes it possible to change the internal representation
of the CFG.
2017-11-22 09:13:04 -08:00
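The benefit of exposing only iterators can be sketched like this. The iterator names and the pred_iter()/succ_iter() methods come from the commit message; everything else is an assumption, including the choice of a BTreeSet as internal storage and plain integers as EBB references.

```rust
use std::collections::btree_set;
use std::collections::BTreeSet;

type Ebb = u32;

struct ControlFlowGraph {
    preds: Vec<BTreeSet<Ebb>>,
    succs: Vec<BTreeSet<Ebb>>,
}

/// By-value iterator over predecessors; callers never see the storage type.
struct PredIter<'a>(btree_set::Iter<'a, Ebb>);

/// By-value iterator over successors.
struct SuccIter<'a>(btree_set::Iter<'a, Ebb>);

impl<'a> Iterator for PredIter<'a> {
    type Item = Ebb;
    fn next(&mut self) -> Option<Ebb> {
        self.0.next().cloned()
    }
}

impl<'a> Iterator for SuccIter<'a> {
    type Item = Ebb;
    fn next(&mut self) -> Option<Ebb> {
        self.0.next().cloned()
    }
}

impl ControlFlowGraph {
    fn with_ebbs(n: usize) -> Self {
        ControlFlowGraph {
            preds: vec![BTreeSet::new(); n],
            succs: vec![BTreeSet::new(); n],
        }
    }

    fn add_edge(&mut self, from: Ebb, to: Ebb) {
        self.succs[from as usize].insert(to);
        self.preds[to as usize].insert(from);
    }

    fn pred_iter<'a>(&'a self, ebb: Ebb) -> PredIter<'a> {
        PredIter(self.preds[ebb as usize].iter())
    }

    fn succ_iter<'a>(&'a self, ebb: Ebb) -> SuccIter<'a> {
        SuccIter(self.succs[ebb as usize].iter())
    }
}
```

Callers that only iterate keep working if the sets are later replaced by another container, which would not hold for a method returning a slice.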
Jakob Stoklund Olesen
cf45afa1e7 Avoid the CFG get_successors() when computing a post-order.
The control flow graph does not guarantee any particular ordering for
its successor lists, and the post-order we are computing for building
the dominator tree needs to be "split-invariant".

See #146 for details.

- Discover EBB successors directly from the EBB instruction sequence to
  guarantee that the post-order we compute is canonical/split-invariant.
- Use an alternative graph DFS algorithm which doesn't require indexing
  into a slice of successors.

This changes cfg_postorder in some cases because the edge pruning when
converting the (DAG) CFG to a tree for the DFT is different.
2017-11-21 14:20:57 -08:00
Dan Gohman
4c829f7c7f Fix sphinx hyperlink syntax. 2017-11-14 14:09:35 -08:00
Dan Gohman
648c1b33ba Fix sphinx hyperlink syntax. 2017-11-13 14:05:47 -08:00
Dan Gohman
78f2edefc2 Add todos for add/sub with signed overflow, saturating fcvt_to_[su]int. 2017-11-11 17:45:09 -08:00
Dan Gohman
54e4ab71d9 Enable pager in cton-util. 2017-11-10 09:09:00 -08:00