Commit Graph

45 Commits

Author SHA1 Message Date
Dan Gohman
d566faa8fb Disable preopt at opt_level=fastest. 2018-03-28 22:13:13 -07:00
Dan Gohman
e5ec7242cc Fix handling of value aliases, and re-enable LICM.
Value aliases aren't instructions, so they don't have a location in the
CFG, so it's not meaningful to query whether a value alias is defined
within a loop.
2018-03-28 22:06:52 -07:00
Dan Gohman
3b0a9b9ecf Remove an unused argument. 2018-03-27 11:53:10 -07:00
Dan Gohman
9602b78320 Disable the LICM pass for now.
There appear to be underlying problems with the way Cretonne handles value
aliases, which are causing problems for LICM. Disable LICM until we have
a chance to fix the underlying issues.

Fixes #275.
2018-03-22 16:55:18 -07:00
Pat Hickey
03ee007624 Use clippy (#276)
* cton-util: fix some clippy unnecessary pass-by-value warnings

* clippy: ignore too many arguments / cyclomatic complexity in module

since these functions are taking args coming from the command line, i
dont think this is actually a valid lint, morally the arguments are all
from one structure

* cton-util: take care of remaining clippy warnings

* cton-reader: fix all non-suspicious clippy warnings

* cton-reader: disable clippy at site of suspicious lint

* cton-frontend: disable clippy at the site of an invalid lint

* cton-frontend: fix clippy warnings, or ignore benign ones

* clippy: ignore the camelcase word WebAssembly in docs

* cton-wasm: fix clippy complaints or ignore benign ones

* cton-wasm tests: fix clippy complaints

* cretonne: starting point turns off all clippy warnings

* cretonne: clippy fixes, or lower allow() to source of problem

* cretonne: more clippy fixes

* cretonne: fix or disable needless_lifetimes lint

this linter is buggy when the declared lifetime is used for another type
constraint.

* cretonne: fix clippy complaint about Pass::NoPass

* rustfmt

* fix prev minor api changes clippy suggested

* add clippy to test-all

* cton-filetests: clippy fixes

* simplify clippy reporting in test-all

* cretonne: document clippy allows better

* cretonne: fix some more clippy lints

* cretonne: fix clippy lints (mostly doc comments)

* cretonne: allow all needless_lifetimes clippy warnings

remove overrides at the false positives

* rustfmt
2018-03-22 13:10:41 -07:00
Dan Gohman
40ec50d0b6 Don't relax a branch to have different input constraints.
When relaxing a branch, restrict the set of candidate encodings to those which
have the same input constraints as the original encoding choice. This prevents
situations where relaxation prefers a non-REX-prefixed encoding over a REX
prefixed one because the end of the instruction can be one byte closer to the
destination, in a situation where the encoding needs to be REX-prefixed
because of one of the operand registers.

This also makes the Context class perform encoding verification after
relaxation, to catch similar problems in the future.

Fixes #256.
2018-03-08 02:34:41 -08:00
Dan Gohman
227baaadb8 Enable the simple_gvn and licm passes at OptLevel::Best. 2018-02-28 11:50:59 -08:00
Julian Seward
7054f25abb Adds support to transform integer div and rem by constants into cheaper equivalents.
Adds support for transforming integer division and remainder by constants
into sequences that do not involve division instructions.

* div/rem by constant powers of two are turned into right shifts, plus some
  fixups for the signed cases.

* div/rem by constant non-powers of two are turned into double length
  multiplies by a magic constant, plus some fixups involving shifts,
  addition and subtraction, that depends on the constant, the word size and
  the signedness involved.

* The following cases are transformed: div and rem, signed or unsigned, 32
  or 64 bit.  The only un-transformed cases are: unsigned div and rem by
  zero, signed div and rem by zero or -1.

* This is all incorporated within a new transformation pass, "preopt", in
  lib/cretonne/src/preopt.rs.

* In preopt.rs, fn do_preopt() is the main driver.  It is designed to be
  extensible to transformations of other kinds of instructions.  Currently
  it merely uses a helper to identify div/rem transformation candidates and
  another helper to perform the transformation.

* In preopt.rs, fn get_div_info() pattern matches to find candidates, both
  cases where the second arg is an immediate, and cases where the second
  arg is an identifier bound to an immediate at its definition point.

* In preopt.rs, fn do_divrem_transformation() does the heavy lifting of the
  transformation proper.  It in turn uses magic{S,U}{32,64} to calculate the
  magic numbers required for the transformations.

* There are many test cases for the transformation proper:
    filetests/preopt/div_by_const_non_power_of_2.cton
    filetests/preopt/div_by_const_power_of_2.cton
    filetests/preopt/rem_by_const_non_power_of_2.cton
    filetests/preopt/rem_by_const_power_of_2.cton
    filetests/preopt/div_by_const_indirect.cton
  preopt.rs also contains a set of tests for magic number generation.

* The main (non-power-of-2) transformation requires instructions that return
  the high word of a double-length multiply.  For this, instructions umulhi
  and smulhi have been added to the core instruction set.  These will map
  directly to single instructions on most non-intel targets.

* intel does not have an instruction exactly like that.  For intel,
  instructions x86_umulx and x86_smulx have been added.  These map to real
  instructions and return both result words.  The intel legaliser will
  rewrite {s,u}mulhi into x86_{s,u}mulx uses that throw away the lower half
  word.  Tests:
    filetests/isa/intel/legalize-mulhi.cton (new file)
    filetests/isa/intel/binary64.cton (added x86_{s,u}mulx encoding tests)
2018-02-28 11:41:36 -08:00
Pat Hickey
3f69581d03 cretonne::Context: add for_function constructor 2018-01-25 18:14:57 -08:00
Jakob Stoklund Olesen
60c456c1ec Add a compilation pass timing facility.
Individual compilation passes call the corresponding timing::*()
function and hold on to their timing token while they run. This causes
nested per-pass timing information to be recorded in thread-local
storage.

The --time-passes command line option prints a pass timing report to
stdout.
2017-12-06 17:04:23 -08:00
Dan Gohman
4d9aedbaca Add a 'clear()' function to Context.
This includes adding `clear()` functions to its (transitive) members.
2017-11-15 11:15:30 -08:00
Dan Gohman
3ab4349c1b Use Self instead of repeating the type name. 2017-11-08 10:43:11 -08:00
Jakob Stoklund Olesen
1a04c4260f Remove an unused import to silence a compiler warning. 2017-10-11 14:20:43 -07:00
Dan Gohman
3f30171b79 Actually disable simple_gvn and licm by default.
See

https://github.com/stoklund/cretonne/pull/164#discussion_r142449999

for details.
2017-10-10 16:28:29 -07:00
Jakob Stoklund Olesen
90ed698e83 Add an unreachable code elimination pass.
The register allocator doesn't even try to compile unreachable EBBs, so
any values defined in such blocks won't be assigned registers.

Since the dominator tree already has determined which EBBs are
reachable, we should just eliminate any unreachable blocks instead o
trying to do something with the dead code.

Not that this is not a "dead code elimination" pass which would also
remove individual instructions whose results are not used.
2017-10-09 15:26:27 -07:00
Dan Gohman
7c7b1651d8 Do a full compile in 'cton-util wasm'.
This removes the `optimize` option, as one can do that with
`--set`, eg. `--set opt_level=best`. And it adds an option to
print the compilation output.

And, it enables simple_gvn and licm for opt_level=best.
2017-10-03 09:39:07 -07:00
Dan Gohman
ed6630dc02 Move verify calls back into Context, using FlagsOrIsa.
With FlagsOrIsa, we can pass around the information we need to run
the verifier from the Context even when a TargetIsa is not available.
2017-09-20 16:42:13 -07:00
Jakob Stoklund Olesen
1349a6bdbc Always require a Flags reference for verifying functions.
Add a settings::FlagsOrIsa struct which represents a flags reference and
optionally the ISA it belongs to. Use this for passing flags/isa
information to the verifier.

The verify_function() and verify_context() functions are now generic so
they accept either a &Flags or a &TargetISa argument.

Fix the return_at_end verifier tests which no longer require an ISA
specified. The signle "set return_at_end" flag setting now makes it to
the verifier even when no ISA is present to carry it.
2017-09-14 17:51:15 -07:00
Dan Gohman
bbe056bf9d Make passes assert their dependencies consistently. (#156)
* Make passes assert their dependencies consistently.

This avoids ambiguity about whose responsibility it is to run
to compute cfg, domtree, and loop_analysis data.

* Reset the `valid` flag in DominatorTree's `clear()`.

* Remove the redundant assert from DominatorTree::with_function.

* Remove the message strings from obvious asserts.

This avoids having them spill out into multiple lines.

* Refactor calls to `compute` on `Context` objects into helper functions.
2017-09-14 14:38:53 -07:00
Dan Gohman
9d0d6b5a1b Move verify calls out of Context.
Also, move flowgraph() calls out of filetest and into the passes that
need them so that filetest doesn't have embedded knowledge of these
dependencies.

This resolves a TODO about the way Context was running the verifier, and
it makes the Context functions and the filetest runners more transparent.

This also fixes simple_gvn to use the existing dominator tree rather
than computing its own.
2017-09-12 16:32:47 -07:00
Dan Gohman
69f8174c03 Move ensure_domtree out of Context and into DominatorTree.
This also moves the calls to it out of Context and into the passes that
actually need it, so that Context's functions don't have any logic of
their own.
2017-09-12 16:32:47 -07:00
Dan Gohman
2efdc0ed37 Update rustfmt to 0.9.0. 2017-08-31 10:44:59 -07:00
Jakob Stoklund Olesen
6d9198d55f Recompute the dominator tree on demand.
The legalizer can invalidate the dominator tree, but we don't actually
need a dominator tree during legalization, so defer the construction of
the domtree.

- Add an "invalid" state to the dominator tree along with clear() and
  is_valid() methods to test it.
- Invalidate the dominator tree as part of legalization.
- Ensure that a valid dominator tree exists before the passes that need
  it.

Together these features add up to a manual invalidation mechanism for
the dominator tree.
2017-08-28 11:16:29 -07:00
Jakob Stoklund Olesen
fecbcbb7b4 Drop the domtree argument to legalize_function().
Future legalization patterns will have the ability to mutate the
flowgraph, so the domtree's list of RPO blocks is not a good guide for
iteration. Use the layout order instead. This will pick up any new EBBs
inserted.
2017-08-25 10:36:25 -07:00
Jakob Stoklund Olesen
92392c6041 Add a prologue_epilogue() hook to TargetIsa.
This will compute the stack frame layout as appropriate for the
function's calling convention and insert prologue and epilogue code.

The default implementation is not useful, each target ISA will need to
override this function.
2017-08-03 13:48:30 -07:00
Jakob Stoklund Olesen
2f7057b96f Add a Context::emit_to_memory function.
This function will emit the binary machine code into contiguous raw
memory while sending relocations to a RelocSink.

Add a MemoryCodeSink for generating machine code directly into memory
efficiently. Allow the TargetIsa to provide emit_function
implementations that are specialized to the MemoryCodeSink type to avoid
needless small virtual callbacks to put1() et etc.
2017-07-18 08:03:53 -07:00
Jakob Stoklund Olesen
9e3b6a6eba Add a Context::compile() function which runs all compiler passes.
This is the main entry point to the code generator. It returns the
computed size of the functions code.

Also add a 'test compile' command which runs the whole code generation
pipeline.
2017-07-12 12:22:49 -07:00
Dan Gohman
0c7316ae28 Lint fixes (#99)
* Replace a single-character string literal with a character literal.

* Use is_some() instead of comparing with Some(_).

* Add code-quotes around type names in comments.

* Use !...is_empty() instead of len() != 0.

* Tidy up redundant returns.

* Remove redundant .clone() calls.

* Remove unnecessary explicit lifetime parameters.

* Tidy up unnecessary '&'s.

* Add parens to make operator precedence explicit.

* Use debug_assert_eq instead of debug_assert with ==.

* Replace a &Vec argument with a &[...].

* Replace `a = a op b` with `a op= b`.

* Avoid unnecessary closures.

* Avoid .iter() and .iter_mut() for iterating over containers.

* Remove unneeded qualification.
2017-06-19 16:24:10 -07:00
Jakob Stoklund Olesen
f22461b4b3 Stop using cfg.postorder_ebbs().
Switch to the new domtree.cfg_postorder() which returns a reference to a
pre-computed post-order instead of allocating memory and computing a new
post-order.
2017-06-07 13:38:27 -07:00
Denis Merigoux
e47f4a49fb LICM pass (#87)
* LICM pass

* Uses loop analysis to detect loop tree
* For each loop (starting with the inner ones), create a pre-header and move there loop-invariant instructions
* An instruction is loop invariant if it does not use as argument a value defined earlier in the loop
* File tests to check LICM's correctness
* Optimized pre-header creation
If the loop already has a natural pre-header, we use it instead of creating a new one.
The natural pre-header of a loop is the only predecessor of the header it doesn't dominate.
2017-06-07 11:27:22 -07:00
Denis Merigoux
b02ccea8dc Loop analysis of the IL
* Implemented in two passes
* First pass discovers the loops headers (they dominate one of their predecessors)
* Second pass traverses the blocks of each loop
* Discovers the loop tree structure
* Offers a new LoopAnalysis data structure queried from outside the module
2017-06-02 15:30:45 -07:00
Dan Gohman
c826aefa0a Start a very simple GVN pass (#79)
* Skeleton simple_gvn pass.
* Basic testing infrastructure for simple-gvn.
* Add can_load and can_store flags to instructions.
* Move the replace_values function into the DataFlowGraph.
* Make InstructionData derive from Hash, PartialEq, and Eq.
* Make EntityList's hash and eq functions panic.
* Change Ieee32 and Ieee64 to store u32 and u64, respectively.
2017-05-18 18:18:57 -07:00
Eric Anholt
43ce26e64b Verify that the instruction encoding matches what the ISA would encode.
Fixes #69
2017-04-23 17:21:32 -07:00
Jakob Stoklund Olesen
d0d5f3bb26 Run the post-regalloc verification inside the regalloc context.
This means that we can verify the basics with verify_context before
moving on to verifying the liveness information.

Live ranges are now verified immediately after computing them and after
register allocation is complete.
2017-04-21 16:25:24 -07:00
Jakob Stoklund Olesen
c5da572ebb Add a liveness verifier.
The liveness verifier will check that the live ranges are consistent
with the function. It runs as part of the register allocation pipeline
when enable_verifier is set.

The initial implementation checks the live ranges, but not the
ISA-specific constraints and affinities.
2017-04-21 16:01:08 -07:00
Jakob Stoklund Olesen
c4b794f7cf Run the verifier in the Context methods when it is enabled.
The test drivers can stop calling comp_ctx.verify because legalize() and
regalloc() do it themselves now.

This also makes it possible for those two passes to return other
CtonError codes in the future, not just verifier errors.
2017-04-21 12:36:35 -07:00
Jakob Stoklund Olesen
6e95b08df1 Verifier results are always void.
No need for a type parameter.
2017-04-21 12:03:05 -07:00
Jakob Stoklund Olesen
1984c96f7c rustfmt 0.8.1 2017-04-05 09:00:11 -07:00
Jakob Stoklund Olesen
4c3e590bb9 Run verifier after legalizer and regalloc file tests.
Run the verify_contexti() function after invoking the legalize() and
regalloc() context functions. This will help catch bad code produced by
these passes.
2017-03-29 15:19:03 -07:00
Jakob Stoklund Olesen
ca6e402b90 Add a ControlFlowGraph argument to legalize_function.
Legalizing some instructions may require modifications to the control
flow graph, and some operations need to use the CFG analysis.

The CFG reference is threaded through all the legalization functions to
reach the generated expansion functions as well as the legalizer::split
module where it will be used first.
2017-03-21 16:00:28 -07:00
Jakob Stoklund Olesen
a9056f699e Rename the 'cfg' module to 'flowgraph'.
The 'cfg' name was easy to confuse with 'configuration'.
2017-03-21 15:33:23 -07:00
Jakob Stoklund Olesen
bf9cf09622 Add a register allocation context module.
Collect the data structures that hang around between function
compilations.

Provide a main entry point to the register allocator passes.
2017-02-22 11:53:01 -08:00
Jakob Stoklund Olesen
85fa68023c Make the DominatorTree reusable.
Add a compute() method which can recompute a dominator tree for
different functions.

Add a dominator tree data to the cretonne::Context.
2017-02-17 13:09:41 -08:00
Jakob Stoklund Olesen
77a7ad88f4 Make the ControlFlowGraph reusable.
Move the flow graph computation into a compute method which can be
called with multiple functions.

This allows us to reuse the ControlFlowGraph memory and keep an instance
in the Context.
2017-02-17 12:20:33 -08:00
Jakob Stoklund Olesen
1992890f85 Add a compilation context struct.
This will provide main entry points for compiling functions, and it
serves as a place for keeping data structures that should be preserved
between function compilations to reduce allocator thrashing.

So far, Context is just basic scaffolding. More to be added.
2017-02-17 12:04:53 -08:00