The main purpose of the DCE pass is to clean up dead code left behind by
the optimizer, so it's not valuable to run it when the optimizer isn't
being run.
`iter()` iterates over both keys and values, while `values()` iterates over
just values. Also add `_mut()` versions.
These replace the otherwise common idiom of iterating with `keys()` and using
indexing to get the values, allowing for simpler code.
To keep cross-compiling straightforward, Cretonne shouldn't have any
behavior that depends on the host. This renames the "Native" calling
convention to "SystemV", which has a defined meaning for each target,
so that it's clear that the calling convention doesn't change
depending on what host Cretonne is running on.
* Add a pre-opt optimization to change constants into immediates.
This converts 'iadd' + 'iconst' into 'iadd_imm', and so on.
* Optimize away redundant `bint` instructions.
Cretonne has a concept of "Testable" values, which can be either boolean
or integer. When the an instruction needing a "Testable" value receives
the result of a `bint`, converting boolean to integer, eliminate the
`bint`, as it's redundant.
* Postopt: Optimize using CPU flags.
This introduces a post-legalization optimization pass which converts
compare+branch sequences to use flags values on CPUs which support it.
* Define a form of x86's `urm` that doesn't clobber FLAGS.
movzbl/movsbl/etc. don't clobber FLAGS; define a form of the `urm`
recipe that represents this.
* Implement a DCE pass.
This pass deletes instructions with no side effects and no results that
are used.
* Clarify ambiguity about "32-bit" and "64-bit" in comments.
* Add x86 encodings for icmp_imm.
* Add a testcase for postopt CPU flags optimization.
This covers the basic functionality of transforming compare+branch
sequences to use CPU flags.
* Pattern-match irsub_imm in preopt.
* First draft of TrapSink implementation.
* Add trap sink calls to 'trapif' and 'trapff' recipes.
* Add SourceLoc to trap sink calls, and add trap sink calls to all loads and stores.
* Add IntegerDivisionByZero trap to div recipe.
* Only emit load/store traps if 'notrap' flag is not set on the instruction.
* Update filetest machinery to add new trap sink functionality.
* Update filetests to include traps in output.
* Add a few more trap outputs to filetests.
* Add trap output to CLI tool.
While the specifics of these terms are debatable, "IR" generally
isn't incorrect in this context, and is the more widely recognized
term at this time.
See also the discussion in #267.
Fixes#267.
This prevents uses of undefined values from passsing through
unnoticed, and ensures that all aliases are ultimately resolved,
regardless of where they are defined.
Value aliases aren't instructions, so they don't have a location in the
CFG, so it's not meaningful to query whether a value alias is defined
within a loop.
It's easy to forget whether they mutate the value in place or return a
new value. Marking them #[must_use] will catch cases where they are used
incorrectly.
EFLAGS is a subregister of RFLAGS. For consistency with GPRs where we
use the 64-bit names to refer to the registers, use the 64-bit name for
RFLAGS as well.
Mark loads from globals generated by cton_wasm or by legalization as
`aligned` and `notrap`, since memory for these globals should be
allocated by the runtime environment for that purpose. This reduces
the number of potentially trapping instructions, which can reduce
the amount of metadata required by embedding environments.
There appear to be underlying problems with the way Cretonne handles value
aliases, which are causing problems for LICM. Disable LICM until we have
a chance to fix the underlying issues.
Fixes#275.
* cton-util: fix some clippy unnecessary pass-by-value warnings
* clippy: ignore too many arguments / cyclomatic complexity in module
since these functions are taking args coming from the command line, i
dont think this is actually a valid lint, morally the arguments are all
from one structure
* cton-util: take care of remaining clippy warnings
* cton-reader: fix all non-suspicious clippy warnings
* cton-reader: disable clippy at site of suspicious lint
* cton-frontend: disable clippy at the site of an invalid lint
* cton-frontend: fix clippy warnings, or ignore benign ones
* clippy: ignore the camelcase word WebAssembly in docs
* cton-wasm: fix clippy complaints or ignore benign ones
* cton-wasm tests: fix clippy complaints
* cretonne: starting point turns off all clippy warnings
* cretonne: clippy fixes, or lower allow() to source of problem
* cretonne: more clippy fixes
* cretonne: fix or disable needless_lifetimes lint
this linter is buggy when the declared lifetime is used for another type
constraint.
* cretonne: fix clippy complaint about Pass::NoPass
* rustfmt
* fix prev minor api changes clippy suggested
* add clippy to test-all
* cton-filetests: clippy fixes
* simplify clippy reporting in test-all
* cretonne: document clippy allows better
* cretonne: fix some more clippy lints
* cretonne: fix clippy lints (mostly doc comments)
* cretonne: allow all needless_lifetimes clippy warnings
remove overrides at the false positives
* rustfmt
Merge the `use` parts of the `no_std` branch. This reduces the diffs
between master and the `no_std` branch, making it easier to maintain.
Most of these changes are derived from patches by @lachlansneff in
https://github.com/Cretonne/cretonne/tree/no_std.
In this case, it's a little nicer to just use more assertions,
which will print their line number indicating how far the test
got, when they fail.
And this allows the tests to be run in no_std configurations.
This allows us to run the tests via a library call rather than just
as a command execution. And, it's a step toward a broader goal, which
is to keep the code in the top-level src directory minimal, with
important functionality exposed as crates.
Refactor the filetests harness so that it can be run as part of
`cargo test`. And begin reorganizing the test harness code in preparation
for moving it out of the src directory.
- Test subcommand files are now named `test_*.rs`.
- cton-util subcommand files now just export their `run` and nothing else.
- src/filetest/mod.rs now also just exports `run` and nothing else.
- Tests are now run in release mode (with debug assertions enabled).
Compute the bound values for expand_fcvt_to_sint using bitwise integer
arithmetic rather than floating-point arithmetic, to avoid relying on
host floating point arithmetic.
This allows the assertions to be disabled in release builds, so that
the code is faster and smaller, at the expense of not performing the
checks. Assertions can be re-enabled in release builds with the
debug-assertions flag in Cargo.toml, as the top-level Cargo.toml
file does.
When relaxing a branch, restrict the set of candidate encodings to those which
have the same input constraints as the original encoding choice. This prevents
situations where relaxation prefers a non-REX-prefixed encoding over a REX
prefixed one because the end of the instruction can be one byte closer to the
destination, in a situation where the encoding needs to be REX-prefixed
because of one of the operand registers.
This also makes the Context class perform encoding verification after
relaxation, to catch similar problems in the future.
Fixes#256.
Emergency stack slots are a new kind of stack slot added relatively
recently. They need to be allocated a stack offset just like explicit
and spill slots.
Also, make StackSlotData's offset field an Option, to catch problems
like this in the future. Previously the value 0 was used when offsets
weren't assigned yet, however that made it non-obvious when the field
meant "not assigned yet" and when it meant "assigned the value 0".
Spiderwasm on 32-bit x86 always uses a 16-byte-aligned stack pointer.
Change the setting for the "native" convention as well, for
compatibility with Linux and Darwin ABIs, and so that if a platform
has different ABI rules, the problem will be detected in code emitted by
Cretonne, rather than somewhere else.
The term "local variables" predated the SSA builder in the front-end
crate, which also provides a way to implement source-language local
variables. The name "explicit stack slot" makes it clear what this
construct is.