Even though the implementation of emit and emit_safepoint may
be platform-specific, the interface ought to be common so that
other code in prelude.isle may safely call these constructors.
This patch moves the definition of emit (from all platforms)
and emit_safepoint (s390x only) to prelude.isle. This required
adding an emit_safepoint implementation to aarch64 and x64 as
well - the latter is still a stub as special move mitosis
handling will be required.
Looking at [the `fcmp`
documentation](https://docs.rs/cranelift-codegen/0.80.0/cranelift_codegen/ir/trait.InstBuilder.html#method.fcmp)--generated
from Cranelift's instruction definitions, the charts explaining the
logic for the various conditions is unreadable. Since rendering those charts
as plain text is problematic, this change wraps them as code sections
for a consistent layout.
In order to migrate branches to ISLE, we define a second entry
point `lower_branch` which gets the list of branch targets as
additional argument.
This requires a small change to `lower_common`: the `isle_lower`
callback argument is changed from a function pointer to a closure.
This allows passing the extra argument via a closure.
Traps make use of the recently added facility to emit safepoints
from ISLE, but are otherwise straightforward.
Change the implementation of emitted_insts in IsleContext from
a plain vector of instructions into a vector of tuples, where
the second element is a boolean that indicates whether this
instruction should be emitted as a safepoint.
This allows targets to emit safepoint insns via ISLE.
Attempt to match a Jump instruction in ISLE will currently lead to the
generated files not compiling. This is because the definition of the
InstructionData enum in clif.isle does not match the actual type used
in Rust code.
Specifically, clif.isle erroneously omits the ValueList variable-length
argument entry if the format does not use a typevar operand. This is
the case for Jump and a few other formats. The problem is caused by
a bug in the gen_isle routine in meta/src/gen_inst.rs.
The BranchTarget abstraction is no longer needed, since all branches are
being emitted using a MachLabel target. Remove BranchTarget and simply
use MachLabel everywhere a branch target is required. (This brings the
s390x back-end in line with what x64 does as well.)
In addition, simplify jumptable emission by moving all instructions
that do not depend on the internal label (i.e. the conditional branch
to the default label, as well as the scaling the index register) out of
the combined JTSequence instruction.
This refactoring will make moving branch generation to ISLE easier.
`ptr::cast` has the advantage of being unable to silently cast
`*const T` to `*mut T`. This turned up several places that were
performing such casts, which this PR also fixes.
If a block is marked cold but has side-effect-free code that is only
used by side-effectful code in non-cold blocks, we will erroneously fail
to emit it, causing a regalloc failure.
This is due to the interaction of block ordering and lowering: we rely
on block ordering to visit uses before defs (except for backedges) so
that we can effectively do an inline liveness analysis and skip lowering
operations that are not used anywhere. This "inline DCE" is needed
because instruction lowering can pattern-match and merge one instruction
into another, removing the need to generate the source instruction.
Unfortunately, the way that I added cold-block support in #3698 was
oblivious to this -- it just changed the block sort order. For
efficiency reasons, we generate code in its final order directly, so it
would not be tenable to generate it in e.g. RPO first and then reorder
cold blocks to the bottom; we really do want to visit in the same order
as the final code.
This PR fixes the bug by moving the point at which cold blocks are sunk
to emission-time instead. This is cheaper than either trying to visit
blocks during lowering in RPO but add to VCode out-of-order, or trying
to do some expensive analysis to recover proper liveness. It's not clear
that the latter would be possible anyway -- the need to lower some
instructions depends on other instructions' isel results/merging
success, so we really do need to visit in RPO, and we can't simply lower
all instructions as side-effecting roots (some can't be toplevel nodes).
The one downside of this approach is that the VCode itself still has
cold blocks inline; so in the text format (and hence compile-tests) it's
not possible to see the sinking. This PR adds a test for cold-block
sinking that actually verifies the machine code. (The test also includes
an add-instruction in the cold path that would have been incorrectly
skipped prior to this fix.)
Fortunately this bug would not have been triggered by the one current
use of cold blocks in #3699, because there the only operation in the
cold block was an (always effectful) call instruction. The worst-case
effect of the bug in other code would be a regalloc panic; no silent
miscompilations could result.
This adds ISLE support for the s390x back-end and moves lowering
of most instructions to ISLE. The only instructions still remaining
are calls, returns, traps, and branches, most of which will need
additional support in common code.
Generated code is not intended to be (significantly) different
than before; any additional optimizations now made easier to
implement due to the ISLE layer can be added in follow-on patches.
There were a few differences in some filetests, but those are all
just simple register allocation changes (and all to the better!).
This commit adds support for denoting cold blocks in the CLIF text
format as follows:
```plain
function %f() {
block0(...):
...
block1 cold:
...
block2(...) cold:
...
block3:
...
```
With this syntax, we are able to see the cold-block flag in CLIF, we can
write tests using it, and it is preserved when round-tripping.
Fixes#3701.
In preparing to move the s390x back-end to ISLE, I noticed a few
missing pieces in the common prelude code. This patch:
- Defines the reference types $R32 / $R64.
- Provides a trap_code_bad_conversion_to_integer helper.
- Provides an avoid_div_traps helper. This requires passing the
generic flags in addition to the ISA-specifc flags into the
ISLE lowering context.
In preparing the back-end to move to ISLE, I detected a
number of codegen bugs in the existing code, which are
fixed here:
- Fix internal compiler error with uload16/icmp corner case.
- Fix broken Cls lowering.
- Correctly mask shift count for i8/i16 shifts.
In addition, I made several changes to operand encodings
in various MInst patterns. These should not have any
functional effect, but will make the ISLE migration easier:
- Encode floating-point constants as u32/u64 in MInst patterns.
- Encode shift amounts as u8 and Reg in ShiftOp pattern.
- Use MemArg in LoadMultiple64 and StoreMultiple64 patterns.
This PR adds a flag to each block that can be set via the frontend/builder
interface that indicates that the block will not be frequently
executed. As such, the compiler backend should place the block "out of
line" in the final machine code, so that the ordinary, more frequent
execution path that excludes the block does not have to jump around it.
This is useful for adding handlers for exceptional conditions
(slow-paths, guard violations) in a way that minimizes performance cost.
Fixes#2747.
* Update lots of `isa/*/*.clif` tests to `precise-output`
This commit goes through the `aarch64` and `x64` subdirectories and
subjectively changes tests from `test compile` to add `precise-output`.
This then auto-updates all the test expectations so they can be
automatically instead of manually updated in the future. Not all tests
were migrated, largely subject to the whims of myself, mainly looking to
see if the test was looking for specific instructions or just checking
the whole assembly output.
* Filter out `;;` comments from test expctations
Looks like the cranelift parser picks up all comments, not just those
trailing the function, so use a convention where `;;` is used for
human-readable-comments in test cases and `;`-prefixed comments are the
test expectation.
* cranelift: Add ability to auto-update test expectations
One of the problems of the current `*.clif` testing is that the files
are difficult to update when widespread changes are made (such as
removing modification of the frame pointer). Additionally when changing
register allocation or similar it can cause a large number of changes in
tests but the tests themselves didn't actually break. For this reason
this commit adds the ability to automatically update test expectations.
The idea behind this commit is that tests of the form `test compile` can
also optionally be flagged with the `precise-output` flag:
test compile precise-output
and when doing so the compiled form of each function is asserted to 100%
match the following comments and their test expectations. If a match is
not found then a `BLESS=1` environment variable can be used to
automatically rewrite the test file itself with the correct assertion.
If the environment variable isn't present and the expectation doesn't
match then the test fails.
It's hoped that, if approved, a follow-up commit can add
`precise-output` to all current `test compile` tests (or make it the
default) and all tests can be mass-updated. When developing locally test
expectations need not be written and instead tests can be run with
`BLESS=1` and the output can be manually verified. The environment
variable will not be present on CI which means that changes to the
output which don't also change the test expectation will cause CI to
fail. Furthermore this should still make updates to the test output
easily readable in review on CI because the test expectations are
intended to look the same as before.
Closes#1539
* Use raw vcode output in tests
* Fix a merge conflict
* Review comments