236 lines
10 KiB
Markdown
236 lines
10 KiB
Markdown
# How ISLE is Integrated with Cranelift
|
|
|
|
This document contains an overview of and FAQ about how ISLE fits into
|
|
Cranelift.
|
|
|
|
## What is ISLE?
|
|
|
|
ISLE is a domain-specific language for authoring instruction selection and
|
|
rewrite rules. ISLE source text is [compiled down into Rust
|
|
code](https://github.com/bytecodealliance/wasmtime/tree/main/cranelift/isle#implementation).
|
|
|
|
Documentation on the ISLE language itself can be found
|
|
[here](../isle/docs/language-reference.md).
|
|
|
|
## How does ISLE integrate with the build system?
|
|
|
|
The build integration is inside of `cranelift/codegen/build.rs`.
|
|
|
|
For regular builds, we check a manifest that records the file hashes of the ISLE
|
|
source files that went into building a given ISLE-generated Rust file. If the
|
|
hashes of these files on disk don't match the hashes in the manifest, then the
|
|
ISLE-generated Rust file is stale and needs to be rebuilt. In this case, the
|
|
`build.rs` will report a build error. This way, downstream crates that use
|
|
Cranelift don't need to build ISLE, and get fewer transitive dependencies and
|
|
faster build times.
|
|
|
|
To intentionally rebuild ISLE-generated Rust files, use the `rebuild-isle` Cargo
|
|
feature with `cranelift-codegen`:
|
|
|
|
```shell
|
|
$ cargo check -p cranelift-codegen --features rebuild-isle
|
|
```
|
|
|
|
When this feature is active, we rerun the ISLE compiler on the ISLE sources to
|
|
create the new versions of the ISLE-generated Rust files and update the manifest
|
|
files.
|
|
|
|
Additionally, the `cranelift-codegen-meta` crate will automatically generate
|
|
ISLE `extern` declarations and helpers for working with CLIF. The code that does
|
|
this is defined inside `cranelift/codegen/meta/src/gen_inst.rs` and it creates
|
|
the `cranelift/codegen/src/clif.isle` file.
|
|
|
|
## Where are the relevant files?
|
|
|
|
* `cranelift/isle`: The ISLE compiler's source code.
|
|
|
|
* `cranelift/codegen/src/prelude.isle`: Common definitions and declarations for
|
|
ISLE. This gets included in every ISLE compilation.
|
|
|
|
* `cranelift/codegen/src/clif.isle`: Auto-generated declarations and helpers for
|
|
working with CLIF inside ISLE. Generated by `cranelift/codegen/build.rs` when
|
|
the `rebuild-isle` feature is enabled. This gets included in every ISLE
|
|
compilation.
|
|
|
|
* `cranelift/codegen/src/machinst/isle.rs`: Common Rust code for gluing
|
|
ISLE-generated code into a target architecture's backend. Contains
|
|
implementations of ISA-agnostic `extern` helpers declared in ISLE.
|
|
|
|
* `cranelift/codegen/src/isa/<arch>/inst.isle`: ISA-specific ISLE
|
|
helpers. Contains things like constructors for each instruction in the ISA, or
|
|
helpers to get a specific register. Helps bridge the gap between the raw,
|
|
non-SSA ISA and the pure, SSA view that the lowering rules have.
|
|
|
|
* `cranelift/codegen/src/isa/<arch>/lower.isle`: Instruction selection lowering
|
|
rules for an ISA. These should be pure, SSA rewrite rules, that lend
|
|
themselves to eventual verification.
|
|
|
|
* `cranelift/codegen/src/isa/<arch>/lower/isle.rs`: The Rust glue code for
|
|
integrating this ISA's ISLE-generate Rust code into the rest of the backend
|
|
for this ISA. Contains implementations of ISA-specific `extern` helpers
|
|
declared in ISLE.
|
|
|
|
* `cranelift/codegen/src/isa/<arch>/lower/isle/generated_code.rs`: The
|
|
ISLE-generated Rust code to perform instruction and CLIF-to-`MachInst`
|
|
lowering for each target architecture.
|
|
|
|
## Gluing ISLE's generated code into Cranelift
|
|
|
|
Each ISA-specific, ISLE-generated file is generic over a `Context` trait that
|
|
has a trait method for each `extern` helper defined in ISLE. There is one
|
|
concrete implementation of each of these traits, defined in
|
|
`cranelift/codegen/src/isa/<arch>/lower/isle.rs`. In general, the way that
|
|
ISLE-generated code is glued into the rest of the system is with these trait
|
|
implementations.
|
|
|
|
There may also be a `lower` function defined in `isle.rs` that encapsulates
|
|
creating the ISLE `Context` and calling into the generated code.
|
|
|
|
## Lowering rules are always pure, use SSA
|
|
|
|
The lowering rules themselves, defined in
|
|
`cranelift/codegen/src/isa/<arch>/lower.isle`, must always be a pure mapping
|
|
from a CLIF instruction to the target ISA's `MachInst`.
|
|
|
|
Examples of things that the lowering rules themselves shouldn't deal with or
|
|
talk about:
|
|
|
|
* Registers that are modified (both read and written to, violating SSA)
|
|
* Implicit uses of registers
|
|
* Maintaining use counts for each CLIF value or virtual register
|
|
|
|
Instead, these things should be handled by some combination of
|
|
`cranelift/codegen/src/isa/<arch>/inst.isle` and general Rust code (either in
|
|
`cranelift/codegen/src/isa/<arch>/lower/isle.rs` or elsewhere).
|
|
|
|
When an instruction modifies a register, both reading from it and writing to it,
|
|
we should build an SSA view of that instruction that gets legalized via "move
|
|
mitosis" by splitting a move out from the register.
|
|
|
|
For example, on x86 the `add` instruction reads and writes its first operand:
|
|
|
|
add a, b == a = a + b
|
|
|
|
So we present an SSA facade where `add` operates on three registers, instead of
|
|
two, and defines one of them, while reading the other two and leaving them
|
|
unmodified:
|
|
|
|
add a, b, c == a = b + c
|
|
|
|
Then, as an implementation detail of the facade, we emit moves as necessary:
|
|
|
|
add a, b, c ==> mov a, b; add b, c
|
|
|
|
We call the process of emitting these moves "move mitosis". For ISAs with
|
|
ubiquitous use of modified registers and instructions in two-operand form, like
|
|
x86, we implement move mitosis with methods on the ISA's `MachInst`. For other
|
|
ISAs that are RISCier and where modified registers are pretty rare, such as
|
|
aarch64, we implement the handful of move mitosis special cases at the
|
|
`inst.isle` layer. Either way, the important thing is that the lowering rules
|
|
remain pure.
|
|
|
|
Finally, note that these moves are generally cleaned up by the register
|
|
allocator's move coalescing, and move mitosis will eventually go away completely
|
|
once we switch over to `regalloc2`, which takes instructions in SSA form
|
|
directly as input.
|
|
|
|
Instructions that implicitly operate on specific registers, or which require
|
|
that certain operands be in certain registers, are handled similarly: the
|
|
lowering rules use a pure paradigm that ignores these constraints and has
|
|
instructions that explicitly take implicit operands, and we ensure the
|
|
constraints are fulfilled a layer below the lowering rules (in `inst.isle` or in
|
|
Rust glue code).
|
|
|
|
## When are lowering rules allowed to have side effects?
|
|
|
|
Extractors (the matchers that appear on the left-hand sides of `rule`s) should
|
|
**never** have side effects. When evaluating a rule's extractors, we haven't yet
|
|
committed to evaluating that rule's right-hand side. If the extractors performed
|
|
side effects, we could get deeply confusing action-at-a-distance bugs where
|
|
rules we never fully match pull the rug out from under our feet.
|
|
|
|
Anytime you are tempted to perform side effects in an extractor, you should
|
|
instead just package up the things you would need in order to perform that side
|
|
effect, and then have a separate constructor that takes that package and
|
|
performs the side effect it describes. The constructor can only be called inside
|
|
a rule's right-hand side, which is only evaluated after we've committed to this
|
|
rule, which avoids the action-at-a-distance bugs described earlier.
|
|
|
|
For example, loads have a side effect in CLIF: they might trap. Therefore, even
|
|
if a loaded value is never used, we will emit code that implements that
|
|
load. But if we are compiling for x86 we can sink loads into the operand
|
|
for another operation depending on how the loaded value is used. If we sink that
|
|
load into, say, an `add` then we need to tell the lowering context *not* to
|
|
lower the CLIF `load` instruction anymore, because its effectively already
|
|
lowered as part of lowering the `add` that uses the loaded value. Marking an
|
|
instruction as "already lowered" is a side effect, and we might be tempted to
|
|
perform that side effect in the extractor that matches sinkable loads. But we
|
|
can't do that because although the load itself might be sinkable, there might be
|
|
a reason why we ultimately don't perform this load-sinking rule, and if that
|
|
happens we still need to lower the CLIF load.
|
|
|
|
Therefore, we make the `sinkable_load` extractor create a `SinkableLoad` type
|
|
that packages up everything we need to know about the load and how to tell the
|
|
lowering context that we've sunk it and the lowering context doesn't need to
|
|
lower it anymore, but *it doesn't actually tell that to the lowering context
|
|
yet*.
|
|
|
|
```lisp
|
|
;; inst.isle
|
|
|
|
;; A load that can be sunk into another operation.
|
|
(type SinkableLoad extern (enum))
|
|
|
|
;; Extract a `SinkableLoad` from a value if the value is defined by a compatible
|
|
;; load.
|
|
(decl sinkable_load (SinkableLoad) Value)
|
|
(extern extractor sinkable_load sinkable_load)
|
|
```
|
|
|
|
Then, we pair that with a `sink_load` constructor that takes the `SinkableLoad`,
|
|
performs the associated side effect of telling the lowering context not to lower
|
|
the load anymore, and returns the x86 operand with the load sunken into it.
|
|
|
|
```lisp
|
|
;; inst.isle
|
|
|
|
;; Sink a `SinkableLoad` into a `RegMemImm.Mem`.
|
|
;;
|
|
;; This is a side-effectful operation that notifies the context that the
|
|
;; instruction that produced the `SinkableImm` has been sunk into another
|
|
;; instruction, and no longer needs to be lowered.
|
|
(decl sink_load (SinkableLoad) RegMemImm)
|
|
(extern constructor sink_load sink_load)
|
|
```
|
|
|
|
Finally, we can use `sinkable_load` and `sink_load` inside lowering rules that
|
|
create instructions where an operand is loaded directly from memory:
|
|
|
|
```
|
|
;; lower.isle
|
|
|
|
(rule (lower (has_type (fits_in_64 ty)
|
|
(iadd x (sinkable_load y))))
|
|
(value_reg (add ty
|
|
(put_in_reg x)
|
|
(sink_load y))))
|
|
```
|
|
|
|
See the `sinkable_load`, `SinkableLoad`, and `sink_load` declarations inside
|
|
`cranelift/codegen/src/isa/x64/inst.isle` as well as their external
|
|
implementations inside `cranelift/codegen/src/isa/x64/lower/isle.rs` for
|
|
details.
|
|
|
|
See also the "ISLE code should leverage types" section below.
|
|
|
|
## ISLE code should leverage types
|
|
|
|
ISLE is a typed language, and we should leverage that to prevent whole classes
|
|
of bugs where possible. Use newtypes liberally.
|
|
|
|
For example, use the `with_flags` family of helpers to pair flags-producing
|
|
instructions with flags-consuming instructions, ensuring that no errant
|
|
instructions are ever inserted between our flags-using instructions, clobbering
|
|
their flags. See `with_flags`, `ProducesFlags`, and `ConsumesFlags` inside
|
|
`cranelift/codegen/src/prelude.isle` for details.
|