Merge pull request #3556 from fitzgen/isle-integration-docs
Add a document describing how ISLE is integrated with Cranelift
cranelift/docs/isle-integration.md
# How ISLE is Integrated with Cranelift

This document contains an overview of and FAQ about how ISLE fits into
Cranelift.

## What is ISLE?

ISLE is a domain-specific language for authoring instruction selection and
rewrite rules. ISLE source text is [compiled down into Rust
code](https://github.com/bytecodealliance/wasmtime/tree/main/cranelift/isle#implementation).

TODO: link to @cfallin's language reference/tutorial

## How does ISLE integrate with the build system?

The build integration is inside of `cranelift/codegen/build.rs`.

For regular builds, we check a manifest that records the file hashes of the ISLE
source files that went into building a given ISLE-generated Rust file. If the
hashes of these files on disk don't match the hashes in the manifest, then the
ISLE-generated Rust file is stale and needs to be rebuilt. In this case, the
`build.rs` will report a build error. This way, downstream crates that use
Cranelift don't need to build ISLE, and get fewer transitive dependencies and
faster build times.
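
As a rough illustration of the staleness check described above, here is a
minimal Rust sketch. It is not the code in `build.rs`: the `Manifest` type and
the `hash_source`, `build_manifest`, and `stale_sources` functions are
hypothetical names, and `DefaultHasher` is only a standard-library stand-in for
whatever content hash the real manifest records.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical manifest: ISLE source path -> hash recorded when the
/// generated Rust file was last rebuilt.
pub type Manifest = BTreeMap<String, u64>;

/// Stand-in content hash (an assumption of this sketch, not the real scheme).
pub fn hash_source(contents: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    contents.hash(&mut hasher);
    hasher.finish()
}

/// Record the hash of every `(path, contents)` pair.
pub fn build_manifest(sources: &[(&str, &str)]) -> Manifest {
    sources
        .iter()
        .map(|(path, contents)| (path.to_string(), hash_source(contents)))
        .collect()
}

/// Return the paths whose current contents no longer match the manifest.
/// A non-empty result means the generated Rust file is stale.
pub fn stale_sources(manifest: &Manifest, sources: &[(&str, &str)]) -> Vec<String> {
    sources
        .iter()
        .filter(|(path, contents)| manifest.get(*path) != Some(&hash_source(contents)))
        .map(|(path, _)| path.to_string())
        .collect()
}
```

In this sketch, a regular build would call `stale_sources` and fail if the
result is non-empty, while a rebuild would regenerate the Rust file and then
call `build_manifest` again.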

To intentionally rebuild ISLE-generated Rust files, use the `rebuild-isle` Cargo
feature with `cranelift-codegen`:

```shell
$ cargo check -p cranelift-codegen --features rebuild-isle
```

When this feature is active, we rerun the ISLE compiler on the ISLE sources to
create the new versions of the ISLE-generated Rust files and update the manifest
files.

Additionally, the `cranelift-codegen-meta` crate will automatically generate
ISLE `extern` declarations and helpers for working with CLIF. The code that does
this is defined inside `cranelift/codegen/meta/src/gen_inst.rs` and it creates
the `cranelift/codegen/src/clif.isle` file.

## Where are the relevant files?

* `cranelift/isle`: The ISLE compiler's source code.

* `cranelift/codegen/src/prelude.isle`: Common definitions and declarations for
  ISLE. This gets included in every ISLE compilation.

* `cranelift/codegen/src/clif.isle`: Auto-generated declarations and helpers for
  working with CLIF inside ISLE. Generated by `cranelift/codegen/build.rs` when
  the `rebuild-isle` feature is enabled. This gets included in every ISLE
  compilation.

* `cranelift/codegen/src/machinst/isle.rs`: Common Rust code for gluing
  ISLE-generated code into a target architecture's backend. Contains
  implementations of ISA-agnostic `extern` helpers declared in ISLE.

* `cranelift/codegen/src/isa/<arch>/inst.isle`: ISA-specific ISLE
  helpers. Contains things like constructors for each instruction in the ISA, or
  helpers to get a specific register. Helps bridge the gap between the raw,
  non-SSA ISA and the pure, SSA view that the lowering rules have.

* `cranelift/codegen/src/isa/<arch>/lower.isle`: Instruction selection lowering
  rules for an ISA. These should be pure, SSA rewrite rules that lend
  themselves to eventual verification.

* `cranelift/codegen/src/isa/<arch>/lower/isle.rs`: The Rust glue code for
  integrating this ISA's ISLE-generated Rust code into the rest of the backend
  for this ISA. Contains implementations of ISA-specific `extern` helpers
  declared in ISLE.

* `cranelift/codegen/src/isa/<arch>/lower/isle/generated_code.rs`: The
  ISLE-generated Rust code to perform instruction selection and
  CLIF-to-`MachInst` lowering for each target architecture.

## Gluing ISLE's generated code into Cranelift

Each ISA-specific, ISLE-generated file is generic over a `Context` trait that
has a trait method for each `extern` helper defined in ISLE. There is one
concrete implementation of each of these traits, defined in
`cranelift/codegen/src/isa/<arch>/lower/isle.rs`. In general, the way that
ISLE-generated code is glued into the rest of the system is with these trait
implementations.

There may also be a `lower` function defined in `isle.rs` that encapsulates
creating the ISLE `Context` and calling into the generated code.
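
To make the shape of this glue concrete, here is a toy Rust sketch. It is an
illustration only, not the real generated code: `Value`, `Reg`,
`constructor_lower_iadd`, `IsleContext`, and the three trait methods are
hypothetical stand-ins chosen for this example.

```rust
/// Hypothetical stand-ins for a CLIF value and a machine register.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Value(pub u32);
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Reg(pub u32);

/// Plays the role of the generated trait: one method per `extern` helper.
pub trait Context {
    fn put_in_reg(&mut self, value: Value) -> Reg;
    fn temp_reg(&mut self) -> Reg;
    fn emit_add(&mut self, dst: Reg, lhs: Reg, rhs: Reg);
}

/// Plays the role of generated code: generic over `Context`, so it never
/// touches the backend directly.
pub fn constructor_lower_iadd<C: Context>(ctx: &mut C, x: Value, y: Value) -> Reg {
    let lhs = ctx.put_in_reg(x);
    let rhs = ctx.put_in_reg(y);
    let dst = ctx.temp_reg();
    ctx.emit_add(dst, lhs, rhs);
    dst
}

/// Plays the role of the single concrete implementation in `lower/isle.rs`.
#[derive(Default)]
pub struct IsleContext {
    next_temp: u32,
    pub emitted: Vec<String>,
}

impl Context for IsleContext {
    fn put_in_reg(&mut self, value: Value) -> Reg {
        // Toy mapping: pretend value N already lives in register N.
        Reg(value.0)
    }

    fn temp_reg(&mut self) -> Reg {
        // Temporaries are numbered from 101 upward in this sketch.
        self.next_temp += 1;
        Reg(100 + self.next_temp)
    }

    fn emit_add(&mut self, dst: Reg, lhs: Reg, rhs: Reg) {
        self.emitted
            .push(format!("add r{}, r{}, r{}", dst.0, lhs.0, rhs.0));
    }
}

/// Plays the role of the `lower` wrapper: create the context, then call
/// into the generated code.
pub fn lower(x: Value, y: Value) -> (Reg, Vec<String>) {
    let mut ctx = IsleContext::default();
    let dst = constructor_lower_iadd(&mut ctx, x, y);
    (dst, ctx.emitted)
}
```

The point of the indirection is that the generated function only knows about
the trait, so everything backend-specific stays in the hand-written
implementation.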

## Lowering rules are always pure, use SSA

The lowering rules themselves, defined in
`cranelift/codegen/src/isa/<arch>/lower.isle`, must always be a pure mapping
from a CLIF instruction to the target ISA's `MachInst`.

Examples of things that the lowering rules themselves shouldn't deal with or
talk about:

* Registers that are modified (both read and written to, violating SSA)
* Implicit uses of registers
* Maintaining use counts for each CLIF value or virtual register

Instead, these things should be handled by some combination of
`cranelift/codegen/src/isa/<arch>/inst.isle` and general Rust code (either in
`cranelift/codegen/src/isa/<arch>/lower/isle.rs` or elsewhere).

When an instruction modifies a register, both reading from it and writing to it,
we should build an SSA view of that instruction that gets legalized via "move
mitosis" by splitting a move out from the register.

For example, on x86 the `add` instruction reads and writes its first operand:

    add a, b    ==    a = a + b

So we present an SSA facade where `add` operates on three registers, instead of
two, and defines one of them, while reading the other two and leaving them
unmodified:

    add a, b, c    ==    a = b + c

Then, as an implementation detail of the facade, we emit moves as necessary:

    add a, b, c    ==>    mov a, b; add a, c

We call the process of emitting these moves "move mitosis". For ISAs with
ubiquitous use of modified registers and instructions in two-operand form, like
x86, we implement move mitosis with methods on the ISA's `MachInst`. For other
ISAs that are RISCier and where modified registers are pretty rare, such as
aarch64, we implement the handful of move mitosis special cases at the
`inst.isle` layer. Either way, the important thing is that the lowering rules
remain pure.
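
The expansion can be sketched in a few lines of Rust. This is a simplified
illustration, not Cranelift's `MachInst` code: `Reg`, `Inst`, and `add_ssa` are
hypothetical names, and the sketch assumes the destination is a fresh register
that does not alias the right-hand operand.

```rust
/// Hypothetical register name.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Reg(pub u8);

/// Two-operand machine instructions, as the hardware sees them.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Inst {
    /// `mov dst, src`: dst = src
    Mov { dst: Reg, src: Reg },
    /// `add dst, src`: dst = dst + src
    Add { dst: Reg, src: Reg },
}

/// Expand the three-operand SSA facade `dst = lhs + rhs` into two-operand
/// instructions, splitting out a move only when it is needed.
///
/// Assumption: `dst` is a fresh register distinct from `rhs`; otherwise the
/// move would clobber `rhs` before it is read.
pub fn add_ssa(dst: Reg, lhs: Reg, rhs: Reg) -> Vec<Inst> {
    let mut insts = Vec::new();
    if dst != lhs {
        insts.push(Inst::Mov { dst, src: lhs });
    }
    insts.push(Inst::Add { dst, src: rhs });
    insts
}
```

A lowering rule would only ever see the three-operand form; the move is an
implementation detail one layer below it.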

Finally, note that these moves are generally cleaned up by the register
allocator's move coalescing, and move mitosis will eventually go away completely
once we switch over to `regalloc2`, which takes instructions in SSA form
directly as input.

Instructions that implicitly operate on specific registers, or which require
that certain operands be in certain registers, are handled similarly: the
lowering rules use a pure paradigm that ignores these constraints and has
instructions that explicitly take implicit operands, and we ensure the
constraints are fulfilled a layer below the lowering rules (in `inst.isle` or in
Rust glue code).

## When are lowering rules allowed to have side effects?

Extractors (the matchers that appear on the left-hand sides of `rule`s) should
**never** have side effects. When evaluating a rule's extractors, we haven't yet
committed to evaluating that rule's right-hand side. If the extractors performed
side effects, we could get deeply confusing action-at-a-distance bugs where
rules we never fully match pull the rug out from under our feet.

Anytime you are tempted to perform side effects in an extractor, you should
instead just package up the things you would need in order to perform that side
effect, and then have a separate constructor that takes that package and
performs the side effect it describes. The constructor can only be called inside
a rule's right-hand side, which is only evaluated after we've committed to this
rule, which avoids the action-at-a-distance bugs described earlier.

For example, loads have a side effect in CLIF: they might trap. Therefore, even
if a loaded value is never used, we will emit code that implements that
load. But if we are compiling for x86, we can sink loads into the operand of
another operation, depending on how the loaded value is used. If we sink that
load into, say, an `add` then we need to tell the lowering context *not* to
lower the CLIF `load` instruction anymore, because it's effectively already
lowered as part of lowering the `add` that uses the loaded value. Marking an
instruction as "already lowered" is a side effect, and we might be tempted to
perform that side effect in the extractor that matches sinkable loads. But we
can't do that because although the load itself might be sinkable, there might be
a reason why we ultimately don't perform this load-sinking rule, and if that
happens we still need to lower the CLIF load.

Therefore, we make the `sinkable_load` extractor create a `SinkableLoad` type
that packages up everything we need to know about the load and how to tell the
lowering context that we've sunk it and the lowering context doesn't need to
lower it anymore, but *it doesn't actually tell that to the lowering context
yet*.

```lisp
;; inst.isle

;; A load that can be sunk into another operation.
(type SinkableLoad extern (enum))

;; Extract a `SinkableLoad` from a value if the value is defined by a compatible
;; load.
(decl sinkable_load (SinkableLoad) Value)
(extern extractor sinkable_load sinkable_load)
```

Then, we pair that with a `sink_load` constructor that takes the `SinkableLoad`,
performs the associated side effect of telling the lowering context not to lower
the load anymore, and returns the x86 operand with the load sunk into it.

```lisp
;; inst.isle

;; Sink a `SinkableLoad` into a `RegMemImm.Mem`.
;;
;; This is a side-effectful operation that notifies the context that the
;; instruction that produced the `SinkableLoad` has been sunk into another
;; instruction, and no longer needs to be lowered.
(decl sink_load (SinkableLoad) RegMemImm)
(extern constructor sink_load sink_load)
```

Finally, we can use `sinkable_load` and `sink_load` inside lowering rules that
create instructions where an operand is loaded directly from memory:

```lisp
;; lower.isle

(rule (lower (has_type (fits_in_64 ty)
                       (iadd x (sinkable_load y))))
      (value_reg (add ty
                      (put_in_reg x)
                      (sink_load y))))
```

See the `sinkable_load`, `SinkableLoad`, and `sink_load` declarations inside
`cranelift/codegen/src/isa/x64/inst.isle` as well as their external
implementations inside `cranelift/codegen/src/isa/x64/lower/isle.rs` for
details.
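
The same split can be sketched on the Rust side. The following is a toy model
under stated assumptions, not the x64 implementation: `InstId`, `LowerCtx`, and
the field layout of `SinkableLoad` and `RegMemImm` are hypothetical. Using
`&self` for the extractor and `&mut self` for the constructor is a design choice
of this sketch to make the purity visible in the types; it is not a claim about
the signatures in the generated `Context` trait.

```rust
use std::collections::HashSet;

/// Hypothetical identifier for a CLIF instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct InstId(pub u32);

/// The package produced by the extractor: it describes the load, but
/// creating it performs no side effect.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SinkableLoad {
    pub load: InstId,
    pub addr: u64,
}

/// Simplified operand type; only the memory case is modeled here.
#[derive(Debug, PartialEq, Eq)]
pub enum RegMemImm {
    Mem { addr: u64 },
}

/// Toy lowering context.
#[derive(Default)]
pub struct LowerCtx {
    /// Loads that are candidates for sinking, with their addresses.
    pub loads: Vec<(InstId, u64)>,
    /// Instructions that no longer need to be lowered on their own.
    pub sunk: HashSet<InstId>,
}

impl LowerCtx {
    /// Extractor: pure. It takes `&self`, so it cannot mark anything as sunk.
    pub fn sinkable_load(&self, inst: InstId) -> Option<SinkableLoad> {
        self.loads
            .iter()
            .find(|(id, _)| *id == inst)
            .map(|&(load, addr)| SinkableLoad { load, addr })
    }

    /// Constructor: performs the side effect described by the package.
    pub fn sink_load(&mut self, load: SinkableLoad) -> RegMemImm {
        self.sunk.insert(load.load);
        RegMemImm::Mem { addr: load.addr }
    }
}
```

If a rule matches `sinkable_load` but is then abandoned, `sunk` is untouched and
the load is still lowered normally.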

See also the "ISLE code should leverage types" section below.

## ISLE code should leverage types

ISLE is a typed language, and we should leverage that to prevent whole classes
of bugs where possible. Use newtypes liberally.

For example, use the `with_flags` family of helpers to pair flags-producing
instructions with flags-consuming instructions, ensuring that no errant
instructions are ever inserted between our flags-using instructions, clobbering
their flags. See `with_flags`, `ProducesFlags`, and `ConsumesFlags` inside
`cranelift/codegen/src/prelude.isle` for details.
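
The newtype idea carries over directly to Rust. Here is a minimal sketch of the
pattern, assuming a toy `Inst` type; it mirrors the names `ProducesFlags`,
`ConsumesFlags`, and `with_flags` for readability but is not the definition
from `prelude.isle`.

```rust
/// Toy instruction: just its assembly text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Inst(pub &'static str);

/// Newtype for an instruction that writes the flags register.
pub struct ProducesFlags(pub Inst);

/// Newtype for an instruction that reads the flags register.
pub struct ConsumesFlags(pub Inst);

/// Emit the producer and consumer back to back. Because this is the only
/// function in the sketch that unwraps the newtypes, nothing can be emitted
/// between the two instructions and clobber the flags.
pub fn with_flags(producer: ProducesFlags, consumer: ConsumesFlags, out: &mut Vec<Inst>) {
    out.push(producer.0);
    out.push(consumer.0);
}
```

Passing a plain `Inst` where a `ProducesFlags` is expected is a type error, which
is the class of bug the newtypes are meant to rule out.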