From f148b50201a8c2ecb6e02dc8f166cc5eacc4d571 Mon Sep 17 00:00:00 2001 From: Nick Fitzgerald Date: Fri, 19 Nov 2021 14:19:57 -0800 Subject: [PATCH 1/2] Add a document describing how ISLE is integrated with Cranelift --- cranelift/docs/isle-integration.md | 188 +++++++++++++++++++++++++++++ 1 file changed, 188 insertions(+) create mode 100644 cranelift/docs/isle-integration.md diff --git a/cranelift/docs/isle-integration.md b/cranelift/docs/isle-integration.md new file mode 100644 index 0000000000..8e4880d3b9 --- /dev/null +++ b/cranelift/docs/isle-integration.md @@ -0,0 +1,188 @@ +# How ISLE is Integrated with Cranelift + +This document contains an overview of and FAQ about how ISLE fits into +Cranelift. + +## What is ISLE? + +ISLE is a domain-specific language for authoring instruction selection and +rewrite rules. ISLE source text is [compiled down into Rust +code](https://github.com/bytecodealliance/wasmtime/tree/main/cranelift/isle#implementation). + +TODO: link to @cfallin's language reference/tutorial + +## How does ISLE integrate with the build system? + +The build integration is inside of `cranelift/codegen/build.rs`. + +For regular builds, we check a manifest that records the file hashes of the ISLE +source files that went into building a given ISLE-generated Rust file. If the +hashes of these files on disk doesn't match the hashes in the manifest, then the +ISLE-generated Rust file is stale and needs to be rebuilt. In this case, the +`build.rs` will report a build error. This way, downstream crates that use +Cranelift don't need to build ISLE, and get fewer transitive dependencies and +faster build times. + +To intentionally rebuild ISLE-generated Rust files, use the `rebuild-isle` Cargo +feature with `cranelift-codegen`: + +```shell +$ cargo check -p cranelift-codegen --features rebuild-isle +``` + +When this feature is active, we rerun the ISLE compiler on the ISLE sources to +create the new versions of the ISLE-generated Rust files and update the manifest +files. + +Additionally, the `cranelift-codegen-meta` crate will automatically generate +ISLE `extern` declarations and helpers for working with CLIF. The code that does +this is defined inside `cranelift/codegen/meta/src/gen_inst.rs` and it creates +the `cranelift/codegen/src/clif.isle` file. + +## Where are the relevant files? + +* `cranelift/isle`: The ISLE compiler's source code. + +* `cranelift/codegen/src/prelude.isle`: Common definitions and declarations for + ISLE. This gets included in every ISLE compilation. + +* `cranelift/codegen/src/clif.isle`: Auto-generated declarations and helpers for + working with CLIF inside ISLE. Generated by `cranelift/codegen/build.rs` when + the `rebuild-isle` feature is enabled. This gets included in every ISLE + compilation. + +* `cranelift/codegen/src/machinst/isle.rs`: Common Rust code for gluing + ISLE-generated code into a target architecture's backend. Contains + implementations of ISA-agnostic `extern` helpers declared in ISLE. + +* `cranelift/codegen/src/isa//inst.isle`: ISA-specific ISLE + helpers. Contains things like constructors for each instruction in the ISA, or + helpers to get a specific register. Helps bridge the gap between the raw, + non-SSA ISA and the pure, SSA view that the lowering rules have. + +* `cranelift/codegen/src/isa//lower.isle`: Instruction selection lowering + rules for an ISA. These should be pure, SSA rewrite rules, that lend + themselves to eventual verification. + +* `cranelift/codegen/src/isa//lower/isle.rs`: The Rust glue code for + integrating this ISA's ISLE-generate Rust code into the rest of the backend + for this ISA. Contains implementations of ISA-specific `extern` helpers + declared in ISLE. + +* `cranelift/codegen/src/isa//lower/isle/generated_code.rs`: The + ISLE-generated Rust code to perform instruction and CLIF-to-`MachInst` + lowering for each target architecture. + +## Gluing ISLE's generated code into Cranelift + +Each ISA-specific, ISLE-generated file is generic over a `Context` trait that +has a trait method for each `extern` helper defined in ISLE. There is one +concrete implementation of each of these traits, defined in +`cranelift/codegen/src/isa//lower/isle.rs`. In general, the way that +ISLE-generated code is glued into the rest of the system is with these trait +implementations. + +There may also be a `lower` function defined in `isle.rs` that encapsulates +creating the ISLE `Context` and calling into the generated code. + +## Lowering rules are always pure, use SSA + +The lowering rules themselves, defined in +`cranelift/codegen/src/isa//lower.isle`, must always be a pure mapping +from a CLIF instruction to the target ISA's `MachInst`. + +Examples of things that the lowering rules themselves they shouldn't deal with +or talk about: + +* Registers that are modified (both read and written to, violating SSA) +* Implicit uses of registers +* Maintaining use counts for each CLIF value or virtual register + +Instead, these things should be handled by some combination of +`cranelift/codegen/src/isa//inst.isle` and general Rust code (either in +`cranelift/codegen/src/isa//lower/isle.rs` or elsewhere). + +When an instruction modifies a register, both reading from it and writing to it, +we should build an SSA view of that instruction that gets legalized via "move +mitosis" by splitting a move out from the register. + +For example, on x86 the `add` instruction reads and writes its first operand: + + add a, b == a = a + b + +So we present an SSA facade where `add` operates on three registers, instead of +two, and defines one of them, while reading the other two and leaving them +unmodified: + + add a, b, c == a = b + c + +Then, as an implementation detail of the facade, we emit moves as necessary: + + add a, b, c ==> mov a, b; add b, c + +We call the process of emitting these moves "move mitosis". For ISAs with +ubiquitous use of modified registers and instructions in two-operand form, like +x86, we implement move mitosis with methods on the ISA's `MachInst`. For other +ISAs that are RISCier and where modified registers are pretty rare, such as +aarch64, we implement the handful of move mitosis special cases at the +`inst.isle` layer. Either way, the important thing is that the lowering rules +remain pure. + +Fnally, note that these moves are generally cleaned up by the register +allocator's move coalescing, and move mitosis will eventually go away completely +once we switch over to `regalloc2`, which takes instructions in SSA form +directly as input. + +Instructions that implicitly operate on specific registers, or which require +that certain operands be in certain registers, are handled similarly: the +lowering rules use a pure paradigm that ignores these constraints and has +instructions that explicitly take implicit operands, and we ensure the +constraints are fulfilled a layer below the lowering rules (in `inst.isle` or in +Rust glue code). + +## When are lowering rules allowed to have side effects? + +Extractors (the matchers that appear on the left-hand sides of `rule`s) should +**never** have side effects. When evaluating a rule's extractors, we haven't yet +committed to evaluating that rule's right-hand side. If the extractors performed +side effects, we could get deeply confusing action-at-a-distance bugs where +rules we never fully match pull the rug out from under our feet. + +Anytime you are tempted to perform side effects in an extractor, you should +instead just package up the things you would need in order to perform that side +effect, and then have a separate constructor that takes that package and +performs the side effect it describes. The constructor can only be called inside +a rule's right-hand side, which is only evaluated after we've committed to this +rule, which avoids the action-at-a-distance bugs described earlier. + +For example, loads have a side effect in CLIF: they might trap. Therefore, even +if a loaded value is never used, we will emit code that implements that +load. But if we are compiling for x86 we can sink loads into other the operand +for another operation depending on how the loaded value is used. If we sink that +load into, say, an `add` then we need to tell the lowering context *not* to +lower the CLIF `load` instruction anymore, because its effectively already +lowered as part of lowering the `add` that uses the loaded value. That is a side +effect, and we might be tempted to perform that side effect in the extractor +that matches sinkable loads. But we can't do that because although the load +itself might be sinkable, there might be a reason why we ultimately don't +perform this load-sinking rule, and if that happens we still need to lower the +CLIF load. So we make the `sinkable_load` extractor create a `SinkableLoad` type +that packages up everything we need to know about the load and how to tell the +lowering context that we've sunk it and the lowering context doesn't need to +lower it anymore, but *it doesn't actually tell that to the lowering context +yet*. Then, we pair that with a `sink_load` constructor that takes the +`SinkableLoad`, performs the associated side effect of telling the lowering +context not to lower the load anymore, and returns the x86 operand with the load +sunken into it. See `sinkable_load`, `SinkableLoad`, and `sink_load` inside +`cranelift/codegen/src/isa/x64/inst.isle` for details. + +## ISLE code should leverage types + +ISLE is a typed language, and we should leverage that to prevent whole classes +of bugs where possible. Use newtypes liberally. + +For example, use the `with_flags` family of helpers to pair flags-producing +instructions with flags-consuming instructions, ensuring that no errant +instructions caare ever inserted between our flags-using instructions, +clobbering their flags. See `with_flags`, `ProducesFlags`, and `ConsumesFlags` +inside `cranelift/codegen/src/prelude.isle` for details. From 66432ab4612f859adc4c113da6d6961f3b9f7fe8 Mon Sep 17 00:00:00 2001 From: Nick Fitzgerald Date: Mon, 29 Nov 2021 16:12:45 -0800 Subject: [PATCH 2/2] docs: Clarify isle integration docs --- cranelift/docs/isle-integration.md | 82 +++++++++++++++++++++++------- 1 file changed, 64 insertions(+), 18 deletions(-) diff --git a/cranelift/docs/isle-integration.md b/cranelift/docs/isle-integration.md index 8e4880d3b9..1f7c7154f8 100644 --- a/cranelift/docs/isle-integration.md +++ b/cranelift/docs/isle-integration.md @@ -17,7 +17,7 @@ The build integration is inside of `cranelift/codegen/build.rs`. For regular builds, we check a manifest that records the file hashes of the ISLE source files that went into building a given ISLE-generated Rust file. If the -hashes of these files on disk doesn't match the hashes in the manifest, then the +hashes of these files on disk don't match the hashes in the manifest, then the ISLE-generated Rust file is stale and needs to be rebuilt. In this case, the `build.rs` will report a build error. This way, downstream crates that use Cranelift don't need to build ISLE, and get fewer transitive dependencies and @@ -91,8 +91,8 @@ The lowering rules themselves, defined in `cranelift/codegen/src/isa//lower.isle`, must always be a pure mapping from a CLIF instruction to the target ISA's `MachInst`. -Examples of things that the lowering rules themselves they shouldn't deal with -or talk about: +Examples of things that the lowering rules themselves shouldn't deal with or +talk about: * Registers that are modified (both read and written to, violating SSA) * Implicit uses of registers @@ -128,7 +128,7 @@ aarch64, we implement the handful of move mitosis special cases at the `inst.isle` layer. Either way, the important thing is that the lowering rules remain pure. -Fnally, note that these moves are generally cleaned up by the register +Finally, note that these moves are generally cleaned up by the register allocator's move coalescing, and move mitosis will eventually go away completely once we switch over to `regalloc2`, which takes instructions in SSA form directly as input. @@ -161,20 +161,66 @@ load. But if we are compiling for x86 we can sink loads into other the operand for another operation depending on how the loaded value is used. If we sink that load into, say, an `add` then we need to tell the lowering context *not* to lower the CLIF `load` instruction anymore, because its effectively already -lowered as part of lowering the `add` that uses the loaded value. That is a side -effect, and we might be tempted to perform that side effect in the extractor -that matches sinkable loads. But we can't do that because although the load -itself might be sinkable, there might be a reason why we ultimately don't -perform this load-sinking rule, and if that happens we still need to lower the -CLIF load. So we make the `sinkable_load` extractor create a `SinkableLoad` type +lowered as part of lowering the `add` that uses the loaded value. Marking an +instruction as "already lowered" is a side effect, and we might be tempted to +perform that side effect in the extractor that matches sinkable loads. But we +can't do that because although the load itself might be sinkable, there might be +a reason why we ultimately don't perform this load-sinking rule, and if that +happens we still need to lower the CLIF load. + +Therefore, we make the `sinkable_load` extractor create a `SinkableLoad` type that packages up everything we need to know about the load and how to tell the lowering context that we've sunk it and the lowering context doesn't need to lower it anymore, but *it doesn't actually tell that to the lowering context -yet*. Then, we pair that with a `sink_load` constructor that takes the -`SinkableLoad`, performs the associated side effect of telling the lowering -context not to lower the load anymore, and returns the x86 operand with the load -sunken into it. See `sinkable_load`, `SinkableLoad`, and `sink_load` inside -`cranelift/codegen/src/isa/x64/inst.isle` for details. +yet*. + +```lisp +;; inst.isle + +;; A load that can be sunk into another operation. +(type SinkableLoad extern (enum)) + +;; Extract a `SinkableLoad` from a value if the value is defined by a compatible +;; load. +(decl sinkable_load (SinkableLoad) Value) +(extern extractor sinkable_load sinkable_load) +``` + +Then, we pair that with a `sink_load` constructor that takes the `SinkableLoad`, +performs the associated side effect of telling the lowering context not to lower +the load anymore, and returns the x86 operand with the load sunken into it. + +```lisp +;; inst.isle + +;; Sink a `SinkableLoad` into a `RegMemImm.Mem`. +;; +;; This is a side-effectful operation that notifies the context that the +;; instruction that produced the `SinkableImm` has been sunk into another +;; instruction, and no longer needs to be lowered. +(decl sink_load (SinkableLoad) RegMemImm) +(extern constructor sink_load sink_load) +``` + +Finally, we can use `sinkable_load` and `sink_load` inside lowering rules that +create instructions where an operand is loaded directly from memory: + +``` +;; lower.isle + +(rule (lower (has_type (fits_in_64 ty) + (iadd x (sinkable_load y)))) + (value_reg (add ty + (put_in_reg x) + (sink_load y)))) +``` + +See the `sinkable_load`, `SinkableLoad`, and `sink_load` declarations inside +`cranelift/codegen/src/isa/x64/inst.isle` as well as their external +implementations inside `cranelift/codegen/src/isa/x64/lower/isle.rs` for +details. + +See also the "ISLE code should leverage types" section below. ## ISLE code should leverage types @@ -183,6 +229,6 @@ of bugs where possible. Use newtypes liberally. For example, use the `with_flags` family of helpers to pair flags-producing instructions with flags-consuming instructions, ensuring that no errant -instructions caare ever inserted between our flags-using instructions, -clobbering their flags. See `with_flags`, `ProducesFlags`, and `ConsumesFlags` -inside `cranelift/codegen/src/prelude.isle` for details. +instructions are ever inserted between our flags-using instructions, clobbering +their flags. See `with_flags`, `ProducesFlags`, and `ConsumesFlags` inside +`cranelift/codegen/src/prelude.isle` for details.