From b6bed81ba279d87d8a8ecb6c6066b49a4d75a724 Mon Sep 17 00:00:00 2001
From: Chris Fallin
Date: Tue, 23 Nov 2021 16:45:06 -0800
Subject: [PATCH] Add ISLE reference documentation.

This documentation provides details for all of the ISLE language
features, and detailed rationale for why many of them are designed in
the way that they are. It is hopefully both a reasonable tutorial and
reference for someone looking to understand the DSL.

Note that this documentation is separate from and orthogonal to the
work to document the Cranelift bindings and integration work that
@fitzgen has covered well in #3556. This document can link to that one
and vice-versa once they are both in-tree.
---
 cranelift/docs/index.md  |    4 +
 cranelift/docs/isle.md   | 1306 ++++++++++++++++++++++++++++++++++++++
 cranelift/isle/README.md |   13 +-
 3 files changed, 1313 insertions(+), 10 deletions(-)
 create mode 100644 cranelift/docs/isle.md

diff --git a/cranelift/docs/index.md b/cranelift/docs/index.md
index 62c4e51bd8..9e0f216b66 100644
--- a/cranelift/docs/index.md
+++ b/cranelift/docs/index.md
@@ -13,6 +13,10 @@
   - [Cranelift's register allocator](regalloc.md)
     This page document Cranelift's current register allocator.
+
+  - [ISLE](isle.md)
+    This page documents the domain-specific language (DSL), ISLE, that
+    we use to define instruction-lowering patterns.
 
 ## Cranelift crate documentation:

diff --git a/cranelift/docs/isle.md b/cranelift/docs/isle.md
new file mode 100644
index 0000000000..565ccc4e3c
--- /dev/null
+++ b/cranelift/docs/isle.md
@@ -0,0 +1,1306 @@
+# ISLE: Instruction Selection Lowering Expressions
+
+This document will describe ISLE (Instruction Selection Lowering
+Expressions), a DSL (domain-specific language) that we have developed
+in order to help us express certain parts of the Cranelift compiler
+backend more naturally.
+ISLE was first [described in RFC
+#15](https://github.com/bytecodealliance/rfcs/pull/15) and now is used
+by and lives in the Cranelift tree in
+[cranelift/isle](https://github.com/bytecodealliance/wasmtime/tree/main/cranelift/isle).
+
+## Intro and Whirlwind Tour: DSL for Instruction Lowering
+
+The goal of ISLE is to represent *instruction lowering patterns*. An
+instruction lowering pattern is a specification that a certain
+combination of operators in the IR (CLIF), when combined under certain
+conditions, can be compiled down into a certain sequence of machine
+instructions. For example:
+
+- An `iadd` (integer add) operator can always be lowered to an x86
+  `ADD` instruction with two register sources.
+
+- An `iadd` operator with one `iconst` (integer-constant) argument can
+  be lowered to an x86 `ADD` instruction with a register and an
+  immediate.
+
+One could write something like the following in ISLE (simplified from
+the real code [here](../codegen/src/isa/x64/lower.isle)):
+
+```lisp
+;; Add two registers.
+(rule (lower (iadd x y))
+      (value_reg (add
+                   (put_in_reg x)
+                   (RegMemImm.Reg (put_in_reg y)))))
+
+;; Add a register and an immediate.
+(rule (lower (iadd x (simm32_from_value y)))
+      (value_reg (add
+                   (put_in_reg x)
+                   ;; `y` is a `RegMemImm.Imm`.
+                   y)))
+```
+
+ISLE lets the compiler backend developer express this information in a
+declarative way -- i.e., just write down a list of patterns, without
+worrying how the compilation process tries them out -- and the ISLE
+DSL compiler will convert this list of patterns into efficient Rust
+code that becomes part of Cranelift.
+
+The rest of this document will describe the semantics of the DSL
+itself. ISLE has been designed to be a general-purpose DSL that can
+apply to any sort of backtracking pattern-matching problem, and will
+generate a decision tree in Rust that can call into arbitrary
+interface code.
+
+Separate documentation will describe how we have written *bindings*
+and *helpers* to allow ISLE to be specifically used to write Cranelift
+lowering patterns like the above. (TODO: link this documentation)
+
+## Outline of This Document
+
+This document is organized into the following sections:
+
+* Term-Rewriting Systems: a general overview of how term-rewriting
+  systems work, how to think about nested terms, patterns and rewrite
+  rules, how they provide a general mechanism for computation, and how
+  term-rewriting is often used in a compiler-implementation context.
+
+* Core ISLE: the foundational concepts of the ISLE DSL, building upon
+  a general-purpose term-rewriting base. Covers the type system (typed
+  terms) and how rules are written.
+
+* ISLE with Rust: covers how ISLE provides an "FFI" (foreign function
+  interface) of sorts to allow interaction with Rust code, and
+  describes the scheme by which ISLE execution is mapped onto Rust
+  (data structures and control flow).[^1]
+
+* ISLE Internals: describes how the ISLE compiler works. Provides
+  insight into how an unordered collection of rewrite rules is
+  combined into executable Rust code that efficiently traverses the
+  input and matches on it.
+
+[^1]: One might call this the BRIDGE (Basic Rust Interface Designed
+      for Good Efficiency) to the ISLE, but unfortunately we missed
+      the chance to introduce that backronym when we wrote the initial
+      implementation.
+
+## Term-Rewriting Systems
+
+A [term-rewriting
+system](https://en.wikipedia.org/wiki/Rewriting#Term_rewriting_systems),
+or TRS, is a system that works by representing data as *terms* and
+then applying *rules* to "rewrite" the terms until no more rules are
+applicable, at which point the resulting term is the system's output.
+
+Term-rewriting systems are a general kind of computing system, at the
+same level as (e.g.) Turing machines or other abstract computing
+machines.
+Term-rewriting is actually Turing-complete, or in other words, can
+express any program, if no limits are placed on term length or
+recursion.[^2]
+
+[^2]: In fact, the [lambda
+      calculus](https://en.wikipedia.org/wiki/Lambda_calculus)
+      introduced by Alonzo Church is actually a term-rewriting system
+      and was developed at the same time as Turing's concepts of
+      universal computation!
+
+Why might one want to use a TRS rather than some other, more
+conventional, way of computing an answer? One reason is that they are
+highly applicable to *pattern-matching* problems: for example,
+translating data in one domain to data in another domain, where the
+translation consists of a bunch of specific equivalences. This is part
+of why term-rewriting is so interesting in the compiler domain:
+compiler backends work to lower certain patterns in the program (e.g.:
+a multiply-add combination) into instructions that the target machine
+provides (e.g.: a dedicated multiply-add instruction).
+
+Term rewriting as a process also naturally handles issues of
+*priority*, i.e. applying a more specific rule before a less specific
+one. This is because the abstraction allows for multiple rules to be
+"applicable", and so there is a natural place to reason about priority
+when we choose which rule to apply.
+
+Additionally, term rewriting allows for a sort of *modularity* that is
+not present in hand-written pattern-matching code: the specific rules
+can be specified in any order, and the term-rewriting engine "weaves"
+them together so that in any state, when we have partially matched the
+input and are narrowing down which rule will apply, we consider all
+the related rules at once. Said another way: hand-written code tends
+to accumulate a lot of nested conditionals and switch/match
+statements, i.e., resembles a very large decision tree, while
+term-rewriting code tends to resemble a flat list of simple patterns
+that, when composed and combined, become that more complex tree.
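+
+To make this contrast concrete, here is a small hypothetical sketch
+(the rule syntax is introduced in the next sections, and the terms
+`A`, `B`, `C`, `X`, `Y`, and `Z` are purely illustrative): a flat list
+of rules, and the nested decision tree that hand-written code for the
+same logic tends to become.
+
+```plain
+  ;; Flat, order-independent rules:
+  (A (B x)) -> (X x)
+  (A (C x)) -> (Y x)
+  (A _)     -> (Z)
+
+  ;; The decision tree the engine weaves them into, as hand-written
+  ;; code would express it:
+  ;;   match on A's argument:
+  ;;     (B x)     => produce (X x)
+  ;;     (C x)     => produce (Y x)
+  ;;     otherwise => produce (Z)
+```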
+
+### Data: Nested Trees of Constructors
+
+A term-rewriting system typically operates on data that is in a *tree*
+form, or at least can be interpreted that way.[^3]
+
+[^3]: In the most fundamental and mathematical sense, a TRS just
+      operates on a sequence of symbols, but we can talk about
+      structure that is present in those symbols in any well-formed
+      sequence. For example, we can define a TRS that only operates on
+      terms with balanced parentheses; then we have our tree.
+
+In ISLE and hence in this document, we operate on terms that are
+written in an
+[S-expression](https://en.wikipedia.org/wiki/S-expression) syntax,
+borrowed from the Lisp world. So we might have a term:
+
+```lisp
+  (a (b c 1 2) (d) (e 3 4))
+```
+
+which we can write more clearly as the tree:
+
+```lisp
+  (a
+    (b
+      c 1 2)
+    (d)
+    (e
+      3 4))
+```
+
+Each term consists of either a *constructor* (which looks like a
+function call to Lisp-trained eyes) or a *primitive*. In the above,
+the `(a ...)`, `(b ...)`, `(d)`, and `(e ...)` terms/subterms are
+constructor invocations. A constructor takes some number of arguments
+(its *arity*), each of which is itself a term. Primitives can be
+things like integer, string, symbolic, or boolean constants, or
+variable names.
+
+Some term-rewriting systems have other syntax conventions: for
+example, systems based on
+[Prolog](https://en.wikipedia.org/wiki/Prolog) tend to write terms
+like `a(b(c, 1, 2), d, e(3, 4))`, i.e., with the name of the term on
+the outside of the parentheses. This is just a cosmetic difference
+from the above, but we note it to make clear that the term structure
+is important, not the syntax.
+
+It may not be immediately clear how to use this data representation,
+but we can give a small flavor here: if one defines *constructors* for
+each instruction or operator in a compiler's intermediate
+representation (IR), one can start to write expressions from that IR
+as terms; for example:
+
+```lisp
+  v1 = imul y, z
+  v2 = iadd x, v1
+```
+
+could become:
+
+```lisp
+  (iadd x (imul y z))
+```
+
+This will become much more useful once we have rewrite rules to
+perform transformations on the terms!
+
+Representing an IR is, of course, just one possible use of term data
+(albeit the original "MVP" that guided ISLE's design); there are many
+others, too. Interested readers are encouraged to read more on, e.g.,
+[Prolog](https://en.wikipedia.org/wiki/Prolog), which has been used to
+represent logical predicates, "facts" in expert systems, symbolic
+mathematical terms, and more.
+
+### Rules: Left-hand-side Patterns, Right-hand-side Expressions
+
+The heart of a term-rewriting system is in the set of *rules* that
+actually perform the rewrites. The "program" itself, in a
+term-rewriting DSL, consists simply of an unordered list of
+rules. Each rule may or may not apply; if it applies, then it can be
+used to edit the term. Execution consists of repeated application of
+rules until no more rules can be applied.
+
+A rule consists of two parts: the left-hand side (LHS), or *pattern*,
+and right-hand side (RHS), or *expression*. The left-hand and
+right-hand nomenclature comes from a common way of writing rules as:
+
+```plain
+  A -> B             ;; any term "A" is rewritten to "B"
+
+  (A x) -> (B (C x)) ;; any term (A x), for some x, is rewritten
+                     ;; to (B (C x)).
+
+  (A _) -> (D)       ;; any term (A _), where `_` is a wildcard
+                     ;; (any subterm), is rewritten to (D).
+```
+
+#### Left-hand Sides: Patterns
+
+Each left-hand side is written in a pattern language that commonly has
+a few different kinds of "matchers", or operators that can match
+subterms:
+
+* `(A pat1 pat2 ...)` matches a constructor `A` with patterns for each
+  of its arguments.
+
+* `x` matches any subterm and captures its value in a variable
+  binding, which can be used later when we specify the right-hand side
+  (so that the rewrite contains parts of the original term).
+
+* `_` is a wildcard and matches anything, without capturing it.
+
+* Primitive constant values, such as `42` or `$Symbol`, match only if
+  the term is exactly equal to this constant.
+
+These pattern-matching operators can be combined, so we could write,
+for example, `(A (B x _) z)`. This pattern would match the term
+`(A (B 1 2) 3)` but not `(A (C 4 5) 6)`.
+
+A pattern can properly be seen as a partial function from input term
+to captured (bound) variables: it either matches or it doesn't, and if
+it does, it provides specific term values for each variable binding
+that can be used by the right-hand side.
+
+A fully-featured term rewriting system usually has other operators as
+well, for convenience: for example, "match already-captured value", or
+"bind variable to subterm and also match it with subpattern", or
+"match subterm with all of these subpatterns". But even the above is
+powerful enough for Turing-complete term reduction, surprisingly; the
+essence of term-rewriting's power is just its ability to trigger
+different rules on different "shapes" of the tree of constructors in
+the input, and on special cases for certain argument values.
+
+Pattern-based term rewriting has a notable and important feature: it
+typically allows *overlapping* rules. This means that more than one
+pattern might match on the input. For example, the two rules:
+
+```plain
+  (A (B x)) -> (C x)
+  (A _) -> (D)
+```
+
+could *both* apply to an input term `(A (B 1))`.
+The first rule would rewrite this input to `(C 1)`, and the second
+rule would rewrite it to `(D)`. Either rewrite would be an acceptable
+execution step under the base semantics of most term-rewriting
+systems; ordinarily, the *correctness* of the rewrite should not
+depend on which rule is chosen, only possibly the "optimality" of the
+output (whatever that means for the application domain in question) or
+the number of rewrite steps to get there.
+
+However, in order to provide a deterministic answer, the system must
+somehow specify which rule will be applied in such a situation based
+on precedence, or specificity, or some other tie-breaker. A common
+heuristic is "more specific rule wins". We will see how ISLE resolves
+this question below by using both this heuristic and an explicit
+priority mechanism.
+
+#### Right-hand Sides: Rewrite Expressions
+
+Given a rule whose pattern has matched, we now need to compute the
+rewritten term that replaces the original input term. This rewrite is
+specified by the right-hand side (RHS), which consists of an
+*expression* that generates a new term. This expression can use parts
+of the input term that have been captured by variables in the
+pattern.
+
+We have already seen a few examples of this above: simple term
+expressions, with variables used in place of concrete subterms where
+desired. A typical term-rewrite system allows just a few options in
+the output expression:
+
+* Terms, with sub-expressions as arguments;
+* Constant primitives (`42` or `$Symbol`); and
+* Captured variable values (`x`).
+
+The options are more limited in expressions than in patterns (e.g.,
+there are no wildcards) because a pattern is matching on a range of
+possible terms while an expression must specify a specific rewrite
+result.
+
+### Rewrite Steps and Intermediate Terms
+
+Now that we can specify rewrites via a list of rules, we can study how
+the top-level execution of a term-rewriting system proceeds.
+Much of the power of term-rewriting comes from the fact that rewrites
+can *chain together* into a multi-step traversal through several
+intermediate terms before the final answer is computed.
+
+For a simple example, consider the following rules:
+
+```plain
+  (A (B x)) -> (C x)
+  (C (D x)) -> (E x)
+  (C (F x)) -> (G x)
+```
+
+This set of rules will rewrite `(A (B (D 42)))` to `(C (D 42))`, then
+to `(E 42)` (via the first and second rules respectively).
+
+How is this useful? First, rewriting one term to another (here, `C` at
+the top level) that in turn appears in the left-hand side of other
+rules allows us to *factor* a "program" of term-rewriting rules in the
+same way that imperative programs are factored into separate
+functions.[^4] The usual advantages of a well-factored program, where
+each problem is solved with a small step that "reduces to a previously
+solved problem", apply here.
+
+Second, repeating the rewrite step is actually what grants
+term-rewriting its Turing-completeness: it allows for arbitrary
+control flow.[^5] This might be useful in cases where, for example, a
+term-rewriting program needs to "loop" over a list of elements in the
+input: it can recurse and use intermediate terms to store state.
+
+While this full generality may not be used often in the
+domain-specific applications of term-rewriting that emphasize its
+pattern-matching (such as instruction selectors), the user should not
+be afraid to define and use intermediate terms -- rewriting into them,
+then defining additional rules to rewrite further -- when it helps to
+factor common behavior out of multiple rules, or aids in conceptual
+clarity.
+
+[^4]: In fact, ISLE actually compiles rules for different top-level
+      pattern terms (`(A ...)` and `(C ...)` in the example) into
+      separate Rust functions, so factoring rules to use intermediate
+      terms can provide code-size and compile-time benefits for the
+      ISLE-generated Rust code as well.
+
+[^5]: The [lambda calculus' reduction
+      rules](https://en.wikipedia.org/wiki/Lambda_calculus#Reduction)
+      are a good example of this.
+
+### Application to Compilers: A Term is a Value; Rewrites are Lowerings
+
+So far this has been a fairly abstract introduction to term-rewriting
+systems as a general computing paradigm. How does this relate (or, how
+is it commonly mapped) to the instruction-selection problem?
+
+In a domain such as instruction selection, we manipulate terms that
+represent computations described by an IR, and the terms are
+eventually rewritten into terms that name specific machine
+instructions. We can think of each term as having a denotational value
+that *is* that program value. Then, any rewrite is correct if it
+preserves the denotational value of the term.
+
+In other words, terms are just values, and rules specify alternative
+ways to compute the same values. We might have rewrite rules that
+correspond to common algebraic identities (`a + b` == `b + a`, and
+`a + 0` == `a`), for example. The main sort of rewrite rule, however,
+will be one that takes a machine-*independent* operator term and
+rewrites it into a machine-*dependent* instruction term. For example:
+
+```plain
+  (iadd a b) -> (isa.add_reg_reg a b)
+
+  (iadd a (iconst 0)) -> a
+
+  (iadd a (iconst n)) -> (isa.add_reg_imm a n)
+```
+
+These rules specify three ways to convert an IR `iadd` operator into
+an ISA-specific instruction.
+Recall from above that in general, an application of a term-rewriting
+system should not depend for correctness on the order or choice of
+rule application: when multiple rules are applicable, then any
+sequence of rewrites that ends in a terminating state (a state with no
+further-applicable rules) should be considered a "correct"
+answer.[^6] Here, this is true: if, for example, we choose the
+register-plus-register form (the first rule) for an `iadd` operation,
+but the second argument is actually an `iconst`, then that is still
+valid, and the `iconst` will separately be rewritten by some other
+rule that generates a constant into a register. It simply may not be
+as efficient as the more specific third rule (or second rule, if the
+constant is zero). Hence, rule ordering and prioritization is
+nevertheless important for the quality of the instruction selector.
+
+[^6]: Note that this suggests an interesting testing strategy: we
+      could choose arbitrary (random) orders of lowering rules to
+      apply, or even deliberately worst-case orders according to some
+      heuristic. If we can differentially test programs compiled with
+      such randomized lowerings against "normally" compiled programs
+      and show that the results are always the same, then we have
+      shown that our lowering rules are "internally consistent",
+      without any other external oracle. This will have a similar
+      effect to
+      [wasm-mutate](https://github.com/bytecodealliance/wasm-tools/tree/main/crates/wasm-mutate),
+      but takes mutations implicitly from the pool of rules rather
+      than a separate externally-defined pool of mutations. This idea
+      remains future work.
+
+## Core ISLE: a Term-Rewriting System
+
+This section describes the core ISLE language. ISLE's core is a
+term-rewriting system, with a design that very closely follows the
+generic concepts that we have described above.
+
+In the core language, ISLE's key departure from many other
+term-rewriting systems is that it is *strongly typed*.
+A classical term-rewriting system, especially one designed for
+instruction rewriting, will typically have just one type of term,
+corresponding to a "value" in the program. In contrast, ISLE is
+designed so that terms can represent various concepts in a compiler
+backend: values, but also machine instructions or parts of those
+instructions ("integer immediate value encodable in machine's
+particular format"), or abstract bundles of information with
+invariants or guarantees encoded in the type system ("load that I can
+sink", "instruction that produces flags").
+
+ISLE's other key departure from many other systems is its first-class
+integration with Rust, including a well-defined "FFI" mapping that
+allows ISLE rules to call into Rust in both their patterns and
+expressions, and to operate directly on types that are defined in the
+surrounding Rust code. This allows for easy and direct embedding into
+an existing compiler backend. We will cover this aspect more in the
+next section, [ISLE to Rust](#isle-to-rust).
+
+### Rules
+
+ISLE allows the user to specify rewrite rules, with a syntax similar
+in spirit to that shown above:
+
+```lisp
+  (rule
+    ;; left-hand side (pattern): if the input matches this ...
+    (A (B _ x) (C y))
+    ;; ... then rewrite to this:
+    (D x y))
+```
+
+The pattern (left-hand side) is made up of the following match
+operators:
+
+* Wildcards (`_`).
+* Integer constants (decimal/hex, positive/negative: `1`, `-1`,
+  `0x80`, `-0x80`).
+* Symbolic constants (`$Symbol`).
+* Variable captures (bare identifiers like `x`; an identifier consists
+  of alphanumeric characters and underscores, and does not start with
+  a digit).
+* Variable captures with sub-patterns: `x @ PAT`, which captures the
+  subterm in `x` as above but also matches `PAT` against the subterm.
+* "Equal-variable" constraints (`=x`): a subterm must match an
+  already-captured value.
+* Conjunctions of subpatterns: `(and PAT1 PAT2 ...)` matches all of
+  the subpatterns against the term. If any subpattern does not match,
+  then this matcher fails.
+* Term deconstruction: `(term PAT1 PAT2 ...)`, where `term` is a
+  defined term (type variant or constructor) and the subpatterns are
+  applied to each argument value in turn.
+
+The expression (right-hand side) is made up of the following
+expression operators:
+
+* Integer and symbolic constants, as above.
+* Boolean constants `#t` and `#f` (following Scheme syntax).
+* Variable uses (bare `x` identifier).
+* Term constructors (`(term EXPR1 EXPR2 ...)`, where each
+  subexpression is evaluated first and then the term is constructed).
+* `let`-expressions that bind new variables, possibly using the values
+  multiple times within the body: `(let ((var1 type1 EXPR1) (var2 ...)
+  ...) BODY ...)`. Each variable's initialization expression can refer
+  to the immediately previous variable bindings (i.e., this is like a
+  `let*` in Scheme). `let`s are lexically-scoped, meaning that bound
+  variables are available only within the body of the `let`.
+
+When multiple rules are applicable to rewrite a particular term, ISLE
+will choose the "more specific" rule according to a particular
+heuristic: in the lowered sequence of matching steps, when one
+left-hand side completes a match while another with the same prefix
+continues with further steps, the latter (more specific) is run first.
+
+The more-specific-first heuristic is usually good enough, but when an
+undesirable choice occurs, explicit priorities can be specified.
+Rules with explicit priorities are written as `(rule PRIO lhs rhs)`
+where `PRIO` is a signed (positive or negative) integer. An applicable
+rule with a higher priority will always match before a rule with a
+lower priority. The default priority for all rules if not otherwise
+specified is `0`.
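+
+As a hypothetical sketch (the terms `lower`, `iadd`, `iconst_zero`,
+`add_reg_reg`, and `identity` here are illustrative, not real
+bindings), an explicit priority can force a choice when the heuristic
+alone is not what we want:
+
+```lisp
+  ;; General case; default priority 0.
+  (rule (lower (iadd x y))
+        (add_reg_reg x y))
+
+  ;; More specific case, additionally given explicit priority 1, so it
+  ;; is always tried before the rule above whenever both apply.
+  (rule 1 (lower (iadd x (iconst_zero)))
+          (identity x))
+```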
+
+### Typed Terms
+
+ISLE allows the programmer to define types, and requires every term to
+have *parameter types* and a *return type* (analogous to first-order
+functions).
+
+The universe of types is very simple: there are *primitives*, which
+can be integers or symbolic constants (imported from the Rust
+embedding), and *enums*, which correspond directly to Rust enums with
+variants that have named fields. There is no subtyping. Some examples
+of type definitions are:
+
+```lisp
+
+  (type u32 (primitive u32))  ;; u32 is a primitive, and is
+                              ;; spelled `u32` in the generated Rust code.
+
+  (type MyType (enum
+                 (A (x u32) (y u32))
+                 (B (z u32))
+                 (C)))        ;; MyType is an enum, with variants
+                              ;; `MyType::A { x, y }`, `MyType::B { z }`,
+                              ;; and `MyType::C`.
+
+  (type MyType2 extern (enum (A)))
+                              ;; MyType2 is an enum with variant
+                              ;; `MyType2::A`. Its type definition is not
+                              ;; included in the generated Rust, but
+                              ;; rather, assumed to exist in surrounding
+                              ;; code. Useful for binding to existing
+                              ;; definitions.
+```
+
+We then declare constructors with their parameter and return types as
+follows:
+
+```lisp
+
+  (decl Term1 (u32 u32) MyType)  ;; Term1 has two `u32`-typed parameters,
+                                 ;; and itself has type `MyType`.
+  (decl Term2 () u32)            ;; Term2 has no parameters and type `u32`.
+```
+
+Note that when an enum type is defined, its variants are implicitly
+defined as constructors as well. These constructors are namespaced
+under the name of the type, to avoid ambiguity (or the need to do
+type-dependent lookups in the compiler, which can complicate type
+inference). For example, given the above `MyType` definitions, we
+automatically have the following constructors:
+
+```lisp
+
+  ;; These definitions are implicit and do not need to be written (doing
+  ;; so is a compile-time error, actually). We write them here just to
+  ;; show what they would look like.
+
+  (decl MyType.A (u32 u32) MyType)
+  (decl MyType.B (u32) MyType)
+  (decl MyType.C () MyType)
+
+  (decl MyType2.A () MyType2)
+```
+
+### Why Types?
+
+For terms that are not enum variants, the notion that a term "has a
+type" is somewhat foreign to a classical term-rewriting system. In
+most formal symbolic systems, the terms are manipulated as opaque
+sequences or trees of symbols; they have no inherent meaning other
+than what the user implicitly defines with the given rules. What does
+it mean for a term to "have a type" when it is just data? Or, said
+another way: why isn't the type of `Term2` just `Term2`?
+
+The types come into play when we define *rules*: one term of type `T`
+can only be rewritten into another term of type `T`, and when a
+parameter has a certain type, only subterms with that type can
+appear. Without explicit types on terms and their parameters, any term
+could be rewritten to any other, or substituted in as a parameter;
+there is thus a kind of dynamic typing, and the programmer must
+exercise caution to keep the data well-formed. In most applications of
+a term-rewriting system, there is already some de-facto "schema": some
+parameter of a term representing a machine instruction can only take
+on one of a few subterms (representing, say, different addressing
+modes). ISLE's types just make this explicit.
+
+Thus, the first answer to "why types" is that they enforce a schema on
+the terms, allowing the programmer to have stronger well-formed-data
+invariants.
+
+The second reason is that the types are an integral part of the
+compilation-to-Rust strategy: every constructor actually does evaluate
+to a Rust value of the given "return value" type, given actual Rust
+values for its parameters of the appropriate parameter types. We will
+see more on this below.
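+
+As a hypothetical sketch of such a schema (the types and terms here
+are illustrative, not real Cranelift bindings), a load instruction's
+parameter can be constrained to addressing-mode terms:
+
+```lisp
+  (type Reg (primitive Reg))
+  (type Amode (enum
+                (RegOffset (base Reg) (off u32))
+                (RegReg (base Reg) (index Reg))))
+
+  (decl isa_load (Amode) Reg)
+
+  ;; Well-typed: the argument is an Amode.
+  ;;   (isa_load (Amode.RegOffset b 8))
+  ;;
+  ;; Ill-typed, and rejected by the ISLE compiler: a Reg cannot appear
+  ;; where an Amode is expected.
+  ;;   (isa_load b)
+```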
+
+### Well-Typed Rules and Type Inference
+
+Now that we understand how to define types, let's examine in more
+detail how they are used to verify that the pattern and rewrite
+expression of a rule have the same type.
+
+ISLE uses a simple unidirectional type-inference algorithm that
+propagates type information through the pattern, resulting in a "type
+environment" that specifies the type for each captured variable, and
+then uses this to typecheck the rewrite expression. The result of this
+is that types are almost completely inferred, and are only annotated
+in a few places (`let` bindings specifically).
+
+The typing rules for patterns in ISLE are:
+
+* At the root of the pattern, we require that a *constructor* pattern
+  is used, rather than some other match operation (a wildcard, integer
+  constant, etc.). This is because compilation and dispatch into rules
+  is organized by the top-level constructor of the term being
+  rewritten.
+
+* At each part of the pattern except the root, there is an "expected
+  type" that is inferred from the surrounding context. We check that
+  this matches the actual type of the pattern.
+
+* A constructor pattern `(C x y z)`, given a constructor `(decl C (T1
+  T2 T3) R)`, has type `R` and provides expected types `T1`, `T2`, and
+  `T3` to its subpatterns.
+
+* A variable capture pattern `x` is compatible with any expected type,
+  and captures this expected type under the variable identifier `x` in
+  the type environment.
+
+* A variable-equality pattern `=x` checks that the expected type is
+  equal to the already-captured type for `x` in the type environment.
+
+* A conjunction `(and PAT1 PAT2 ...)` checks that each subpattern is
+  compatible with the expected type.
+
+* Integer constants are compatible with any primitive expected
+  type. (This may change in the future if we add non-numeric
+  primitives, such as strings.)
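+
+As a small worked example (using hypothetical declarations), consider
+how these rules propagate types through the pattern `(C (D z) y)`:
+
+```lisp
+  (decl C (MyType u32) R)
+  (decl D (u32) MyType)
+
+  ;; Pattern: (C (D z) y)
+  ;;
+  ;; - The root constructor C gives the whole pattern type R, and
+  ;;   provides expected types MyType and u32 for its two subpatterns.
+  ;; - (D z) is checked against expected type MyType; D's return type
+  ;;   matches, and D provides expected type u32 to z.
+  ;; - The captures z and y take on their expected types.
+  ;;
+  ;; Resulting type environment: { z: u32, y: u32 }
+```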
+
+If we are able to typecheck the pattern, we have a type environment
+that is a map from variable bindings to types: e.g., `{ x: MyType, y:
+MyType2, z: u32 }`. We then typecheck the rewrite expression.
+
+* Every expression also has an expected type, from the surrounding
+  context. We check that the provided expression matches this type.
+
+* The top-level rewrite expression must have the same type as the
+  top-level constructor in the pattern. (In other words, a term can
+  only be rewritten to another term of the same type.)
+
+* Constructors check their return values against the expected type,
+  and typecheck their argument expressions against their parameter
+  types.
+
+* A `let` expression provides types for additional variable bindings;
+  these are added to the type environment while typechecking the
+  body. The expected type for the body is the same as the expected
+  type for the `let` itself.
+
+### A Note on Heterogeneous Types
+
+We should illuminate one particular aspect of the ISLE type system
+that we described above. We have said that a term must be rewritten to
+another term of the same type. Note that this does *not* mean that,
+for example, a set of ISLE rules cannot be used to translate a term of
+type `T1` to a term of type `T2`. The trick is to define a top-level
+"driver" that wraps the `T1`, such that reducing this term results in
+a `T2`. Concretely:
+
+```lisp
+  (type T1 ...)
+  (type T2 ...)
+
+  (decl Translate (T1) T2)
+
+  (rule (Translate (T1.A ...))
+        (T2.X ...))
+  (rule (Translate (T1.B ...))
+        (T2.Y ...))
+```
+
+This gets to the heart of rewrite-system-based computation, and has
+relevance for applications of ISLE to compiler backends. A common
+technique in rewrite systems is to "kick off" a computation by
+wrapping a term in some intermediate term that then drives a series of
+reductions. Here we are using `Translate` as this top-level term.
+A difference between ISLE and some other rewrite-based instruction
+selectors is that rewrites are always driven by term reduction from
+such a toplevel term, rather than a series of equivalences directly
+from IR instruction to machine instruction forms.
+
+In other words, a conventional isel pattern engine might let one
+specify `(Inst.A ...) -> (Inst.X ...)`. Here the instruction/opcode
+type on the LHS and RHS must be the same single instruction type
+(otherwise rewrites could not be chained), and the rewrite relation
+(which we wrote as `->`) is in essence a single privileged
+relation. One can see ISLE as a generalization: we can define many
+different types, and many different toplevel terms from which we can
+start the reduction. In principle, one could have:
+
+```lisp
+
+  (type IR ...)
+  (type Machine1 ...)
+  (type Machine2 ...)
+
+  (decl TranslateToMachine1 (IR) Machine1)
+  (decl TranslateToMachine2 (IR) Machine2)
+
+  (rule (TranslateToMachine1 (IR.add a b)) (Machine1.add a b))
+  (rule (TranslateToMachine2 (IR.add a b)) (Machine2.weird_inst a b))
+```
+
+and then both translations are available. We are "rewriting" from `IR`
+to `Machine1` and from `IR` to `Machine2`, even if rewrites always
+preserve the same type; we get around the rule by using a constructor.
+
+### Constructors and Extractors
+
+So far, we have spoken of terms and constructors: a term is a schema
+for data, like `(A arg1 arg2)`, while we have used the term
+"constructor" to refer to the `A`, like a function. We now refine this
+notion somewhat and define what it means for a term to appear in the
+left-hand (pattern) or right-hand (expression) side of a rule.
+
+More precisely, a term, like `A`, can have three kinds of behavior
+associated with it: it can be an enum type variant, it can be a
+constructor, or it can be an *extractor*, which we will define in a
+moment.
A term can be both an extractor and a constructor
simultaneously, but the enum type variant case is mutually exclusive
with the others.

The distinction between a "constructor" and an "extractor" is whether
a term is being deconstructed (matched on) -- by an extractor -- or
constructed -- by a constructor.

#### Constructors

Constructor behavior on a term allows it to be invoked in the
right-hand side of a rule. A term can have either an "external
constructor" (see below) or an "internal constructor", defined in
ISLE. Any term `A` that has one or more `(rule (A ...) RHS)` rules in
the ISLE source implicitly has an internal constructor, and this
constructor can be invoked from the right-hand side of other rules.

#### Extractors

Extractor behavior on a term allows it to be used in a pattern in the
left-hand side of a rule. If one considers a constructor to be a
function that goes from argument values to the complete term, then an
extractor is a function that takes a complete term and possibly
matches on it (it is fallible). If it does match, it provides the
arguments *as results*.

One can see extractors as "programmable match operators". They are a
generalization of enum-variant deconstruction. Where a traditional
term-rewriting system operates on a term data-structure that exists in
memory, and can discover that a pattern `(A x y)` matches a term `A`
at a particular point in the input, an extractor-based system instead
sees `A` as an *arbitrary programmable operator* that is invoked
wherever a pattern-match is attempted, and can return "success" with
the resulting "fields" as if it were actually an enum variant. For
more on this topic, see the motivation and description in [RFC 15
under "programmable matching on virtual
nodes"](https://github.com/bytecodealliance/rfcs/blob/main/accepted/cranelift-isel-isle-peepmatic.md#extractors-programmable-matching-on-virtual-nodes).
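Put in Rust terms, an extractor is simply a fallible function that "runs a constructor backwards". A minimal sketch, with entirely hypothetical names and a contrived match condition (not code from the ISLE compiler):

```rust
// A term `(A x y)` of type `T`, reified as plain data for illustration.
#[derive(Clone, Debug, PartialEq)]
struct T(u32, u32);

// Constructor direction: from argument values to the complete term.
fn construct_a(x: u32, y: u32) -> T {
    T(x, y)
}

// Extractor direction: from the complete term back to the arguments.
// The `Option` makes it fallible; here, arbitrarily, we only match
// when the first field is even, to emphasize that an extractor can
// refuse to match.
fn extract_a(t: &T) -> Option<(u32, u32)> {
    if t.0 % 2 == 0 {
        Some((t.0, t.1))
    } else {
        None
    }
}
```

The key point is that `extract_a` need not inspect an in-memory term at all: it could compute its "fields" from any embedder state, which is what makes matching programmable.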
To provide a concrete example, if we have the term declarations

```lisp
  (decl A (u32 u32) T)
  (decl B (T) U)
```

and we then write a rule like

```lisp
  (rule (B (A x y))
        (U.Variant1 x y))
```

then we have used `A` as an *extractor*. When `B` is invoked as a
constructor with some `T`, this rule uses `A` as an extractor and
attempts (via whatever programmable matching behavior) to use `A` to
turn the `T` into two `u32`s, binding `x` and `y`. `A` can succeed or
fail, just as any other part of a pattern-match can.

Just as for constructors, there are *internal* and *external*
extractors. Most of the interesting programmable behavior occurs in
external extractors, which are defined in Rust; we will discuss this
further in a section below. Internal extractors, in contrast, behave
like macros, and can be defined for convenience: for example, we can
write

```lisp
  (decl A (u32 u32) T)
  (extractor (A pat1 pat2)
    (and
      (extractArg1 pat1)
      (extractArg2 pat2)))
```

which will, for example, expand a pattern `(A (subterm ...) _)` into
`(and (extractArg1 (subterm ...)) (extractArg2 _))`.

#### Summary: Terms, Constructors, and Extractors

We start with a `term`, which is just a schema for data:

```lisp
  (decl Term (A B C u32 u32) T)
```

A term can have:

1. A single internal extractor body, via a toplevel `(extractor ...)`
   form, OR

2. A single external extractor binding, via an `(extern extractor
   ...)` form, defined below; AND

3. One or more `(rule (Term ...) ...)` toplevel forms, which together
   make up an internal constructor definition, OR

4. A single external constructor binding, via an `(extern constructor
   ...)` form, defined below.

## ISLE to Rust

Now that we have described the core ISLE language, we will document
how it interacts with Rust code.
We consider these interactions to be
semantically as important as the core language: they are not
implementation details, but rather, a well-defined boundary by which
ISLE can interface with the outside world (an "FFI" of sorts).

### Mapping to Rust: Constructors, Functions, and Control Flow

ISLE was designed to have a simple, easy-to-understand mapping from
its language semantics to Rust semantics. This means that the
execution of ISLE rewriting has a well-defined implementation in
Rust. The basic principles are:

1. Every constructor with rules in ISLE becomes a Rust function. The
   constructor arguments are the Rust function arguments. The
   constructor's "return value" is the Rust function's return value
   (usually wrapped in an `Option` because pattern coverage can be
   incomplete).

2. One rewrite step is one Rust function call.

3. Rewriting is thus eager, and reified through ordinary Rust control
   flow. The code that embeds the ISLE generated code will kick off
   execution by calling a top-level "driver" constructor; this will
   invoke constructors in its rewrite expression, making other Rust
   function calls, until eventually a value is returned.

4. This design means that "intermediate terms" -- constructed terms
   that are then further rewritten -- are never actually built as
   in-memory data-structures. Rather, they exist only as ephemeral
   stack-frames while the corresponding Rust function executes. This
   means that there is very little or no performance penalty to
   factoring code into many sub-rules (subject only to function-call
   overhead and/or the effectiveness of the Rust inliner).

5. Backtracking -- attempting to match rules, and backing up to follow
   a different path when a match fails -- is entirely internal to the
   generated Rust function for a constructor. Once we are rewriting a
   constructor and its subterms, we have committed to that
   constructor.
   Internally, the rewrite-evaluation of one constructor evaluates
   left-hand side patterns, finds one that matches, and then commits
   to it and starts to invoke constructors to build the rewritten
   right-hand side.

   Said another way, the principle is that left-hand sides are
   fallible, and have no side-effects as they execute; right-hand
   sides, in contrast, are infallible. This simplifies the control
   flow and makes reasoning about side-effects (especially with
   respect to external Rust actions) easier.

This will become clearer as we look at how Rust interfaces are
defined, and how the generated code appears, below.

### Extern Constructors and Extractors

ISLE programs interact with the surrounding Rust code in which they
are embedded by allowing the programmer to define a term to have
*external constructors* and an *external extractor*.

The design philosophy of ISLE is that while internally it operates as
a fairly standard term-rewriting system, on the boundaries the "terms"
should be virtual, and defined procedurally rather than reified into
data structures, in order to allow for very flexible binding to the
embedding application. Thus, when term-rewriting bottoms out on the
input side, it just calls "extractors" to match on whatever ultimate
input the user provides, and these are fully programmable; and when it
bottoms out on the output side, the "term tree" is reified as a tree
of Rust function calls rather than plain data.

#### Constructors

As we defined above, a "constructor" is a term form that appears in an
expression and builds its return value from its argument
values. During the rewriting process, a constructor that can trigger
further rewriting rules results in a Rust function call to the body of
the "internal constructor" built from these rules; the term thus never
exists except as argument values on the stack.
However, ultimately the
ISLE code needs to return some result to the outside world, and this
result may be built up of many parts; this is where *external
constructors* come into play.

For any term declared like

```lisp
  (decl T (A B C) U)
```

the programmer can declare

```lisp
  (extern constructor T ctor_func)
```

which means that there is a Rust function `ctor_func` on the context
trait (see below) that can be *invoked* with arguments of type `A`,
`B`, `C` (actually borrows `&A`, `&B`, `&C`, for non-primitive types)
and returns a `U`.

External constructors are infallible: that is, they must succeed, and
always return their return type. In contrast, internal constructors
are fallible because they are implemented by a list of rules whose
patterns may not cover the entire domain. If fallible behavior is
needed when invoking external Rust code, that behavior should occur in
an extractor (see below) instead: only pattern left-hand sides are
meant to be fallible. Note, incidentally, that failure in a
constructor does not backtrack to try other rules; instead, the `None`
simply propagates all the way up.

#### Extractors

An *external extractor* is an implementation of matching behavior in
left-hand sides (patterns) that is fully programmable to interface
with the embedding application. When the generated pattern-matching
code is attempting to match a rule, and has a value to match against
an extractor pattern defined as an external extractor, it simply calls
a Rust function with the value of the term to be deconstructed, and
receives an `Option<(arg1, arg2, ...)>` in return. In other words, the
external extractor can choose to match or not, and if it does, it
provides the values that are in turn matched by sub-patterns.
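As a sketch of how generated matching code might consume such an extractor -- a hypothetical hand-written analogue, not the real generated code -- the `None` case simply falls through to the next rule in turn:

```rust
// A hypothetical input type that rules match against.
#[derive(Clone, Copy, Debug, PartialEq)]
enum U {
    Small(u32),
    Big(u64),
}

// External-extractor analogue: matches only the `Small` case, and
// returns the extracted "field" on success.
fn extract_small(u: &U) -> Option<u32> {
    match u {
        U::Small(x) => Some(*x),
        U::Big(_) => None,
    }
}

// Generated-style matching: try one rule's extractor; on `None`,
// fall through and try the next rule.
fn lower(u: &U) -> &'static str {
    if extract_small(u).is_some() {
        return "small path";
    }
    "fallback path"
}
```

This is the sense in which left-hand sides are fallible but side-effect-free: a failed extractor leaves no trace, and matching simply continues.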
For any term declared like

```lisp
  (decl T (A B C) U)
```

the programmer can declare

```lisp
  (extern extractor T etor_func)
```

which means that there is a Rust function `etor_func` on the context
trait (see below) that can be *invoked* with an argument of type `&U`,
and returns an `Option<(A, B, C)>`.

If an extractor returns `None`, then the generated matching code
proceeds just as if an enum variant match had failed: it moves on to
try the next rule in turn.

#### Advanced Extractors: Arg-Polarity

There is one shortcoming in the extractor mechanism as defined so far:
there is no way to provide additional context or "parameterize" an
external extractor; it only receives the term to deconstruct, with no
other parameters. For example, one might wish to write a
`(GetNthArg n arg_pattern)` extractor that matches on an instruction,
fetches the `n`th "arg" from it, and then returns that as the
extracted match result, allowing `arg_pattern` to match against it.

Inspired by Prolog, where argument data can flow in both directions
during "unification", we implement a limited form of bidirectionality
in ISLE for extractor arguments. Specifically, we can declare the
"polarity" of the arguments to an extractor so that some of the
arguments flow "in" rather than "out". This lets us provide data to
the extractor via expressions embedded in the pattern.

An example might make this clearer: we can define a term

```lisp
  (decl GetNthArg (u32 Arg) Inst)
```

that we wish to use as in the example given above, and then we can define
an external extractor

```lisp
  (extern extractor GetNthArg get_nth_arg (in out))
```

which indicates that `get_nth_arg()` will take parameters of `&Inst`
(the value being extracted) *and* `u32` (the in-arg), and return
`Option<(Arg,)>`, i.e., *just* the out-args.
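Under the signature convention just described, a hypothetical `get_nth_arg` might look like the following. All of the types here (`Inst`, `Arg`) are illustrative stand-ins, not real Cranelift types:

```rust
// Stand-in types for the example; not real Cranelift definitions.
#[derive(Clone, Debug, PartialEq)]
struct Arg(u32);

#[derive(Clone, Debug)]
struct Inst {
    args: Vec<Arg>,
}

// The in-arg `n` flows *into* the extractor alongside the value being
// matched; only the out-args come back, wrapped in `Option` because
// the match may fail (here, when `n` is out of range).
fn get_nth_arg(inst: &Inst, n: u32) -> Option<(Arg,)> {
    inst.args.get(n as usize).cloned().map(|a| (a,))
}
```

The one-element tuple return type mirrors the "just the out-args" convention: with one `out` argument declared, the extractor returns a 1-tuple on success.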
In order to use this extractor, to avoid parse ambiguity (i.e., to
avoid the need for the parser to know argument polarity while it is
still parsing), we require special syntax for the in-argument so that
it is parsed as an expression: it must be prefixed with `<`. So one
might use `GetNthArg` as follows:

```lisp
  (rule (Lower (and
                 (Inst ...)
                 (GetNthArg <2 second_arg)))
        ...)
```

This functionality is meant to improve the expressive power of the
DSL, but is not intended to be commonly used outside of "binding" or
"library" ISLE code. In other words, because it is somewhat
non-intuitive, it is best to wrap it within friendlier internal
extractors.

### Mapping Type Declarations to Rust

When we declare a type like

```lisp
  (type MyEnum (enum
                 (A (x u32) (y u32))
                 (B)))
```

ISLE will generate the Rust type definition

```rust
#[derive(Clone, Debug)]
pub enum MyEnum {
    A { x: u32, y: u32 },
    B,
}
```

Note that enum variants with no fields take on the brace-less form,
while those with fields use the named-struct-field `A { x: ... }`
form. If all variants are field-less, then the type will additionally
derive `Copy`, `PartialEq`, and `Eq`.

If the type is declared as extern (`(type MyEnum extern (enum ...))`)
then the same definition is assumed to exist. Primitives (`(type u32
(primitive u32))`) are assumed to be defined already, and are required
to be `Copy`.

All imported/extern types are pulled in via `use super::*` at the top
of the generated code; thus, these types should exist in (or be
re-exported from) the parent module.

### Symbolic Constants

ISLE allows the user to refer to external constants as follows:

```lisp
  (extern const $I32 Type)
```

This allows code to refer to `$I32` whenever a value of type `Type` is
needed, in either a pattern (LHS) or an expression (RHS). These
constants are pulled in via the same `use super::*` that imports all
external types.
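On the Rust side, such a constant is assumed to be an ordinary `pub const` visible in the parent module. The following sketch shows the shape this binding might take; the `Type` representation and the constant's value are assumptions for illustration, not Cranelift's actual definitions:

```rust
// Assumed stand-in for the external `Type`; field-less wrapper so it
// can be `Copy` and compared with `==`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Type(pub u16);

// The ISLE `$I32` is assumed to resolve to a Rust constant named
// `I32` in the parent module (value here is arbitrary).
pub const I32: Type = Type(0);

// A constant used in a pattern position corresponds to an equality
// check like this one.
fn is_i32(t: Type) -> bool {
    t == I32
}
```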
### Exported Interface: Functions and Context Trait

The generated ISLE code provides an interface that is designed to be
agnostic to the embedding application. This means that ISLE knows
nothing about, e.g., Cranelift or compiler concepts in
general. Rather, the generated code provides function entry points
with well-defined signatures based on the terms, and imports the
extern constructors and extractors via a trait that the embedder must
implement.

When a term `T` is defined like

```lisp
  (decl T (A B C) U)
```

and has an internal constructor (provided by `rule` bodies), then a
function with the following signature will be exported from the
generated code:

```rust
  pub fn constructor_T<Ctx: Context>(ctx: &mut Ctx, arg0: &A, arg1: &B, arg2: &C) -> Option<U>;
```

In other words, `constructor_` is prepended, and the function takes
the expected arguments, along with a "context" (more on this
below). It returns an `Option<U>` because internal constructors can be
partial: if no rule's pattern matches, then the constructor
fails. Note that if a sub-constructor fails, no backtracking occurs;
rather, the failure propagates all the way up to the entry point.

What is this "context" for? The context argument is used to allow
external extractors and constructors to access the necessary state of
the embedding application. (For example, in Cranelift, it might be the
`LowerCtx` that controls the code-lowering process.)

The context is a trait because we want to decouple the generated code
from the application as much as possible. The trait will have a method
for each defined external extractor and constructor.
For example, if
we have the following terms and declarations:

```lisp
  (decl A (u32 u32) T)
  (extern constructor A build_a)

  (decl B (T) U)
  (extern extractor B disassemble_b)
```

then the `Context` trait will include these methods:

```rust
  trait Context {
      fn build_a(&mut self, arg0: u32, arg1: u32) -> T;
      fn disassemble_b(&mut self, arg0: &U) -> Option<T>;
  }
```

These functions should be implemented as described above for external
constructors and extractors.

Note that some external extractors are known to always succeed, for
example if they are just fetching some information that is always
present; in this case, the generated code can be made slightly more
efficient if we tell the ISLE compiler that this is so. By declaring

```lisp
  (extern extractor infallible B disassemble_b)
```

we eliminate the `Option` on the return type, so the method is instead

```rust
  trait Context {
      // ...
      fn disassemble_b(&mut self, arg0: &U) -> T;
  }
```

## ISLE Internals

### Compiler Stages

Some detail and pointers to the compilation stages can be found in the
[README](../isle/README.md). The sequence starts as any ordinary
compiler: lexing, parsing, semantic analysis, and generation of an
IR. The most distinctive part is the "decision trie generation", which
is what converts the unordered-list-of-rules representation into
something that corresponds to the final Rust code's control flow and
order of matching operations.

### Decision Trie

The heart of the ISLE transformation lies in how the compiler converts
a list of rules into a scheme to attempt to match rules in some order,
possibly sharing match operations between similar rules to reduce
work.

The core data structure we produce is a "decision trie" per internal
constructor body. This is an intermediate representation of sorts that
is built from individual-rule IR (LHS + RHS) sequences, and is then
used to generate Rust source.
The decision trie is, as the name implies, a kind of decision tree, in
the sense that we start at the root and move down the tree based on
the result of match operations (each feeding one "decision").

It is a "trie" (which is a kind of tree) because at each level, its
edges are labeled with match operations; a trie is a tree where one
input character from an alphabet is used to index children at each
level.

Each node in the tree is either an internal decision node, or a leaf
"expression" node (which we reach once we have a successful rule
match). The "execution semantics" of the trie are
backtracking-based. We attempt to find some path down the tree through
edges whose match ops run successfully; when we do this to reach a
leaf, we have the values generated by all of the match ops, and we can
execute the sequence of "expression instructions" in the leaf. Each
rule's left-hand side becomes a series of edges (merged into the
existing tree as we process rules) and each rule's right-hand side
becomes one leaf node with expression instructions.

At any point, if a match op does not succeed, we try the next out-edge
in sequence. If we have tried all out-edges from a decision node and
none were successful, then we backtrack one level further. Thus, we
simply perform an in-order tree traversal and find the first
successful match.

Though this may sound very inefficient when a decision node has high
fan-out, in practice it is not, because the edges are often known to
be *mutually exclusive*. The canonical example of this is when an
enum-typed value is destructured into different variants by various
edges; we can use a Rust `match` statement in the generated source and
have `O(1)` (or close to it) cost for the dispatch at this level.

Building the trie is a somewhat subtle procedure because we must be
mindful of *priorities*.
In order to ensure that rule priorities are
obeyed, we actually label each edge with a *priority range* and a
match op. Edges must then be considered in priority order. When
building the trie, we uphold an important *non-overlapping-priority*
invariant: there must not be two edges with overlapping priority
ranges and different match ops, or else it would be unclear which to
run next. Priority ranges are inclusive in a discrete (integer) space,
so "overlap" is also subtle: we can allow edges with "unit" priority
ranges (just one priority), and these do not overlap. So, e.g., we can
allow an edge with priority `[0, 10]` (higher is more prioritized) and
op `A`, followed by an edge with priority `[0, 0]` and op `B`,
followed by an edge with priority `[-10, -1]` and op `A`, though they
must occur in that order.

See [this block
comment](https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/isle/isle/src/trie.rs#L15-L166)
for more information regarding the trie construction algorithm.

## Reference: ISLE Language Grammar

Baseline: allow arbitrary whitespace, and Lisp-style comments (`;` to
newline). The grammar accepted by the parser is as follows:

```bnf

<isle> ::= <def>*

<def> ::= "(" "type" <typedecl> ")"
        | "(" "decl" <decl> ")"
        | "(" "rule" <rule> ")"
        | "(" "extractor" <etor> ")"
        | "(" "extern" <extern> ")"

<typedecl> ::= <ident> [ "extern" ] <typevalue>

<ident> ::= ( "A".."Z" | "a".."z" | "_" | "$" )
            ( "A".."Z" | "a".."z" | "_" | "$" | "0".."9" | "." )*
<const-ident> ::= "$" ( "A".."Z" | "a".."z" | "_" | "$" | "0".."9" | "." )*

<int> ::= [ "-" ] ( "0".."9" )+
        | [ "-" ] "0x" ( "0".."9" | "A".."F" | "a".."f" )+

<typevalue> ::= "(" "primitive" <ident> ")"
              | "(" "enum" <enum-variant>* ")"

<enum-variant> ::= <ident>
                 | "(" <ident> <enum-field>* ")"

<enum-field> ::= "(" <ident> <ty> ")"

<ty> ::= <ident>

<decl> ::= <ident> "(" <ty>* ")" <ty>

<rule> ::= <pattern> <expr>
         | <prio> <pattern> <expr>

<prio> ::= <int>

<etor> ::= "(" <ident> <ident>* ")" <pattern>

<pattern> ::= <int>
            | <ident>
            | "_"
            | "=" <ident>
            | <ident> "@" <pattern>
            | "(" "and" <pattern>* ")"
            | "(" <ident> <pattern-arg>* ")"

<pattern-arg> ::= <pattern>
                | "<" <expr>  ;; in-argument to an extractor

<expr> ::= <int>
         | <const-ident>
         | "#t"  ;; Scheme-like "true": shorthand for 1
         | "#f"  ;; Scheme-like "false": shorthand for 0
         | <ident>
         | "(" "let" "(" <let-binding>* ")" <expr> ")"
         | "(" <ident> <expr>* ")"

<let-binding> ::= "(" <ident> <ty> <expr> ")"

<extern> ::= "constructor" <ident> <ident>
           | "extractor" [ "infallible" ] <ident> <ident>
             [ "(" <arg-polarity>* ")" ]
           | "const" <const-ident> <ty>

<arg-polarity> ::= "in" | "out"

```
diff --git a/cranelift/isle/README.md b/cranelift/isle/README.md
index 635fe863f8..3662e9663c 100644
--- a/cranelift/isle/README.md
+++ b/cranelift/isle/README.md
@@ -1,12 +1,13 @@
 # ISLE: Instruction Selection/Lowering Expressions DSL
 
+See also: [Language Reference](../docs/isle.md)
+
 ## Table of Contents
 
 * [Introduction](#introduction)
 * [Example Usage](#example-usage)
 * [Tutorial](#tutorial)
 * [Implementation](#implementation)
-* [Sketch of Instruction Selector](#sketch-of-instruction-selector)
 
 ## Introduction
 
@@ -30,9 +31,7 @@ languages, and so should be translatable to formal constraints or other
 logical specification languages.
 
 Some more details and motivation are in [BA RFC
-#15](https://github.com/bytecodealliance/rfcs/pull/15); additional
-documentation will eventually be added to carefully specify the language
-semantics.
+#15](https://github.com/bytecodealliance/rfcs/pull/15). Reference documentation can be found [here](../docs/isle.md).
 
 ## Example Usage
 
@@ -522,9 +521,3 @@ implements it. Relevant source files:
 
 * `isle/src/codegen.rs`
-
-## Sketch of Instruction Selector
-
-Please see [this Cranelift
-branch](https://github.com/cfallin/wasmtime/tree/isle) for an ongoing sketch of
-an instruction selector backend in Cranelift that uses ISLE.