|
|
|
@@ -21,11 +21,11 @@ instructions. For example:
|
|
|
|
|
|
|
|
|
|
|
|
- An `iadd` (integer add) operator can always be lowered to an x86
|
|
|
|
- An `iadd` (integer add) operator can always be lowered to an x86
|
|
|
|
`ADD` instruction with two register sources.
|
|
|
|
`ADD` instruction with two register sources.
|
|
|
|
|
|
|
|
|
|
|
|
- An `iadd` operator with one `iconst` (integer-constant) argument can
|
|
|
|
- An `iadd` operator with one `iconst` (integer-constant) argument can
|
|
|
|
be lowered to an x86 `ADD` instruction with a register and an
|
|
|
|
be lowered to an x86 `ADD` instruction with a register and an
|
|
|
|
immediate.
|
|
|
|
immediate.
|
|
|
|
|
|
|
|
|
|
|
|
One could write something like the following in ISLE (simplified from
|
|
|
|
One could write something like the following in ISLE (simplified from
|
|
|
|
the real code [here](../codegen/src/isa/x64/lower.isle)):
|
|
|
|
the real code [here](../codegen/src/isa/x64/lower.isle)):
|
|
|
|
|
|
|
|
|
|
|
|
@@ -43,7 +43,7 @@ the real code [here](../codegen/src/isa/x64/lower.isle)):
|
|
|
|
;; `y` is a `RegMemImm.Imm`.
|
|
|
|
;; `y` is a `RegMemImm.Imm`.
|
|
|
|
y)))
|
|
|
|
y)))
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
ISLE lets the compiler backend developer express this information in a
|
|
|
|
ISLE lets the compiler backend developer express this information in a
|
|
|
|
declarative way -- i.e., just write down a list of patterns, without
|
|
|
|
declarative way -- i.e., just write down a list of patterns, without
|
|
|
|
worrying how the compilation process tries them out -- and the ISLE
|
|
|
|
worrying how the compilation process tries them out -- and the ISLE
|
|
|
|
@@ -68,16 +68,16 @@ This document is organized into the following sections:
|
|
|
|
systems work, how to think about nested terms, patterns and rewrite
|
|
|
|
systems work, how to think about nested terms, patterns and rewrite
|
|
|
|
rules, how they provide a general mechanism for computation, and how
|
|
|
|
rules, how they provide a general mechanism for computation, and how
|
|
|
|
term-rewriting is often used in a compiler-implementation context.
|
|
|
|
term-rewriting is often used in a compiler-implementation context.
|
|
|
|
|
|
|
|
|
|
|
|
* Core ISLE: the foundational concepts of the ISLE DSL, building upon
|
|
|
|
* Core ISLE: the foundational concepts of the ISLE DSL, building upon
|
|
|
|
a general-purpose term-rewriting base. Covers the type system (typed
|
|
|
|
a general-purpose term-rewriting base. Covers the type system (typed
|
|
|
|
terms) and how rules are written.
|
|
|
|
terms) and how rules are written.
|
|
|
|
|
|
|
|
|
|
|
|
* ISLE with Rust: covers how ISLE provides an "FFI" (foreign function
|
|
|
|
* ISLE with Rust: covers how ISLE provides an "FFI" (foreign function
|
|
|
|
interface) of sorts to allow interaction with Rust code, and
|
|
|
|
interface) of sorts to allow interaction with Rust code, and
|
|
|
|
describes the scheme by which ISLE execution is mapped onto Rust
|
|
|
|
describes the scheme by which ISLE execution is mapped onto Rust
|
|
|
|
(data structures and control flow).[^1]
|
|
|
|
(data structures and control flow).[^1]
|
|
|
|
|
|
|
|
|
|
|
|
* ISLE Internals: describes how the ISLE compiler works. Provides
|
|
|
|
* ISLE Internals: describes how the ISLE compiler works. Provides
|
|
|
|
insight into how an unordered collection of rewrite rules are
|
|
|
|
insight into how an unordered collection of rewrite rules are
|
|
|
|
combined into executable Rust code that efficiently traverses the
|
|
|
|
combined into executable Rust code that efficiently traverses the
|
|
|
|
@@ -161,7 +161,7 @@ form, or at least can be interpreted that way.[^3]
|
|
|
|
structure that is present in those symbols in any well-formed
|
|
|
|
structure that is present in those symbols in any well-formed
|
|
|
|
sequence. For example, we can define a TRS that only operates on
|
|
|
|
sequence. For example, we can define a TRS that only operates on
|
|
|
|
terms with balanced parentheses; then we have our tree.
|
|
|
|
terms with balanced parentheses; then we have our tree.
|
|
|
|
|
|
|
|
|
|
|
|
In ISLE and hence in this document, we operate on terms that are
|
|
|
|
In ISLE and hence in this document, we operate on terms that are
|
|
|
|
written in an
|
|
|
|
written in an
|
|
|
|
[S-expression](https://en.wikipedia.org/wiki/S-expression) syntax,
|
|
|
|
[S-expression](https://en.wikipedia.org/wiki/S-expression) syntax,
|
|
|
|
@@ -239,9 +239,9 @@ right-hand nomenclature comes from a common way of writing rules as:
|
|
|
|
|
|
|
|
|
|
|
|
```plain
|
|
|
|
```plain
|
|
|
|
A -> B ;; any term "A" is rewritten to "B"
|
|
|
|
A -> B ;; any term "A" is rewritten to "B"
|
|
|
|
|
|
|
|
|
|
|
|
(A x) -> (B (C x)) ;; any term (A x), for some x, is rewritten to (B (C x)).
|
|
|
|
(A x) -> (B (C x)) ;; any term (A x), for some x, is rewritten to (B (C x)).
|
|
|
|
|
|
|
|
|
|
|
|
(A _) -> (D) ;; any term (A _), where `_` is a wildcard (any subterm),
|
|
|
|
(A _) -> (D) ;; any term (A _), where `_` is a wildcard (any subterm),
|
|
|
|
;; is rewritten to (D).
|
|
|
|
;; is rewritten to (D).
|
|
|
|
```
|
|
|
|
```
|
|
|
|
@@ -254,16 +254,16 @@ subterms:
|
|
|
|
|
|
|
|
|
|
|
|
* `(A pat1 pat2 ...)` matches a constructor `A` with patterms for each
|
|
|
|
* `(A pat1 pat2 ...)` matches a constructor `A` with patterms for each
|
|
|
|
of its arguments.
|
|
|
|
of its arguments.
|
|
|
|
|
|
|
|
|
|
|
|
* `x` matches any subterm and captures its value in a variable
|
|
|
|
* `x` matches any subterm and captures its value in a variable
|
|
|
|
binding, which can be used later when we specify the right-hand side
|
|
|
|
binding, which can be used later when we specify the right-hand side
|
|
|
|
(so that the rewrite contains parts of the original term).
|
|
|
|
(so that the rewrite contains parts of the original term).
|
|
|
|
|
|
|
|
|
|
|
|
* `_` is a wildcard and matches anything, without capturing it.
|
|
|
|
* `_` is a wildcard and matches anything, without capturing it.
|
|
|
|
|
|
|
|
|
|
|
|
* Primitive constant values, such as `42` or `$Symbol`, match only if
|
|
|
|
* Primitive constant values, such as `42` or `$Symbol`, match only if
|
|
|
|
the term is exactly equal to this constant.
|
|
|
|
the term is exactly equal to this constant.
|
|
|
|
|
|
|
|
|
|
|
|
These pattern-matching operators can be combined, so we could write,
|
|
|
|
These pattern-matching operators can be combined, so we could write,
|
|
|
|
for example, `(A (B x _) z)`. This pattern would match the term `(A (B
|
|
|
|
for example, `(A (B x _) z)`. This pattern would match the term `(A (B
|
|
|
|
1 2) 3)` but not `(A (C 4 5) 6)`.
|
|
|
|
1 2) 3)` but not `(A (C 4 5) 6)`.
|
|
|
|
@@ -388,7 +388,7 @@ clarity.
|
|
|
|
separate Rust functions, so factoring rules to use intermediate
|
|
|
|
separate Rust functions, so factoring rules to use intermediate
|
|
|
|
terms can provide code-size and compile-time benefits for the
|
|
|
|
terms can provide code-size and compile-time benefits for the
|
|
|
|
ISLE-generated Rust code as well.
|
|
|
|
ISLE-generated Rust code as well.
|
|
|
|
|
|
|
|
|
|
|
|
[^6]: The [lambda calculus' reduction
|
|
|
|
[^6]: The [lambda calculus' reduction
|
|
|
|
rules](https://en.wikipedia.org/wiki/Lambda_calculus#Reduction)
|
|
|
|
rules](https://en.wikipedia.org/wiki/Lambda_calculus#Reduction)
|
|
|
|
are a good example of this.
|
|
|
|
are a good example of this.
|
|
|
|
@@ -415,9 +415,9 @@ rewrites it into a machine-*dependent* instruction term. For example:
|
|
|
|
|
|
|
|
|
|
|
|
```plain
|
|
|
|
```plain
|
|
|
|
(iadd a b) -> (isa.add_reg_reg a b)
|
|
|
|
(iadd a b) -> (isa.add_reg_reg a b)
|
|
|
|
|
|
|
|
|
|
|
|
(iadd a (iconst 0)) -> a
|
|
|
|
(iadd a (iconst 0)) -> a
|
|
|
|
|
|
|
|
|
|
|
|
(iadd a (iconst n)) (isa.add_reg_imm a n)
|
|
|
|
(iadd a (iconst n)) (isa.add_reg_imm a n)
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
@@ -494,8 +494,11 @@ The pattern (left-hand side) is made up of the following match
|
|
|
|
operators:
|
|
|
|
operators:
|
|
|
|
|
|
|
|
|
|
|
|
* Wildcards (`_`).
|
|
|
|
* Wildcards (`_`).
|
|
|
|
* Integer constants (decimal/hex, positive/negative: `1`, `-1`,
|
|
|
|
* Integer constants (decimal/hex/binary/octal, positive/negative: `1`, `-1`,
|
|
|
|
`0x80`, `-0x80`) and boolean constants (`#t`, `#f`).
|
|
|
|
`0x80`, `-0x80`) and boolean constants (`#t`, `#f`). Hex constants can
|
|
|
|
|
|
|
|
start with either `0x` or `0X`. Binary constants start with `0b`. Octal
|
|
|
|
|
|
|
|
constants start with `0o`. Integers can also be interspersed with `_` as a
|
|
|
|
|
|
|
|
separator, for example `1_000` or `0x1234_5678`, for readability.
|
|
|
|
* constants imported from the embedding, of arbitrary type
|
|
|
|
* constants imported from the embedding, of arbitrary type
|
|
|
|
(`$MyConst`).
|
|
|
|
(`$MyConst`).
|
|
|
|
* Variable captures and matches (bare identifiers like `x`; an
|
|
|
|
* Variable captures and matches (bare identifiers like `x`; an
|
|
|
|
@@ -515,7 +518,7 @@ operators:
|
|
|
|
defined term (type variant or constructor) and the subpatterns are
|
|
|
|
defined term (type variant or constructor) and the subpatterns are
|
|
|
|
applied to each argument value in turn. Note that `term` cannot be a
|
|
|
|
applied to each argument value in turn. Note that `term` cannot be a
|
|
|
|
wildcard; it must be a specific, concrete term.
|
|
|
|
wildcard; it must be a specific, concrete term.
|
|
|
|
|
|
|
|
|
|
|
|
The expression (right-hand side) is made up of the following
|
|
|
|
The expression (right-hand side) is made up of the following
|
|
|
|
expression operators:
|
|
|
|
expression operators:
|
|
|
|
|
|
|
|
|
|
|
|
@@ -530,7 +533,7 @@ expression operators:
|
|
|
|
to the immediately previous variable bindings (i.e., this is like a
|
|
|
|
to the immediately previous variable bindings (i.e., this is like a
|
|
|
|
`let*` in Scheme). `let`s are lexically-scoped, meaning that bound
|
|
|
|
`let*` in Scheme). `let`s are lexically-scoped, meaning that bound
|
|
|
|
variables are available only within the body of the `let`.
|
|
|
|
variables are available only within the body of the `let`.
|
|
|
|
|
|
|
|
|
|
|
|
When multiple rules are applicable to rewrite a particular term, ISLE
|
|
|
|
When multiple rules are applicable to rewrite a particular term, ISLE
|
|
|
|
will choose the "more specific" rule according to a particular
|
|
|
|
will choose the "more specific" rule according to a particular
|
|
|
|
heuristic: in the lowered sequence of matching steps, when one
|
|
|
|
heuristic: in the lowered sequence of matching steps, when one
|
|
|
|
@@ -573,14 +576,14 @@ of type definitions are:
|
|
|
|
|
|
|
|
|
|
|
|
(type u32 (primitive u32)) ;; u32 is a primitive, and is
|
|
|
|
(type u32 (primitive u32)) ;; u32 is a primitive, and is
|
|
|
|
;; spelled `u32` in the generated Rust code.
|
|
|
|
;; spelled `u32` in the generated Rust code.
|
|
|
|
|
|
|
|
|
|
|
|
(type MyType (enum
|
|
|
|
(type MyType (enum
|
|
|
|
(A (x u32) (y u32))
|
|
|
|
(A (x u32) (y u32))
|
|
|
|
(B (z u32))
|
|
|
|
(B (z u32))
|
|
|
|
(C))) ;; MyType is an enum, with variants
|
|
|
|
(C))) ;; MyType is an enum, with variants
|
|
|
|
;; `MyType::A { x, y }`, `MyType::B { z }`,
|
|
|
|
;; `MyType::A { x, y }`, `MyType::B { z }`,
|
|
|
|
;; and `MyType::C`.
|
|
|
|
;; and `MyType::C`.
|
|
|
|
|
|
|
|
|
|
|
|
(type MyType2 extern (enum (A)))
|
|
|
|
(type MyType2 extern (enum (A)))
|
|
|
|
;; MyType2 is an enum with variant `MyType2::A`.
|
|
|
|
;; MyType2 is an enum with variant `MyType2::A`.
|
|
|
|
;; Its type definition is not included in the
|
|
|
|
;; Its type definition is not included in the
|
|
|
|
@@ -611,11 +614,11 @@ automatically have the following constructors:
|
|
|
|
;; These definitions are implicit and do not need to be written (doing
|
|
|
|
;; These definitions are implicit and do not need to be written (doing
|
|
|
|
;; so is a compile-time error, actually). We write them here just to
|
|
|
|
;; so is a compile-time error, actually). We write them here just to
|
|
|
|
;; show what they would look like.
|
|
|
|
;; show what they would look like.
|
|
|
|
|
|
|
|
|
|
|
|
(decl MyType.A (u32 u32) MyType)
|
|
|
|
(decl MyType.A (u32 u32) MyType)
|
|
|
|
(decl MyType.B (u32) MyType)
|
|
|
|
(decl MyType.B (u32) MyType)
|
|
|
|
(decl MyType.C () MyType)
|
|
|
|
(decl MyType.C () MyType)
|
|
|
|
|
|
|
|
|
|
|
|
(decl MyType2.A () MyType2)
|
|
|
|
(decl MyType2.A () MyType2)
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
@@ -671,39 +674,39 @@ The typing rules for patterns in ISLE are:
|
|
|
|
constant, etc.). This is because compilation and dispatch into rules
|
|
|
|
constant, etc.). This is because compilation and dispatch into rules
|
|
|
|
is organized by the top-level constructor of the term being
|
|
|
|
is organized by the top-level constructor of the term being
|
|
|
|
rewritten.
|
|
|
|
rewritten.
|
|
|
|
|
|
|
|
|
|
|
|
* At each part of the pattern except the root, there is an "expected
|
|
|
|
* At each part of the pattern except the root, there is an "expected
|
|
|
|
type" that is inferred from the surrounding context. We check that
|
|
|
|
type" that is inferred from the surrounding context. We check that
|
|
|
|
this matches the actual type of the pattern.
|
|
|
|
this matches the actual type of the pattern.
|
|
|
|
|
|
|
|
|
|
|
|
* A constructor pattern `(C x y z)`, given a constructor `(decl C (T1
|
|
|
|
* A constructor pattern `(C x y z)`, given a constructor `(decl C (T1
|
|
|
|
T2 T2) R)`, has type `R` and provides expected types `T1`, `T2`, and
|
|
|
|
T2 T2) R)`, has type `R` and provides expected types `T1`, `T2`, and
|
|
|
|
`T3` to its subpatterns.
|
|
|
|
`T3` to its subpatterns.
|
|
|
|
|
|
|
|
|
|
|
|
* A variable capture pattern `x` is compatible with any expected type
|
|
|
|
* A variable capture pattern `x` is compatible with any expected type
|
|
|
|
the first time it appears, and captures this expected type under the
|
|
|
|
the first time it appears, and captures this expected type under the
|
|
|
|
variable identifier `x` in the type environment. Subsequent
|
|
|
|
variable identifier `x` in the type environment. Subsequent
|
|
|
|
appearances of `x` check that the expected type matches the
|
|
|
|
appearances of `x` check that the expected type matches the
|
|
|
|
already-captured type.
|
|
|
|
already-captured type.
|
|
|
|
|
|
|
|
|
|
|
|
* A conjunction `(and PAT1 PAT2 ...)` checks that each subpattern is
|
|
|
|
* A conjunction `(and PAT1 PAT2 ...)` checks that each subpattern is
|
|
|
|
compatible with the expected type.
|
|
|
|
compatible with the expected type.
|
|
|
|
|
|
|
|
|
|
|
|
* Integer constants are compatible with any primitive expected
|
|
|
|
* Integer constants are compatible with any primitive expected
|
|
|
|
type. (This may change in the future if we add non-numeric
|
|
|
|
type. (This may change in the future if we add non-numeric
|
|
|
|
primitives, such as strings.)
|
|
|
|
primitives, such as strings.)
|
|
|
|
|
|
|
|
|
|
|
|
If we are able to typecheck the pattern, we have a type environment
|
|
|
|
If we are able to typecheck the pattern, we have a type environment
|
|
|
|
that is a map from variable bindings to types: e.g., `{ x: MyType, y:
|
|
|
|
that is a map from variable bindings to types: e.g., `{ x: MyType, y:
|
|
|
|
MyType2, z: u32 }`. We then typecheck the rewrite expression.
|
|
|
|
MyType2, z: u32 }`. We then typecheck the rewrite expression.
|
|
|
|
|
|
|
|
|
|
|
|
* Every expression also has an expected type, from the surrounding
|
|
|
|
* Every expression also has an expected type, from the surrounding
|
|
|
|
context. We check that the provided expression matches this type.
|
|
|
|
context. We check that the provided expression matches this type.
|
|
|
|
|
|
|
|
|
|
|
|
* The top-level rewrite expression must have the same type as the
|
|
|
|
* The top-level rewrite expression must have the same type as the
|
|
|
|
top-level constructor in the pattern. (In other words, a term can
|
|
|
|
top-level constructor in the pattern. (In other words, a term can
|
|
|
|
only be rewritten to another term of the same type.)
|
|
|
|
only be rewritten to another term of the same type.)
|
|
|
|
|
|
|
|
|
|
|
|
* Constructors check their return values against the expected type,
|
|
|
|
* Constructors check their return values against the expected type,
|
|
|
|
and typecheck their argument expressions against their parameter
|
|
|
|
and typecheck their argument expressions against their parameter
|
|
|
|
types.
|
|
|
|
types.
|
|
|
|
@@ -712,7 +715,7 @@ MyType2, z: u32 }`. We then typecheck the rewrite expression.
|
|
|
|
these are added to the type environment while typechecking the
|
|
|
|
these are added to the type environment while typechecking the
|
|
|
|
body. The expected type for the body is the same as the expected
|
|
|
|
body. The expected type for the body is the same as the expected
|
|
|
|
type for the `let` itself.
|
|
|
|
type for the `let` itself.
|
|
|
|
|
|
|
|
|
|
|
|
### A Note on Heterogeneous Types
|
|
|
|
### A Note on Heterogeneous Types
|
|
|
|
|
|
|
|
|
|
|
|
We should illuminate one particular aspect of the ISLE type system
|
|
|
|
We should illuminate one particular aspect of the ISLE type system
|
|
|
|
@@ -726,9 +729,9 @@ a `T2`. Concretely:
|
|
|
|
```lisp
|
|
|
|
```lisp
|
|
|
|
(type T1 ...)
|
|
|
|
(type T1 ...)
|
|
|
|
(type T2 ...)
|
|
|
|
(type T2 ...)
|
|
|
|
|
|
|
|
|
|
|
|
(decl Translate (T1) T2)
|
|
|
|
(decl Translate (T1) T2)
|
|
|
|
|
|
|
|
|
|
|
|
(rule (Translate (T1.A ...))
|
|
|
|
(rule (Translate (T1.A ...))
|
|
|
|
(T2.X ...))
|
|
|
|
(T2.X ...))
|
|
|
|
(rule (Translate (T1.B ...))
|
|
|
|
(rule (Translate (T1.B ...))
|
|
|
|
@@ -760,10 +763,10 @@ one could have:
|
|
|
|
(type IR ...)
|
|
|
|
(type IR ...)
|
|
|
|
(type Machine1 ...)
|
|
|
|
(type Machine1 ...)
|
|
|
|
(type Machine2 ...)
|
|
|
|
(type Machine2 ...)
|
|
|
|
|
|
|
|
|
|
|
|
(decl TranslateToMachine1 (IR) Machine1)
|
|
|
|
(decl TranslateToMachine1 (IR) Machine1)
|
|
|
|
(decl TranslateToMachine2 (IR) Machine2)
|
|
|
|
(decl TranslateToMachine2 (IR) Machine2)
|
|
|
|
|
|
|
|
|
|
|
|
(rule (TranslateToMachine1 (IR.add a b)) (Machine1.add a b))
|
|
|
|
(rule (TranslateToMachine1 (IR.add a b)) (Machine1.add a b))
|
|
|
|
(rule (TranslateToMachine2 (IR.add a b)) (Machine2.weird_inst a b))
|
|
|
|
(rule (TranslateToMachine2 (IR.add a b)) (Machine2.weird_inst a b))
|
|
|
|
```
|
|
|
|
```
|
|
|
|
@@ -872,7 +875,7 @@ For example, if one is writing a rule such as
|
|
|
|
```lisp
|
|
|
|
```lisp
|
|
|
|
(decl u_to_v (U) V)
|
|
|
|
(decl u_to_v (U) V)
|
|
|
|
(rule ...)
|
|
|
|
(rule ...)
|
|
|
|
|
|
|
|
|
|
|
|
(decl MyTerm (T) V)
|
|
|
|
(decl MyTerm (T) V)
|
|
|
|
(rule (MyTerm t)
|
|
|
|
(rule (MyTerm t)
|
|
|
|
(u_to_v t))
|
|
|
|
(u_to_v t))
|
|
|
|
@@ -885,7 +888,7 @@ its argument but `t` has type `T`. However, if we define
|
|
|
|
|
|
|
|
|
|
|
|
```lisp
|
|
|
|
```lisp
|
|
|
|
(convert T U t_to_u)
|
|
|
|
(convert T U t_to_u)
|
|
|
|
|
|
|
|
|
|
|
|
;; For the above to be valid, `t_to_u` should be declared with the
|
|
|
|
;; For the above to be valid, `t_to_u` should be declared with the
|
|
|
|
;; signature:
|
|
|
|
;; signature:
|
|
|
|
(decl t_to_u (T) U)
|
|
|
|
(decl t_to_u (T) U)
|
|
|
|
@@ -904,12 +907,12 @@ This also works in the extractor position: for example, if one writes
|
|
|
|
```lisp
|
|
|
|
```lisp
|
|
|
|
(decl defining_instruction (Inst) Value)
|
|
|
|
(decl defining_instruction (Inst) Value)
|
|
|
|
(extern extractor definining_instruction ...)
|
|
|
|
(extern extractor definining_instruction ...)
|
|
|
|
|
|
|
|
|
|
|
|
(decl iadd (Value Value) Inst)
|
|
|
|
(decl iadd (Value Value) Inst)
|
|
|
|
|
|
|
|
|
|
|
|
(rule (lower (iadd (iadd a b) c))
|
|
|
|
(rule (lower (iadd (iadd a b) c))
|
|
|
|
...)
|
|
|
|
...)
|
|
|
|
|
|
|
|
|
|
|
|
(convert Inst Value defining_instruction)
|
|
|
|
(convert Inst Value defining_instruction)
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
@@ -933,12 +936,12 @@ A term can have:
|
|
|
|
|
|
|
|
|
|
|
|
1. A single internal extractor body, via a toplevel `(extractor ...)`
|
|
|
|
1. A single internal extractor body, via a toplevel `(extractor ...)`
|
|
|
|
form, OR
|
|
|
|
form, OR
|
|
|
|
|
|
|
|
|
|
|
|
2. A single external extractor binding (see next section); AND
|
|
|
|
2. A single external extractor binding (see next section); AND
|
|
|
|
|
|
|
|
|
|
|
|
3. One or more `(rule (Term ...) ...)` toplevel forms, which together
|
|
|
|
3. One or more `(rule (Term ...) ...)` toplevel forms, which together
|
|
|
|
make up an internal constructor definition, OR
|
|
|
|
make up an internal constructor definition, OR
|
|
|
|
|
|
|
|
|
|
|
|
4. A single external constructor binding (see next section).
|
|
|
|
4. A single external constructor binding (see next section).
|
|
|
|
|
|
|
|
|
|
|
|
### If-Let Clauses
|
|
|
|
### If-Let Clauses
|
|
|
|
@@ -1030,8 +1033,8 @@ for purity.
|
|
|
|
|
|
|
|
|
|
|
|
#### `partial` Expressions
|
|
|
|
#### `partial` Expressions
|
|
|
|
|
|
|
|
|
|
|
|
ISLE's `partial` keyword on a term indicates that the term's
|
|
|
|
ISLE's `partial` keyword on a term indicates that the term's
|
|
|
|
constructors may fail to match, otherwise, the ISLE compiler assumes
|
|
|
|
constructors may fail to match, otherwise, the ISLE compiler assumes
|
|
|
|
the term's constructors are infallible.
|
|
|
|
the term's constructors are infallible.
|
|
|
|
|
|
|
|
|
|
|
|
For example, the following term's constructor only matches if the value
|
|
|
|
For example, the following term's constructor only matches if the value
|
|
|
|
@@ -1043,7 +1046,7 @@ is zero:
|
|
|
|
(extern constructor is_zero_value is_zero_value)
|
|
|
|
(extern constructor is_zero_value is_zero_value)
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
Internal constructors without the `partial` keyword can
|
|
|
|
Internal constructors without the `partial` keyword can
|
|
|
|
only use other constructors that also do not have the `partial` keyword.
|
|
|
|
only use other constructors that also do not have the `partial` keyword.
|
|
|
|
|
|
|
|
|
|
|
|
#### `if` Shorthand
|
|
|
|
#### `if` Shorthand
|
|
|
|
@@ -1104,7 +1107,7 @@ Rust. The basic principles are:
|
|
|
|
side expression; this can invoke further constructors for its
|
|
|
|
side expression; this can invoke further constructors for its
|
|
|
|
subparts, kicking off more rewrites, until eventually a value is
|
|
|
|
subparts, kicking off more rewrites, until eventually a value is
|
|
|
|
returned.
|
|
|
|
returned.
|
|
|
|
|
|
|
|
|
|
|
|
4. This design means that "intermediate terms" -- constructed terms
|
|
|
|
4. This design means that "intermediate terms" -- constructed terms
|
|
|
|
that are then further rewritten -- are never actually built as
|
|
|
|
that are then further rewritten -- are never actually built as
|
|
|
|
in-memory data-structures. Rather, they exist only as ephemeral
|
|
|
|
in-memory data-structures. Rather, they exist only as ephemeral
|
|
|
|
@@ -1112,7 +1115,7 @@ Rust. The basic principles are:
|
|
|
|
means that there is very little or no performance penalty to
|
|
|
|
means that there is very little or no performance penalty to
|
|
|
|
factoring code into many sub-rules (subject only to function-call
|
|
|
|
factoring code into many sub-rules (subject only to function-call
|
|
|
|
overhead and/or the effectiveness of the Rust inliner).
|
|
|
|
overhead and/or the effectiveness of the Rust inliner).
|
|
|
|
|
|
|
|
|
|
|
|
5. Backtracking -- attempting to match rules, and backing up to follow
|
|
|
|
5. Backtracking -- attempting to match rules, and backing up to follow
|
|
|
|
a different path when a match fails -- exists, but is entirely
|
|
|
|
a different path when a match fails -- exists, but is entirely
|
|
|
|
internal to the generated Rust function for rewriting one
|
|
|
|
internal to the generated Rust function for rewriting one
|
|
|
|
@@ -1123,16 +1126,16 @@ Rust. The basic principles are:
|
|
|
|
sides, trying to find a matching rule, and once we find one, we
|
|
|
|
sides, trying to find a matching rule, and once we find one, we
|
|
|
|
commit and start to invoke constructors to build the right-hand
|
|
|
|
commit and start to invoke constructors to build the right-hand
|
|
|
|
side.
|
|
|
|
side.
|
|
|
|
|
|
|
|
|
|
|
|
Said another way, the principle is that left-hand sides can be
|
|
|
|
Said another way, the principle is that left-hand sides can be
|
|
|
|
fallible, and have no side-effects as they execute; right-hand
|
|
|
|
fallible, and have no side-effects as they execute; right-hand
|
|
|
|
sides, in contrast, are infallible. This simplifies the control
|
|
|
|
sides, in contrast, are infallible. This simplifies the control
|
|
|
|
flow and makes reasoning about side-effects (especially with
|
|
|
|
flow and makes reasoning about side-effects (especially with
|
|
|
|
respect to external Rust actions) easier.
|
|
|
|
respect to external Rust actions) easier.
|
|
|
|
|
|
|
|
|
|
|
|
This will become more clear as we look at how Rust interfaces are
|
|
|
|
This will become more clear as we look at how Rust interfaces are
|
|
|
|
defined, and how the generated code appears, below.
|
|
|
|
defined, and how the generated code appears, below.
|
|
|
|
|
|
|
|
|
|
|
|
### Extern Constructors and Extractors
|
|
|
|
### Extern Constructors and Extractors
|
|
|
|
|
|
|
|
|
|
|
|
ISLE programs interact with the surrounding Rust code in which they
|
|
|
|
ISLE programs interact with the surrounding Rust code in which they
|
|
|
|
@@ -1181,11 +1184,11 @@ and returns a `U`.
|
|
|
|
External constructors are infallible: that is, they must succeed, and
|
|
|
|
External constructors are infallible: that is, they must succeed, and
|
|
|
|
always return their return type. In contrast, internal constructors
|
|
|
|
always return their return type. In contrast, internal constructors
|
|
|
|
can be fallible because they are implemented by a list of rules whose
|
|
|
|
can be fallible because they are implemented by a list of rules whose
|
|
|
|
patterns may not cover the entire domain (in which case, the term
|
|
|
|
patterns may not cover the entire domain (in which case, the term
|
|
|
|
should be marked `partial`). If fallible behavior is needed when
|
|
|
|
should be marked `partial`). If fallible behavior is needed when
|
|
|
|
invoking external Rust code, that behavior should occur in an extractor
|
|
|
|
invoking external Rust code, that behavior should occur in an extractor
|
|
|
|
(see below) instead: only pattern left-hand sides are meant to be
|
|
|
|
(see below) instead: only pattern left-hand sides are meant to be
|
|
|
|
fallible.
|
|
|
|
fallible.
|
|
|
|
|
|
|
|
|
|
|
|
#### Extractors
|
|
|
|
#### Extractors
|
|
|
|
|
|
|
|
|
|
|
|
@@ -1265,7 +1268,7 @@ This allows code to refer to `$I32` whenever a value of type `Type` is
|
|
|
|
needed, in either a pattern (LHS) or an expression (RHS). These
|
|
|
|
needed, in either a pattern (LHS) or an expression (RHS). These
|
|
|
|
constants are pulled in via the same `use super::*` that imports all
|
|
|
|
constants are pulled in via the same `use super::*` that imports all
|
|
|
|
external types.
|
|
|
|
external types.
|
|
|
|
|
|
|
|
|
|
|
|
### Exported Interface: Functions and Context Trait
|
|
|
|
### Exported Interface: Functions and Context Trait
|
|
|
|
|
|
|
|
|
|
|
|
The generated ISLE code provides an interface that is designed to be
|
|
|
|
The generated ISLE code provides an interface that is designed to be
|
|
|
|
@@ -1310,7 +1313,7 @@ we have the following terms and declarations:
|
|
|
|
```lisp
|
|
|
|
```lisp
|
|
|
|
(decl A (u32 u32) T)
|
|
|
|
(decl A (u32 u32) T)
|
|
|
|
(extern constructor A build_a)
|
|
|
|
(extern constructor A build_a)
|
|
|
|
|
|
|
|
|
|
|
|
(decl B (T) U)
|
|
|
|
(decl B (T) U)
|
|
|
|
(external extractor B disassemble_b)
|
|
|
|
(external extractor B disassemble_b)
|
|
|
|
```
|
|
|
|
```
|
|
|
|
@@ -1440,7 +1443,7 @@ newline). The grammar accepted by the parser is as follows:
|
|
|
|
| "(" "rule" <rule> ")"
|
|
|
|
| "(" "rule" <rule> ")"
|
|
|
|
| "(" "extractor" <etor> ")"
|
|
|
|
| "(" "extractor" <etor> ")"
|
|
|
|
| "(" "extern" <extern> ")"
|
|
|
|
| "(" "extern" <extern> ")"
|
|
|
|
|
|
|
|
|
|
|
|
<typedecl> ::= <ident> [ "extern" ] <typevalue>
|
|
|
|
<typedecl> ::= <ident> [ "extern" ] <typevalue>
|
|
|
|
|
|
|
|
|
|
|
|
<ident> ::= ( "A".."Z" | "a".."z" | "_" | "$" )
|
|
|
|
<ident> ::= ( "A".."Z" | "a".."z" | "_" | "$" )
|
|
|
|
@@ -1449,6 +1452,8 @@ newline). The grammar accepted by the parser is as follows:
|
|
|
|
|
|
|
|
|
|
|
|
<int> ::= [ "-" ] ( "0".."9" )+
|
|
|
|
<int> ::= [ "-" ] ( "0".."9" )+
|
|
|
|
| [ "-" ] "0x" ( "0".."9" "A".."F" "a".."f" )+
|
|
|
|
| [ "-" ] "0x" ( "0".."9" "A".."F" "a".."f" )+
|
|
|
|
|
|
|
|
| [ "-" ] "0o" ( "0".."7" )+
|
|
|
|
|
|
|
|
| [ "-" ] "0b" ( "0".."1" )+
|
|
|
|
|
|
|
|
|
|
|
|
<typevalue> ::= "(" "primitive" <ident> ")"
|
|
|
|
<typevalue> ::= "(" "primitive" <ident> ")"
|
|
|
|
| "(" "enum" <enumvariant>* ")"
|
|
|
|
| "(" "enum" <enumvariant>* ")"
|
|
|
|
@@ -1464,7 +1469,7 @@ newline). The grammar accepted by the parser is as follows:
|
|
|
|
|
|
|
|
|
|
|
|
<rule> ::= <pattern> <expr>
|
|
|
|
<rule> ::= <pattern> <expr>
|
|
|
|
| <prio> <pattern> <expr>
|
|
|
|
| <prio> <pattern> <expr>
|
|
|
|
|
|
|
|
|
|
|
|
<prio> ::= <int>
|
|
|
|
<prio> ::= <int>
|
|
|
|
|
|
|
|
|
|
|
|
<etor> ::= "(" <ident> <ident>* ")" <pattern>
|
|
|
|
<etor> ::= "(" <ident> <ident>* ")" <pattern>
|
|
|
|
@@ -1487,7 +1492,7 @@ newline). The grammar accepted by the parser is as follows:
|
|
|
|
| <ident>
|
|
|
|
| <ident>
|
|
|
|
| "(" "let" "(" <let-binding>* ")" <expr> ")"
|
|
|
|
| "(" "let" "(" <let-binding>* ")" <expr> ")"
|
|
|
|
| "(" <ident> <expr>* ")"
|
|
|
|
| "(" <ident> <expr>* ")"
|
|
|
|
|
|
|
|
|
|
|
|
<let-binding> ::= "(" <ident> <ty> <expr> ")"
|
|
|
|
<let-binding> ::= "(" <ident> <ty> <expr> ")"
|
|
|
|
|
|
|
|
|
|
|
|
<extern> ::= "constructor" <ident> <ident>
|
|
|
|
<extern> ::= "constructor" <ident> <ident>
|
|
|
|
|