While type-checking the AST for a pattern, ISLE was passing in an
`Option<TypeId>` for the expected result type of the pattern. However,
at every call we either passed `Some` type explicitly, or passed the
parent's expected type in a self-recursive call.
Therefore, by induction, `expected_ty` is never `None`. So this PR
unwraps the type everywhere. That in turn shows that a bunch of error
messages were unreachable, so this deletes a bunch of error-handling
code.
In addition, this function returned the type it computed for the
sub-pattern, but that information is already available in the
sub-pattern itself. Not only that but the type should always be equal to
`expected_ty`; when it isn't, we've reported a type error and are just
trying to check for more errors.
Most callers ignored the returned type but in some cases we used it to
try to avoid emitting useless error messages. I've preserved that
behavior for bind-patterns.
For and-patterns, the returned type looked like it was being used, but
because `expected_ty` was never `None`, the fallback of "fill in with
the sub-pattern's type" never fired. So I've deleted that fallback.
Finally, this reverts #4915 (9d99eff6f9), which was introduced to
flatten nested and-patterns in order to simplify overlap checking.
However, the visitor trait used by trie_again effectively
flattens and-patterns anyway, so the current representation used for
overlap checking doesn't need this any more.
* Use is_ascii_digit and is_ascii_hexdigit in the ISLE lexer
* Use range pattern in ISLE lexer
* Use a couple of shorthands in the ISLE parser
* Use parse_ident instead of symbol + str_to_ident
* Introduce token eating API
This is a non-fatal version of the take API
* Rename take to expect and add expect_ prefixes to several methods
* Review comments
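
To illustrate the first two lexer changes in the list above, here is a minimal sketch. It is not the actual ISLE lexer: `is_sym_first_char` and `parse_int_literal` are hypothetical helpers, and the set of characters accepted at the start of a symbol is an assumption. Only the standard-library methods `is_ascii_digit`, `is_ascii_hexdigit`, and range patterns are taken as given.

```rust
/// Hypothetical helper: may `c` start a symbol?
/// A range pattern replaces comparison chains such as
/// `c >= b'a' && c <= b'z'`. The accepted set here is an assumption.
fn is_sym_first_char(c: u8) -> bool {
    matches!(c, b'a'..=b'z' | b'A'..=b'Z' | b'_' | b'$')
}

/// Hypothetical helper: parse a decimal or `0x`-prefixed hex literal.
fn parse_int_literal(s: &str) -> Option<u64> {
    let (digits, radix) = match s.strip_prefix("0x") {
        Some(hex) => (hex, 16),
        None => (s, 10),
    };
    // The standard-library predicates replace hand-written range checks.
    let valid = digits.bytes().all(|b| {
        if radix == 16 {
            b.is_ascii_hexdigit()
        } else {
            b.is_ascii_digit()
        }
    });
    if digits.is_empty() || !valid {
        return None;
    }
    u64::from_str_radix(digits, radix).ok()
}
```

The explicit validation is still useful because `from_str_radix` alone would accept a leading `+` sign.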
In multi-terms, all matching rules fire. We treat the result as an
unordered set of values, so setting rule priorities is meaningless. We
want to prohibit relying on the rule match order in this case.
Also, codegen can produce invalid Rust if rules with different
priorities both match against a multi-term. We first documented this
symptom in #5647. As far as I can figure, prohibiting rule priorities
prevents all possible instances of that bug.
At some point in the future we might decide we want to carefully define
semantics for multi-term result ordering, at which point we can revisit
this.
ISLE's existing code-generation strategy doesn't generate the most
efficient matching order for rules. This PR completely replaces it.
With this PR applied, wasmtime compile retires 2% fewer instructions on
the pulldown-cmark and spidermonkey benchmarks from Sightglass.
A dev build of cranelift-codegen from an empty target/ directory takes
2% less time. The build script, invoking ISLE, takes a little longer,
but Rust can compile the generated code faster, so it balances out.
...not just the ones at the outer scope of a rule.
Thanks to @avanhatt for pointing out that #5221 didn't capture as much
information as I intended it to.
* Multi-extractors should only be used in multi-terms
* ISLE int literals should be in range for their type
See #5431 and #5423.
* Make StableSet usable in public interfaces
Also implement an immutable version of DisjointSets::find_mut.
* Return analyzed terms from overlap check
If the caller wants the `trie_again::RuleSet` for a term, don't make
them recompute it.
* Expose binding lookups and sources
* Don't dedup or prune impure constructor calls
* Record int types for bindings and constraints
This means that bindings for constant integers that have the same value
but not the same type no longer hash-cons into the same binding ID.
* Track binding sites from calling multi-terms
* Implement more traits
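
The effect of recording int types on hash-consing can be sketched as follows. This is a simplified illustration, not the real `trie_again` data structures: `TypeId`, `BindingId`, `Binding`, and `Bindings` here are hypothetical stand-ins, and only a single binding variant is modeled.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for ISLE's identifier types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct TypeId(usize);
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct BindingId(usize);

/// Because `ty` is part of the key, two constants with the same value
/// but different types are distinct bindings.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Binding {
    ConstInt { val: i128, ty: TypeId },
}

#[derive(Default)]
struct Bindings {
    ids: HashMap<Binding, BindingId>,
}

impl Bindings {
    /// Hash-cons: return the existing ID for an equal binding,
    /// or allocate the next ID if this binding is new.
    fn intern(&mut self, binding: Binding) -> BindingId {
        let next = BindingId(self.ids.len());
        *self.ids.entry(binding).or_insert(next)
    }
}
```

Interning `ConstInt { val: 1, ty: TypeId(0) }` twice yields the same ID, while the same value with `TypeId(1)` gets a fresh one.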
It turns out that during codegen I'll want to know which bindings were
added for a particular constraint. Factoring that out and making sure to
use it everywhere that constraints and bindings are created ensures that
these will always stay in sync. It also simplifies the implementation of
`normalize_equivalence_classes`, which needs to create bindings for
constraints but doesn't care what they are.
Also make `add_pattern_constraints` non-recursive and reuse allocations.
* cranelift-isle: Add "partial" flag for constructors
Instead of tying fallibility of constructors to whether they're either
internal or pure, this commit assumes all constructors are infallible
unless tagged otherwise with a "partial" flag.
Internal constructors without the "partial" flag are not allowed to use
constructors which have the "partial" flag on the right-hand side of any
rules, because they have no way to report last-minute match failures.
Multi-constructors should never be "partial"; they report match failures
with an empty iterator instead. In turn this means you can't use partial
constructors on the right-hand side of internal multi-constructor rules.
However, you can use the same constructors on the left-hand side with
`if` or `if-let` instead.
In many cases, ISLE can already trivially prove that an internal
constructor always returns `Some`. With this commit, those cases are
largely unchanged, except for removing all the `Option`s and `Some`s
from the generated code for those terms.
However, for internal non-partial constructors where ISLE could not
prove that, it now emits an `unreachable!` panic as the last-resort,
instead of returning `None` like it used to do. Among the existing
backends, here's how many constructors have these panic cases:
- x64: 14% (53/374)
- aarch64: 15% (41/277)
- riscv64: 23% (26/114)
- s390x: 47% (268/567)
It's often possible to rewrite rules so that ISLE can tell the panic can
never be hit. Just ensure that there's a lowest-priority rule which has
no constraints on the left-hand side.
But in many of these constructors, it's difficult to statically prove
that the unhandled cases are unreachable, because that depends on
knowledge about how they're called or on other preconditions.
So this commit does not try to enforce that all terms have a last-resort
fallback rule.
* Check term flags while translating expressions
Instead of doing it in a separate pass afterward.
This involved threading all the term flags (pure, multi, partial)
through the recursive `translate_expr` calls, so I extracted the flags
to a new struct so they can all be passed together.
* Validate multi-term usage
Now that I've threaded the flags through `translate_expr`, it's easy to
check this case too, so let's just do it.
* Extract `ReturnKind` to use in `ExternalSig`
There are only three legal states for the combination of `multi` and
`infallible`, so replace those fields of `ExternalSig` with a
three-state enum.
* Remove `Option` wrapper from multi-extractors too
If we'd had any external multi-constructors this would correct their
signatures as well.
* Update ISLE tests
* Tag prelude constructors as pure where appropriate
I believe the only reason these weren't marked `pure` before was that
doing so would have implied that they're also partial. Now that those
two states are specified separately, we can apply this flag in more places.
* Fix my changes to aarch64 `lower_bmask` and `imm` terms
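
The three-state idea behind `ReturnKind` can be sketched like this. The variant names and the `return_kind` helper are assumptions made for illustration. The mapping is expressed in terms of the `multi` and `partial` flags described above, where a multi-term that is also partial is the one illegal combination.

```rust
/// Sketch of a three-state return kind; variant names are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReturnKind {
    /// Infallible, single result: returns `T`.
    Plain,
    /// Partial, single result: returns `Option<T>`.
    Option,
    /// Multi-term: returns an iterator, empty when nothing matches.
    Iterator,
}

/// Hypothetical helper mapping two flags onto the three legal states.
/// Returns `None` for the illegal combination (multi and partial).
fn return_kind(multi: bool, partial: bool) -> Option<ReturnKind> {
    match (multi, partial) {
        (false, false) => Some(ReturnKind::Plain),
        (false, true) => Some(ReturnKind::Option),
        (true, false) => Some(ReturnKind::Iterator),
        (true, true) => None,
    }
}
```

Storing the enum instead of two booleans means the illegal fourth state cannot be represented at all once a signature has been built.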
* Add release notes for 3.0.1
* Update some version directives for crates in Wasmtime
* Mark anything with `publish = false` as version 0.0.0
* Mark the icache coherence crate with the same version as Wasmtime
* Fix manifest directives
Some of our ISLE rules can never fire because there's a higher-priority
rule that will always fire instead.
Sometimes the worst that can happen is we generate sub-optimal output.
That's not so bad but we'd still like to know about it so we can fix it.
In other cases there might be instructions which can't be lowered in
isolation. If a general rule for lowering one of the instructions is
higher-priority than the rule for lowering the combined sequence, then
lowering the combined sequence will always fail.
Either way, this is always a bug, so make it a fatal error if we can
detect it.
Ulrich Weigand identified two bugs in this code due to it falsely
claiming there were unreachable rules in the s390x backend. The fixes
are:
- Add constraints for pure constructors.
I didn't notice that a constructor which is declared pure (which
currently implies that it is fallible), when used on the left-hand side
of a rule, can cause the rule to fail to match. Therefore, any
constructors on the left-hand side must be noted as additional
constraints on the rule, so that overlap checking can see them.
- Ignore subset-overlaps for rules with equality constraints
This eliminates false positives when checking for unreachable rules. It
introduces false negatives instead, but we would rather miss an error
than claim that valid input is wrong. We can implement a more accurate
check later.
- Remove remaining references to Miette
- Borrow implementation of `line_starts` from codespan-reporting
- Clean up a use of `Result` that no longer conflicts with a local
definition
- When printing plain errors, add a blank line between errors for
readability
There were several issues with ISLE's existing error reporting
implementation.
- When using Miette for more readable error reports, it would panic if
errors were reported from multiple files in the same run.
- Miette is pretty heavy-weight for what we're doing, with a lot of
dependencies.
- The `Error::Errors` enum variant led to normalization steps in many
places, to avoid using that variant to represent a single error.
This commit:
- replaces Miette with codespan-reporting
- gets rid of a bunch of cargo-vet exemptions
- replaces the `Error::Errors` variant with a new `Errors` type
- removes source info from `Error` variants so they're easy to construct
- adds source info only when formatting `Errors`
- formats `Errors` with a custom `Debug` impl
- shares common code between ISLE's callers, islec and cranelift-codegen
- includes a source snippet even with fancy-errors disabled
I tried to make this a series of smaller commits but I couldn't find any
good split points; everything was too entangled with everything else.
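
The split between position-only errors and a container that adds source info at formatting time can be sketched as below. This is a simplified illustration under stated assumptions, not the real `cranelift-isle` types: the field names, the `Pos` layout, and the `file:line: message` output format are all hypothetical, and no codespan-reporting calls are used.

```rust
use std::fmt;

/// Hypothetical position: index into `Errors::files` plus a byte offset.
struct Pos {
    file: usize,
    offset: usize,
}

/// An individual error carries only a message and a position,
/// so it is cheap to construct.
struct Error {
    msg: String,
    pos: Pos,
}

/// The container owns the source text, so file names and line numbers
/// are looked up only when the errors are formatted.
struct Errors {
    errors: Vec<Error>,
    /// (file name, file contents) pairs.
    files: Vec<(String, String)>,
}

impl fmt::Debug for Errors {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for e in &self.errors {
            let (name, text) = &self.files[e.pos.file];
            // 1-based line number: count newlines before the offset.
            let end = e.pos.offset.min(text.len());
            let line = text.as_bytes()[..end]
                .iter()
                .filter(|&&b| b == b'\n')
                .count()
                + 1;
            writeln!(f, "{}:{}: {}", name, line, e.msg)?;
        }
        Ok(())
    }
}
```

Because each error names its file by index, errors from several files can be reported in one run.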
In #5174 we decided it doesn't make sense for a rule to have a
bind-pattern at the root of its left-hand side. There's no Rust value
corresponding to the root value of such a term, because it actually
represents a function declaration with one or more arguments.
This commit takes that to its logical conclusion.
`sema::Rule` previously had an `lhs` field whose value must always be a
`Pattern::Term` variant, and anyone using that structure had to deal
with the possibility of finding the wrong variant there.
Now the relevant fields from that variant are stored directly in `Rule`
instead. Also, the (tiny!) portion of `translate_pattern` which applied
when the pattern was the root term is now inlined in `collect_rules`.
Because `translate_pattern` no longer has to special-case the root term,
we can delete its `rule_term` and `is_root` arguments. That brings it
down to a more manageable four arguments, which means many calls fit on
one line now.
As it turns out, that distinction was not necessary for this
representation. Removing it eliminates some complexity around wrapping
expressions as bindings and vice versa. It also clears up some confusion
about which category to put certain constructs in (arguments and
extractors) by refusing to have different categories.
While I was writing this patch I also realized that `add_match_variant`
and `normalize_equivalence_classes` both need to do fundamentally the
same things with enum variants, so I refactored them to share code and
make their relationship clearer.
Finally, I reviewed all the comments in this file and fixed some places
where they could be more clear.
* cranelift-isle: New IR and revised overlap checks
* Improve error reporting
* Avoid "unused argument" warnings a nicer way
* Remove unused fields
* Minimize diff and "fix" error handling
I had tried to use Miette "right" and made things worse somehow. Among
other changes, revert all my changes to unrelated parts of `error.rs`
and `error_miette.rs`.
* Review comments: Rename "unmatchable" to "unreachable"
* Review comments: newtype wrappers, not type aliases
* Review comments: more comments on overlap checks
* Review comments: Clarify `normalize_equivalence_classes`
* Review comments: use union-find instead of linked list
This saves about 50 lines of code in the trie_again module. The
union-find implementation is about twice as long as that, counting
comments and doc-tests, but that's a worthwhile tradeoff.
However, this makes `normalize_equivalence_classes` slower, because now
finding all elements of an equivalence class takes time linear in the
total size of all equivalence classes. If that ever turns out to be a
problem in practice we can find some way to optimize `remove_set_of`.
* Review comments: Hide constraints HashMap
We want to enforce that consumers of this representation can't observe
non-deterministic ordering in any of its public types.
* Review comments: Normalize equivalence classes incrementally
I'm not sure whether this is a good idea. It doesn't make the logic
particularly simpler, and I think it will do more work if three or more
binding sites with enum-variant constraints get set equal to each other.
* More comments and other clarifications
* Revert "Review comments: Normalize equivalence classes incrementally"
* Even more comments
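
For readers unfamiliar with union-find, here is a minimal sketch of the data structure and of why enumerating one equivalence class is linear in the total number of elements. It is not the implementation in this repository: it uses plain `usize` indices, and `add`, `merge`, and `members_of` are hypothetical names (only `find_mut` is mentioned elsewhere in this log).

```rust
/// Minimal union-find over indices 0..n.
#[derive(Default)]
struct DisjointSets {
    parent: Vec<usize>,
    size: Vec<usize>,
}

impl DisjointSets {
    /// Add a new element in its own singleton set.
    fn add(&mut self) -> usize {
        let id = self.parent.len();
        self.parent.push(id);
        self.size.push(1);
        id
    }

    /// Immutable lookup: follow parent links without compressing.
    fn find(&self, mut x: usize) -> usize {
        while self.parent[x] != x {
            x = self.parent[x];
        }
        x
    }

    /// Mutable lookup with path halving to shorten future lookups.
    fn find_mut(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent;
            x = grandparent;
        }
        x
    }

    /// Union by size.
    fn merge(&mut self, a: usize, b: usize) {
        let (mut a, mut b) = (self.find_mut(a), self.find_mut(b));
        if a == b {
            return;
        }
        if self.size[a] < self.size[b] {
            std::mem::swap(&mut a, &mut b);
        }
        self.parent[b] = a;
        self.size[a] += self.size[b];
    }

    /// Enumerating one class scans every element, which is the
    /// linear cost discussed above.
    fn members_of(&self, x: usize) -> Vec<usize> {
        let root = self.find(x);
        (0..self.parent.len())
            .filter(|&y| self.find(y) == root)
            .collect()
    }
}
```

A linked-list representation makes `members_of` proportional to the class size, at the cost of more bookkeeping on every merge.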
This is a common pattern in sema, so factor it out.
Since this version uses `intern` instead of `intern_mut`, it might be a
tiny bit faster when errors occur, because it doesn't write names into
maps in that case. When no error occurs, ISLE should do exactly the same
work with or without this commit.
The `is_root` flag to `translate_pattern` just determines whether the
`rule_term` argument is used, which calls for a larger cleanup. But that
cleanup is less clear if `is_root` is set anywhere aside from the call
in `collect_rules`. So I wanted to get confirmation that this particular
use of that flag is incorrect first.
These two arguments (`is_root` and `rule_term`) are used to prevent
expansion of a term as an internal extractor ("macro") if:
- that term is also an internal constructor
- and it's the root term on the left-hand side of the current rule
- and the pattern we're currently translating has no parents.
I'm not sure what it should mean to use the term you're currently
defining as the root pattern on the left-hand side of an if-let in the
same rule, but I don't think it should have this particular special
treatment.
It's nice to be able to report these names after sema analysis completes
so rule authors can recognize which names they used.
This isn't used anywhere yet, but I'm planning to use it during codegen,
and the rule-verification folks wanted something like this for debugging
output.
While checking the call graph of extractors during semantic validation,
save `TermId` instead of `Sym`. The types are both just integer indexes,
but the `TermId` is more useful here. Saving it avoids needing to check
for failed map lookups twice, which simplifies the implementation.
This makes some rather tricky analysis available to other users besides
the current IR. It shouldn't change current behavior, except if a rule
attempts to bind its root term to a name. There's no Rust value for a
root term, so the existing code silently ignored such bindings and would
panic saying "Variable should already be bound" if a rule attempted to
use such bindings. With this commit, the initial attempt to bind the
name reports the error instead.
One big change here is to stop using `Term::extractor_sig`, which was
the only call that used a `TypeEnv`. However that function only uses
type information to construct the fully-qualified name of the extractor,
which is not used when building the IR. So removing it and removing the
now-unused `typeenv` parameters removes all uses of `TypeEnv` from the
`ir` and `trie` modules.
In addition, this completes the changes started in "More consistent use
of `add_inst`" (e63771f2d9), by always
using `add_inst` to get an `InstId`.
I also removed a number of unnecessary intermediate allocations.
Now that we aren't trying to do overlap checking in parallel, we can
fuse the loop that generates a list of rule pairs with the loop that
checks those pairs.
Removing the intermediate vector of pairs should save a little time and
memory. But it also means we're no longer borrowing from the `by_term`
HashMap, so we can use `into_iter` instead of `values` to move ownership
out of the map. That in turn means that we can use `into_iter` on each
vector of rules as well, which turns out to offer a slightly nicer idiom
for looping over all pairs, and also means we drop allocations as soon
as possible.
I also pushed grouping by priority earlier, so the O(n^2) all-pairs loop
runs over smaller lists. If we later find we want to know about overlaps
across different priorities, the definition of the map key is an easy
place to make that change.
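
The fused loop described above might look roughly like the following sketch. The `Rule` struct, its fields, and the `for_each_pair` function are hypothetical simplifications, and the map key of `(term, prio)` is an assumption based on the description of grouping by priority. The all-pairs idiom relies on `std::vec::IntoIter::as_slice`, which exposes the elements not yet consumed.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified rule: just enough to group and identify it.
#[derive(Debug)]
struct Rule {
    term: u32,
    prio: i64,
    id: u32,
}

/// Group rules by (term, priority), then visit every unordered pair
/// within each group, without building an intermediate vector of pairs.
fn for_each_pair(rules: Vec<Rule>, mut check: impl FnMut(&Rule, &Rule)) {
    let mut by_term: HashMap<(u32, i64), Vec<Rule>> = HashMap::new();
    for rule in rules {
        by_term.entry((rule.term, rule.prio)).or_default().push(rule);
    }

    // `into_iter` moves each group out of the map, so its allocation
    // is dropped as soon as the group has been processed.
    for (_key, group) in by_term.into_iter() {
        let mut group = group.into_iter();
        while let Some(left) = group.next() {
            // `as_slice` yields the rules that come after `left`.
            for right in group.as_slice() {
                check(&left, right);
            }
        }
    }
}
```

Note that `HashMap` iteration order is unspecified, so the groups are visited in an arbitrary order, although pairs within a group follow insertion order.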
Using rayon adds a lot of dependencies to Cranelift. The code that uses
rayon takes less than half a second in total when run unparallelized,
and it runs at compile time, so there is pretty much no benefit to
parallelizing it.
* egraph-based midend: draw the rest of the owl.
* Rename `egg` submodule of cranelift-codegen to `egraph`.
* Apply some feedback from @jsharp during code walkthrough.
* Remove recursion from find_best_node by doing a single pass.
Rather than recursively computing the lowest-cost node for a given
eclass and memoizing the answer at each eclass node, we can do a single
forward pass; because every eclass node refers only to earlier nodes,
this is sufficient. The behavior may slightly differ from the earlier
behavior because we cannot short-circuit costs to zero once a node is
elaborated; but in practice this should not matter.
* Make elaboration non-recursive.
Use an explicit stack instead (with `ElabStackEntry` entries,
alongside a result stack).
* Make elaboration traversal of the domtree non-recursive/stack-safe.
* Work analysis logic in Cranelift-side egraph glue into a general analysis framework in cranelift-egraph.
* Apply static recursion limit to rule application.
* Fix aarch64 wrt dynamic-vector support -- broken rebase.
* Topo-sort cranelift-egraph before cranelift-codegen in publish script, like the comment instructs me to!
* Fix multi-result call testcase.
* Include `cranelift-egraph` in `PUBLISHED_CRATES`.
* Fix atomic_rmw: not really a load.
* Remove now-unnecessary PartialOrd/Ord derivations.
* Address some code-review comments.
* Review feedback.
* Review feedback.
* No overlap in mid-end rules, because we are defining a multi-constructor.
* rustfmt
* Review feedback.
* Review feedback.
* Review feedback.
* Review feedback.
* Remove redundant `mut`.
* Add comment noting what rules can do.
* Review feedback.
* Clarify comment wording.
* Update `has_memory_fence_semantics`.
* Apply @jameysharp's improved loop-level computation.
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Fix suggestion commit.
* Fix off-by-one in new loop-nest analysis.
* Review feedback.
* Review feedback.
* Review feedback.
* Use `Default`, not `std::default::Default`, as per @fitzgen
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Apply @fitzgen's comment elaboration to a doc-comment.
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Add stat for hitting the rewrite-depth limit.
* Some code motion in split prelude to make the diff a little clearer wrt `main`.
* Take @jameysharp's suggested `try_into()` usage for blockparam indices.
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Take @jameysharp's suggestion to avoid double-match on load op.
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Fix suggestion (add import).
* Review feedback.
* Fix stack_load handling.
* Remove redundant can_store case.
* Take @jameysharp's suggested improvement to FuncEGraph::build() logic
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Tweaks to FuncEGraph::build() on top of suggestion.
* Take @jameysharp's suggested clarified condition
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Clean up after suggestion (unused variable).
* Fix loop analysis.
* loop level asserts
* Revert constant-space loop analysis -- edge cases were incorrect, so let's go with the simple thing for now.
* Take @jameysharp's suggestion re: result_tys
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Fix up after suggestion
* Take @jameysharp's suggestion to use fold rather than reduce
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Fixup after suggestion
* Take @jameysharp's suggestion to remove elaborate_eclass_use's return value.
* Clarifying comment in terminator insts.
Co-authored-by: Jamey Sharp <jamey@minilop.net>
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
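
The single-pass cost computation described in the "Remove recursion from find_best_node" item above can be sketched as follows. This is an illustration under simplifying assumptions, not the cranelift-egraph data structures: `Node`, its variants, and `best_costs` are hypothetical, and cost is modeled as a plain sum of an operator's own cost and its arguments' best costs.

```rust
/// Hypothetical eclass entry. Every index stored in an entry refers
/// to an earlier entry in the same array.
enum Node {
    /// An operator with its own cost and argument entries.
    Op { cost: u32, args: Vec<usize> },
    /// The union of two earlier entries: either may be chosen.
    Union(usize, usize),
}

/// Compute the best cost of every entry in one forward pass.
/// No recursion or memoization is needed, because the costs of all
/// referenced entries are already in `best` when an entry is visited.
fn best_costs(nodes: &[Node]) -> Vec<u32> {
    let mut best: Vec<u32> = Vec::with_capacity(nodes.len());
    for node in nodes {
        let cost = match node {
            Node::Op { cost, args } => args
                .iter()
                .fold(*cost, |sum, &arg| sum.saturating_add(best[arg])),
            Node::Union(a, b) => best[*a].min(best[*b]),
        };
        best.push(cost);
    }
    best
}
```

In this model an argument used twice is counted twice, which mirrors the remark above that costs can no longer be short-circuited to zero once a node is elaborated.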
Resolve overlap in the ISLE prelude and the x64 inst module by introducing new types that allow better sharing of extractor results, or by falling back on priorities.
* Leverage Cargo's workspace inheritance feature
This commit is an attempt to reduce the complexity of the Cargo
manifests in this repository with Cargo's workspace-inheritance feature
becoming stable in Rust 1.64.0. This feature allows specifying fields in
the root workspace `Cargo.toml` which are then reused throughout the
workspace. For example this PR shares definitions such as:
* All of the Wasmtime-family of crates now use `version.workspace =
true` to have a single location which defines the version number.
* All crates use `edition.workspace = true` to have one default edition
for the entire workspace.
* Common dependencies are listed in `[workspace.dependencies]` to avoid
typing the same version number in a lot of different places (e.g. the
`wasmparser = "0.89.0"` is now in just one spot).
Currently the workspace-inheritance feature doesn't allow having two
different versions to inherit, so all of the Cranelift-family of crates
still manually specify their version. The inter-crate dependencies,
however, are shared amongst the root workspace.
This feature can be seen as a method of "preprocessing" of sorts for
Cargo manifests. This will help us develop Wasmtime but shouldn't have
any actual impact on the published artifacts -- everything's dependency
lists are still the same.
* Fix wasi-crypto tests
* ISLE: add support for multi-extractors and multi-constructors.
This support allows for rules that process multiple matching values per
extractor call on the left-hand side, and as a result, can produce
multiple values from the constructor whose body they define.
This is useful in situations where we are matching on an input data
structure that can have multiple "nodes" for a given value or ID, for
example in an e-graph.
* Review feedback: all multi-ctors and multi-etors return iterators; no `Vec` case.
* Add additional warning suppressions to generated-code toplevels to be consistent with new islec output.
This commit replaces #4869 and represents the actual version bump that
should have happened had I remembered to bump the in-tree version of
Wasmtime to 1.0.0 prior to the branch-cut date. Alas!
This was likely a copy-paste from the `ast::Pattern` case, but here it
is checking a term name in `ast::Expr` and so should say "... in
expression", not "... in pattern".