Ulrich Weigand identified two bugs in this code due to it falsely
claiming there were unreachable rules in the s390x backend. The fixes
are:
- Add constraints for pure constructors.
I didn't notice that a constructor which is declared pure (which
currently implies that it is fallible), when used on the left-hand side
of a rule, can cause the rule to fail to match. Therefore, any
constructors on the left-hand side must be noted as additional
constraints on the rule, so that overlap checking can see them.
- Ignore subset-overlaps for rules with equality constraints
This eliminates false positives when checking for unreachable rules. It
introduces false negatives instead but we prefer to fail to detect an
error instead of claiming that valid input is wrong. We can implement a
more accurate check later.
- Remove remaining references to Miette
- Borrow implementation of `line_starts` from codespan-reporting
- Clean up a use of `Result` that no longer conflicts with a local
definition
- When printing plain errors, add a blank line between errors for
readability
Fix shadowing identified in #5322 for imul and swiden_high/swiden_low/uwiden_high/uwiden_low combinations in the x64 backend, and remove some redundant rules from the aarch64 dynamic neon ruleset. Additionally, add tests to the x64 backend showing that the imul specializations are firing.
Add assertions to the OperandCollector that show we're not using pinned vregs, and use reg_fixed_nonallocatable constraints when a real register is used with other constraint generation functions like reg_use etc.
There were several issues with ISLE's existing error reporting
implementation.
- When using Miette for more readable error reports, it would panic if
errors were reported from multiple files in the same run.
- Miette is pretty heavy-weight for what we're doing, with a lot of
dependencies.
- The `Error::Errors` enum variant led to normalization steps in many
places, to avoid using that variant to represent a single error.
This commit:
- replaces Miette with codespan-reporting
- gets rid of a bunch of cargo-vet exemptions
- replaces the `Error::Errors` variant with a new `Errors` type
- removes source info from `Error` variants so they're easy to construct
- adds source info only when formatting `Errors`
- formats `Errors` with a custom `Debug` impl
- shares common code between ISLE's callers, islec and cranelift-codegen
- includes a source snippet even with fancy-errors disabled
I tried to make this a series of smaller commits but I couldn't find any
good split points; everything was too entangled with everything else.
* Cranelift: Define `heap_load` and `heap_store` instructions
* Cranelift: Implement interpreter support for `heap_load` and `heap_store`
* Cranelift: Add a suite runtests for `heap_{load,store}`
There are so many knobs we can twist for heaps and I wanted to exhaustively test
all of them, so I wrote a script to generate the tests. I've checked in the
script in case we want to make any changes in the future, but I don't think it
is worth adding this to CI to check that scripts are up to date or anything like
that.
* Review feedback
In #5174 we decided it doesn't make sense for a rule to have a
bind-pattern at the root of its left-hand side. There's no Rust value
corresponding to the root value of such a term, because it actually
represents a function declaration with one or more arguments.
This commit takes that to its logical conclusion.
`sema::Rule` previously had an `lhs` field whose value must always be a
`Pattern::Term` variant, and anyone using that structure had to deal
with the possibility of finding the wrong variant there.
Now the relevant fields from that variant are stored directly in `Rule`
instead. Also, the (tiny!) portion of `translate_pattern` which applied
when the pattern was the root term is now inlined in `collect_rules`.
Because `translate_pattern` no longer has to special-case the root term,
we can delete its `rule_term` and `is_root` arguments. That brings it
down to a more manageable four arguments, which means many calls fit on
one line now.
As it turns out, that distinction was not necessary for this
representation. Removing it eliminates some complexity around wrapping
expressions as bindings and vice versa. It also clears up some confusion
about which category to put certain constructs in (arguments and
extractors) by refusing to have different categories.
While I was writing this patch I also realized that `add_match_variant`
and `normalize_equivalence_classes` both need to do fundamentally the
same things with enum variants, so I refactored them to share code and
make their relationship clearer.
Finally, I reviewed all the comments in this file and fixed some places
where they could be more clear.
* Shrink the size of SigData in Cranelift
* Update cranelift/codegen/src/machinst/abi.rs
Co-authored-by: Jamey Sharp <jamey@minilop.net>
* Change ret arg length to u16
* Add test
Co-authored-by: Jamey Sharp <jamey@minilop.net>
Avoid naming %rcx as written by the CoffTlsGetAddr pseudo-instruction in the x64 backend, and instead emit a fixed-def constraint for a fresh VReg and %rcx.
Remove some unnecessary moves in the x64 gen_memcpy implementation -- the call instruction that's generated will already constrain the args to those registers.
* cranelift: Cleanup `fdemote`/`fpromote` tests
* cranelift: Fix `fdemote`/`fpromote` instruction docs
The verifier fails if the input and output types are the same
for these instructions
* cranelift: Fix `fdemote`/`fpromote` in the interpreter
* fuzzgen: Add `fdemote`/`fpromote`
We can encode more constants into 12-bit immediates if we do the following
rewrite for comparisons with odd constants:
A >= B + 1
==> A - 1 >= B
==> A > B
* cranelift-isle: New IR and revised overlap checks
* Improve error reporting
* Avoid "unused argument" warnings a nicer way
* Remove unused fields
* Minimize diff and "fix" error handling
I had tried to use Miette "right" and made things worse somehow. Among
other changes, revert all my changes to unrelated parts of `error.rs`
and `error_miette.rs`.
* Review comments: Rename "unmatchable" to "unreachable"
* Review comments: newtype wrappers, not type aliases
* Review comments: more comments on overlap checks
* Review comments: Clarify `normalize_equivalence_classes`
* Review comments: use union-find instead of linked list
This saves about 50 lines of code in the trie_again module. The
union-find implementation is about twice as long as that, counting
comments and doc-tests, but that's a worth-while tradeoff.
However, this makes `normalize_equivalence_classes` slower, because now
finding all elements of an equivalence class takes time linear in the
total size of all equivalence classes. If that ever turns out to be a
problem in practice we can find some way to optimize `remove_set_of`.
* Review comments: Hide constraints HashMap
We want to enforce that consumers of this representation can't observe
non-deterministic ordering in any of its public types.
* Review comments: Normalize equivalence classes incrementally
I'm not sure whether this is a good idea. It doesn't make the logic
particularly simpler, and I think it will do more work if three or more
binding sites with enum-variant constraints get set equal to each other.
* More comments and other clarifications
* Revert "Review comments: Normalize equivalence classes incrementally"
* Even more comments
This is a common pattern in sema, so factor it out.
Since this version uses `intern` instead of `intern_mut`, it might be a
tiny bit faster when errors occur due to not writing names into maps
then. When no error occurs, ISLE should do exactly the same work with or
without this commit.
Before, we would do a `heap_addr` to translate the given Wasm memory address
into a native memory address and pass it into the libcall that implemented the
atomic operation, which would then treat the address as a Wasm memory address
and pass it to `validate_atomic_addr` to be bounds checked a second time. This
is a bit nonsensical, as we are validating a native memory address as if it were
a Wasm memory address.
Now, we no longer do a `heap_addr` to translate the Wasm memory address to a
native memory address. Instead, we pass the Wasm memory address to the libcall,
and the libcall is responsible for doing the bounds check (by calling
`validate_atomic_addr` with the correct type of memory address now).
The `is_root` flag to `translate_pattern` just determines whether the
`rule_term` argument is used, which begs a larger cleanup. But that
cleanup is less clear if `is_root` is set anywhere aside from the call
in `collect_rules`. So I wanted to get confirmation that this particular
use of that flag is incorrect first.
These two arguments (`is_root` and `rule_term`) are used to prevent
expansion of a term as an internal extractor ("macro") if:
- that term is also an internal constructor
- and it's the root term on the left-hand side of the current rule
- and the pattern we're currently translating has no parents.
I'm not sure what it should mean to use the term you're currently
defining as the root pattern on the left-hand side of an if-let in the
same rule, but I don't think it should have this particular special
treatment.
It's nice to be able to report these names after sema analysis completes
so rule authors can recognize which names they used.
This isn't used anywhere yet, but I'm planning to use it during codegen,
and the rule-verification folks wanted something like this for debugging
output.
* Cranelift: Make `heap_addr` return calculated `base + index + offset`
Rather than return just the `base + index`.
(Note: I've chosen to use the nomenclature "index" for the dynamic operand and
"offset" for the static immediate.)
This move the addition of the `offset` into `heap_addr`, instead of leaving it
for the subsequent memory operation, so that we can Spectre-guard the full
address, and not allow speculative execution to read the first 4GiB of memory.
Before this commit, we were effectively doing
load(spectre_guard(base + index) + offset)
Now we are effectively doing
load(spectre_guard(base + index + offset))
Finally, this also corrects `heap_addr`'s documented semantics to say that it
returns an address that will trap on access if `index + offset + access_size` is
out of bounds for the given heap, rather than saying that the `heap_addr` itself
will trap. This matches the implemented behavior for static memories, and after
https://github.com/bytecodealliance/wasmtime/pull/5190 lands (which is blocked
on this commit) will also match the implemented behavior for dynamic memories.
* Update heap_addr docs
* Factor out `offset + size` to a helper
While checking the call graph of extractors during semantic validation,
save `TermId` instead of `Sym`. The types are both just integer indexes,
but the `TermId` is more useful here. Saving it avoids needing to check
for failed map lookups twice, which simplifies the implementation.
Modify return pseudo-instructions to have pairs of registers: virtual and real. This allows us to constrain the virtual registers to the real ones specified by the abi, instead of directly emitting moves to those real registers.
* Wasmtime+Cranelift: strip out some dead x86-32 code.
I was recently pointed to fastly/Viceroy#200 where it seems some folks
are trying to compile Wasmtime (via Viceroy) for Windows x86-32 and the
failures may not be loud enough. I've tried to reproduce this
cross-compiling to i686-pc-windows-gnu from Linux and hit build failures
(as expected) in several places. Nevertheless, while trying to discern
what others may be attempting, I noticed some dead x86-32-specific code
in our repo, and figured it would be a good idea to clean this up.
Otherwise, it (i) sends some mixed messages -- "hey look, this codebase
does support x86-32" -- and (ii) keeps untested code around, which is
generally not great.
This PR removes x86-32-specific cases in traphandlers and unwind code,
and Cranelift's native feature detection. It adds helpful compile-error
messages in a few cases. If we ever support x86-32 (contributors
welcome! The big missing piece is Cranelift support; see #1980), these
compile errors and git history should be enough to recover any knowledge
we are now encoding in the source.
I left the x86-32 support in `wasmtime-fiber` alone because that seems
like a bit of a special case -- foundation library, separate from the
rest of Wasmtime, with specific care to provide a (presumably working)
full 32-bit version.
* Remove some extraneous compile_error!s, already covered by others.
Remove the dependency on VCode for VReg allocation. This will simplify the changes in #5172, as that PR introduces the need to allocate temporary registers from the ABI context.
This change also allows us to remove some fields from VCode: reftyped_vregs_set and have_ref_values.
Add a MemFlags operand to the bitcast instruction, where only the
`big` and `little` flags are accepted. These define the lane order
to be used when casting between types of different lane counts.
Update all users to pass an appropriate MemFlags argument.
Implement lane swaps where necessary in the s390x back-end.
This is the final part necessary to fix
https://github.com/bytecodealliance/wasmtime/issues/4566.