Addresses #3809: when we are asked to create a Cranelift backend with
shared flags that indicate support for SIMD, we should check that the
ISA level needed for our SIMD lowerings is present.
In #3721, we have been discussing what to do about the ARM32 backend in
Cranelift. Currently, this backend supports only 32-bit types, which is
insufficient for full Wasm-MVP; it's missing other critical bits, like
floating-point support; and it has only ever been exercised, AFAIK, via
the filetests for the individual CLIF instructions that are implemented.
We were very very thankful for the original contribution of this
backend, even in its partial state, and we had hoped at the time that we
could eventually mature it in-tree until it supported e.g. Wasm and
other use-cases. But that hasn't yet happened -- to the blame of no-one,
to be clear, we just haven't had a contributor with sufficient time.
Unfortunately, the existence of the backend and lack of active
maintainer now potentially pose a bit of a burden as we hope to make
continuing changes to the backend framework. For example, the ISLE
migration, and the use of regalloc2 that it will allow, would need all
of the existing lowering patterns in the hand-written ARM32 backend to
be rewritten as ISLE rules.
Given that we don't currently have the resources to do this, we think
it's probably best if we, sadly, for now remove this partial backend.
This is not in any way a statement of what we might accept in the
future, though. If, in the future, an ARM32 backend updated to our
latest codebase with an active maintainer were to appear, we'd be happy
to merge it (and likewise for any other architecture!). But for now,
this is probably the best path. Thanks again to the original contributor
@jmkrauz and we hope that this work can eventually be brought back and
reused if someone has the time to do so!
* Update lots of `isa/*/*.clif` tests to `precise-output`
This commit goes through the `aarch64` and `x64` subdirectories and
subjectively changes tests from `test compile` to add `precise-output`.
This then auto-updates all the test expectations so they can be
automatically instead of manually updated in the future. Not all tests
were migrated, largely subject to the whims of myself, mainly looking to
see if the test was looking for specific instructions or just checking
the whole assembly output.
* Filter out `;;` comments from test expctations
Looks like the cranelift parser picks up all comments, not just those
trailing the function, so use a convention where `;;` is used for
human-readable-comments in test cases and `;`-prefixed comments are the
test expectation.
* cranelift: Add ability to auto-update test expectations
One of the problems of the current `*.clif` testing is that the files
are difficult to update when widespread changes are made (such as
removing modification of the frame pointer). Additionally when changing
register allocation or similar it can cause a large number of changes in
tests but the tests themselves didn't actually break. For this reason
this commit adds the ability to automatically update test expectations.
The idea behind this commit is that tests of the form `test compile` can
also optionally be flagged with the `precise-output` flag:
test compile precise-output
and when doing so the compiled form of each function is asserted to 100%
match the following comments and their test expectations. If a match is
not found then a `BLESS=1` environment variable can be used to
automatically rewrite the test file itself with the correct assertion.
If the environment variable isn't present and the expectation doesn't
match then the test fails.
It's hoped that, if approved, a follow-up commit can add
`precise-output` to all current `test compile` tests (or make it the
default) and all tests can be mass-updated. When developing locally test
expectations need not be written and instead tests can be run with
`BLESS=1` and the output can be manually verified. The environment
variable will not be present on CI which means that changes to the
output which don't also change the test expectation will cause CI to
fail. Furthermore this should still make updates to the test output
easily readable in review on CI because the test expectations are
intended to look the same as before.
Closes#1539
* Use raw vcode output in tests
* Fix a merge conflict
* Review comments
This commit translates the `rotl` and `rotr` lowerings already existing
to ISLE. The port was relatively straightforward with the biggest
changing being the instructions generated around i128 rotl/rotr
primarily due to register changes.
Peepmatic was an early attempt at a DSL for peephole optimizations, with the
idea that maybe sometime in the future we could user it for instruction
selection as well. It didn't really pan out, however:
* Peepmatic wasn't quite flexible enough, and adding new operators or snippets
of code implemented externally in Rust was a bit of a pain.
* The performance was never competitive with the hand-written peephole
optimizers. It was *very* size efficient, but that came at the cost of
run-time efficiency. Everything was table-based and interpreted, rather than
generating any Rust code.
Ultimately, because of these reasons, we never turned Peepmatic on by default.
These days, we just landed the ISLE domain-specific language, and it is better
suited than Peepmatic for all the things that Peepmatic was originally designed
to do. It is more flexible and easy to integrate with external Rust code. It is
has better time efficiency, meeting or even beating hand-written code. I think a
small part of the reason why ISLE excels in these things is because its design
was informed by Peepmatic's failures. I still plan on continuing Peepmatic's
mission to make Cranelift's peephole optimizer passes generated from DSL rewrite
rules, but using ISLE instead of Peepmatic.
Thank you Peepmatic, rest in peace!
With the old backends, this would log the lowered+legalized clif, but the log is
useles now with the new backends. Logging the disasm is the new moral
equivalent.
This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate.
Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't needed an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!
* cranelift: Add stack support to the interpreter
We also change the approach for heap loads and stores.
Previously we would use the offset as the address to the heap. However,
this approach does not allow using the load/store instructions to
read/write from both the heap and the stack.
This commit changes the addressing mechanism of the interpreter. We now
return the real addresses from the addressing instructions
(stack_addr/heap_addr), and instead check if the address passed into
the load/store instructions points to an area in the heap or the stack.
* cranelift: Add virtual addresses to cranelift interpreter
Adds a Virtual Addressing scheme that was discussed as a better
alternative to returning the real addresses.
The virtual addresses are split into 4 regions (stack, heap, tables and
global values), and the address itself is composed of an `entry` field
and an `offset` field. In general the `entry` field corresponds to the
instance of the resource (e.g. table5 is entry 5) and the `offset` field
is a byte offset inside that entry.
There is one exception to this which is the stack, where due to only
having one stack, the whole address is an offset field.
The number of bits in entry vs offset fields is variable with respect to
the `region` and the address size (32bits vs 64bits). This is done
because with 32 bit addresses we would have to compromise on heap size,
or have a small number of global values / tables. With 64 bit addresses
we do not have to compromise on this, but we need to support 32 bit
addresses.
* cranelift: Remove interpreter trap codes
* cranelift: Calculate frame_offset when entering or exiting a frame
* cranelift: Add safe read/write interface to DataValue
* cranelift: DataValue write full 128bit slot for booleans
* cranelift: Use DataValue accessors for trampoline.
* cranelift: Add heap support to filetest infrastructure
* cranelift: Explicit heap pointer placement in filetest annotations
* cranelift: Add documentation about the Heap directive
* cranelift: Clarify that heap filetests pointers must be laid out sequentially
* cranelift: Use wrapping add when computing bound pointer
* cranelift: Better error messages when invalid signatures are found for heap file tests.
The tests for the SIMD floating-point maximum and minimum operations
require particular care because the handling of the NaN values is
non-deterministic and may vary between platforms. There is no way to
match several NaN values in a test, so the solution is to extract the
non-deterministic test cases into a separate file that is subsequently
replicated for every backend under test, with adjustments made to the
expected results.
Copyright (c) 2021, Arm Limited.
Cranelift crates have historically been much more verbose with debug-level
logging than most other crates in the Rust ecosystem. We log things like how
many parameters a basic block has, the color of virtual registers during
regalloc, etc. Even for Cranelift hackers, these things are largely only useful
when hacking specifically on Cranelift and looking at a particular test case,
not even when using some Cranelift embedding (such as Wasmtime).
Most of the time, when people want logging for their Rust programs, they do
something like:
RUST_LOG=debug cargo run
This means that they get all that mostly not useful debug logging out of
Cranelift. So they might want to disable logging for Cranelift, or change it to
a higher log level:
RUST_LOG=debug,cranelift=info cargo run
The problem is that this is already more annoying to type that `RUST_LOG=debug`,
and that Cranelift isn't one single crate, so you actually have to play
whack-a-mole with naming all the Cranelift crates off the top of your head,
something more like this:
RUST_LOG=debug,cranelift=info,cranelift_codegen=info,cranelift_wasm=info,...
Therefore, we're changing most of the `debug!` logs into `trace!` logs: anything
that is very Cranelift-internal, unlikely to be useful/meaningful to the
"average" Cranelift embedder, or prints a message for each instruction visited
during a pass. On the other hand, things that just report a one line statistic
for a whole pass, for example, are left as `debug!`. The more verbose the log
messages are, the higher the bar they must clear to be `debug!` rather than
`trace!`.
* cranelift: Initial fuzzer implementation
* cranelift: Generate multiple test cases in fuzzer
* cranelift: Separate function generator in fuzzer
* cranelift: Insert random instructions in fuzzer
* cranelift: Rename gen_testcase
* cranelift: Implement div for unsigned values in interpreter
* cranelift: Run all test cases in fuzzer
* cranelift: Comment options in function_runner
* cranelift: Improve fuzzgen README.md
* cranelift: Fuzzgen remove unused variable
* cranelift: Fuzzer code style fixes
Thanks! @bjorn3
* cranelift: Fix nits in CLIF fuzzer
Thanks @cfallin!
* cranelift: Implement Arbitrary for TestCase
* cranelift: Remove gen_testcase
* cranelift: Move fuzzers to wasmtime fuzz directory
* cranelift: CLIF-Fuzzer ignore tests that produce traps
* cranelift: CLIF-Fuzzer create new fuzz target to validate generated testcases
* cranelift: Store clif-fuzzer config in a separate struct
* cranelift: Generate variables upfront per function
* cranelift: Prevent publishing of fuzzgen crate
Enabling runtests for the s390x backend exposed a pre-existing endian bug with handling bool test case return values.
These are written as integers of the same width by the trampoline, but are always read out as the Rust "bool" type. This happens to work on little-endian systems, but fails for any boolean type larger than 1 byte on big-endian systems.
See: https://github.com/bytecodealliance/wasmtime/pull/2964#issuecomment-855879866
This removes an existing dependency on the byteorder crate in favor of
using std equivalents directly.
While not an issue for wasmtime per se, cranelift is now part of the
critical path of building and testing Rust, and minimizing dependencies,
even small ones, can help reduce the time and bandwidth required.
This PR switches the default backend on x86, for both the
`cranelift-codegen` crate and for Wasmtime, to the new
(`MachInst`-style, `VCode`-based) backend that has been under
development and testing for some time now.
The old backend is still available by default in builds with the
`old-x86-backend` feature, or by requesting `BackendVariant::Legacy`
from the appropriate APIs.
As part of that switch, it adds some more runtime-configurable plumbing
to the testing infrastructure so that tests can be run using the
appropriate backend. `clif-util test` is now capable of parsing a
backend selector option from filetests and instantiating the correct
backend.
CI has been updated so that the old x86 backend continues to run its
tests, just as we used to run the new x64 backend separately.
At some point, we will remove the old x86 backend entirely, once we are
satisfied that the new backend has not caused any unforeseen issues and
we do not need to revert.
- Panic messages must now be string literals (we used `format!()` in
many places; `panic!()` can take format strings directly).
- Some dead enum options with EVEX encoding stuff in old x86 backend.
This will go away soon and/or be moved to the new backend anyway, so
let's silence the warning for now.
- A few other misc warnings.
* Rewrite interpreter generically
This change re-implements the Cranelift interpreter to use generic values; this makes it possible to do abstract interpretation of Cranelift instructions. In doing so, the interpretation state is extracted from the `Interpreter` structure and is accessed via a `State` trait; this makes it possible to not only more clearly observe the interpreter's state but also to interpret using a dummy state (e.g. `ImmutableRegisterState`). This addition made it possible to implement more of the Cranelift instructions (~70%, ignoring the x86-specific instructions).
* Replace macros with closures
As discussed in #2251, in order to be very confident that NaN signaling bits are correctly handled by the compiler, this switches `DataValue` to use Cranelift's `Ieee32` and `Ieee64` structures. This makes it a bit more inconvenient to interpreter Cranelift FP operations but this should change to something like `rustc_apfloat` in the future.
This adds a new feature experimental_x64 for CLIF tests.
A test is run in the new x64 backend iff:
- either the test doesn't have an x86_64 target requirement, signaling
it must be target agnostic or not run on this target.
- or the test does require the x86_64 target, and the test is marked
with the `experimental_x64` feature.
This required one workaround in the parser. The reason is that the
parser will try to use information not provided by the TargetIsa adapter
for the Mach backends, like register names. In particular, parsing test
may fail before the test runner realizes that the test must not be run.
In this case, we early return an almost-empty TestFile from the parser,
under the same conditions as above, so that the caller may filter out
the test properly.
This also copies two tests from the test suite using the new backend,
for demonstration purposes.
* clif-util: do not convert `anyhow::Error`s into strings into `anyhow::Error`s
* filetests: Use the debug formatting of `anyhow::Error`s
This provides the full error context, not just the source error's message.