This adds support for `.wat` tests in `cranelift-filetest`. The test runner
translates the WAT to Wasm and then uses `cranelift-wasm` to translate the Wasm
to CLIF.
These tests are always precise output tests. The test expectations can be
updated by running tests with the `CRANELIFT_TEST_BLESS=1` environment variable
set, similar to our compile precise output tests. The test's expected output is
contained in the last comment in the test file.
The tests allow for configuring the kinds of heaps used to implement Wasm linear
memory via TOML in a `;;!` comment at the start of the test.
To get ISA and Cranelift flags parsing available in the filetests crate, I had
to move the `parse_sets_and_triple` helper from the `cranelift-tools` binary
crate to the `cranelift-reader` crate, where I think it logically
fits.
Additionally, I had to make some more bits of `cranelift-wasm`'s dummy
environment `pub` so that I could properly wrap and compose it with the
environment used for the `.wat` tests. I don't think this is a big deal, but if
we eventually want to clean this stuff up, we can probably remove the dummy
environments completely, remove `translate_module`, and fold them into these new
test environments and test runner (since Wasmtime isn't using those things
anyways).
This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime.
After the suggestion of Chris, `Function` has been split into mostly two parts:
- on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`.
- on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on.
Most changes are here to accomodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache:
- most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set.
- user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- some refactorings have been made for function names:
- `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name.
- The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with.
The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions.
A basic fuzz target has been introduced that tries to do the bare minimum:
- check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function.
- check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache.
- This last check is less efficient and less likely to happen, so probably should be rethought a bit.
Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip.
Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement.
Fixes#4155.
* Cranellift: remove Baldrdash support and related features.
As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team
has recently determined that their current form of integration with
Cranelift is too hard to maintain, and they have chosen to remove it
from their codebase. If and when they decide to build updated support
for Cranelift, they will adopt different approaches to several details
of the integration.
In the meantime, after discussion with the SpiderMonkey folks, they
agree that it makes sense to remove the bits of Cranelift that exist
to support the integration ("Baldrdash"), as they will not need
them. Many of these bits are difficult-to-maintain special cases that
are not actually tested in Cranelift proper: for example, the
Baldrdash integration required Cranelift to emit function bodies
without prologues/epilogues, and instead communicate very precise
information about the expected frame size and layout, then stitched
together something post-facto. This was brittle and caused a lot of
incidental complexity ("fallthrough returns", the resulting special
logic in block-ordering); this is just one example. As another
example, one particular Baldrdash ABI variant processed stack args in
reverse order, so our ABI code had to support both traversal
orders. We had a number of other Baldrdash-specific settings as well
that did various special things.
This PR removes Baldrdash ABI support, the `fallthrough_return`
instruction, and pulls some threads to remove now-unused bits as a
result of those two, with the understanding that the SpiderMonkey folks
will build new functionality as needed in the future and we can perhaps
find cleaner abstractions to make it all work.
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425
* Review feedback.
* Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.
The debugger tests invoke `wasmtime` from within each test case under
the control of a debugger (gdb or lldb). Some of these tests started to
inexplicably fail in CI with unrelated changes, and the failures were
only inconsistently reproducible locally. It seems to be cache related:
if we disable cached compilation on the nested `wasmtime` invocations,
the tests consistently pass.
* Review feedback.
* Move `emit_to_memory` to `MachCompileResult`
This small refactoring makes it clearer to me that emitting to memory
doesn't require anything else from the compilation `Context`. While it's
a trivial change, it's a small public API change that shouldn't cause
too much trouble, and doesn't seem RFC-worthy. Happy to hear different
opinions about this, though!
* hide the MachCompileResult behind a method
* Add a `CompileError` wrapper type that references a `Function`
* Rename MachCompileResult to CompiledCode
* Additionally remove the last unsafe API in cranelift-codegen
Rather than sometimes using `file-per-thread-logger`.
Also remove the debug CLI flags, so that we can always just define
`RUST_LOG=...` to get logging and don't need to also do other things.
* Update to clap 3.0
This commit migrates all CLI commands internally used in this project
from structopt/clap2 to clap 3. The intent here is to ensure that we're
using maintained versions of the dependencies as structopt and clap 2
are less maintained nowadays. Most transitions were pretty
straightforward and mostly dealing with structopt/clap3 differences.
* Fix a number of `cargo deny` errors
This commit fixes a few errors around duplicate dependencies which
arose from the prior update to clap3. This also uses a new feature in
`deny.toml`, `skip-tree`, which allows having a bit more targeted
ignores for skips of duplicate version checks. This showed a few more
locations in Wasmtime itself where we could update some dependencies.
This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate.
Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't needed an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!
On @fitzgen's suggestion, this change adds a `--color` option for controlling whether the `clif-util` output prints with ANSI color escape sequences. Only `clif-util wasm ...` currently uses this new option. The option has three variants:
- `--color auto`, the default, prints colors if the terminal supports them
- `--color always` prints colors always
- `--color never` never prints colors
`clif-util wasm ...` has functionality to print extra information in color when verbose mode (`-v`) is specified. Previously, the ANSI color escape sequences were printed regardless of whether `-v` was used so that users that captured output of this command would have to remove escape sequences from their capture files. With this change, `clif-util wasm ...` will only print the ANSI color escape sequences when `-v` is used.
Adds support for addss and subss. This is the first lowering for
sse floating point alu and some move operations. The changes here do
some renaming of data structures and adds a couple of new ones
to support sse specific operations. The work done here will likely
evolve as needed to support an efficient, inituative, and consistent
framework.
This commit moves the cranelift tests and tools from the `wabt` crate on
crates.io (which compiles the wabt C++ codebase) to the `wat` crate on
crates.io which is a Rust parser for the `*.wat` format. This was
motivated by me noticing that release builds on Windows are ~5 minutes
longer than Linux builds, and local timing graphs showed that `wabt-sys`
was by far the longest build step in the build process.
This commit changes the `clif-util` binary where the `--enable-simd`
flag is no longer respected with the text format as input, since the
`wat` crate has no feature gating. This was already sort of not
respected, though, since `--enable-simd` wasn't consulted for binary
inputs which `clif-util` supports as well. If this isn't ok though then
it should be ok to close this PR!
This commit introduces initial support for multi-value Wasm. Wasm blocks and
calls can now take and return an arbitrary number of values.
The encoding for multi-value blocks means that we need to keep the contents of
the "Types" section around when translating function bodies. To do this, we
introduce a `WasmTypesMap` type that maps the type indices to their parameters
and returns, construct it when parsing the "Types" section, and shepherd it
through a bunch of functions and methods when translating function bodies.
-Add resumable_trap, safepoint, isnull, and null instructions
-Add Stackmap struct and StackmapSink trait
Co-authored-by: Mir Ahmed <mirahmed753@gmail.com>
Co-authored-by: Dan Gohman <sunfish@mozilla.com>
The result of the emitter is a vector of bytes holding machine code,
jump tables, and (in the future) other read-only data. Some clients,
notably Firefox's Wasm compiler, needs to separate the machine code
from the data in order to insert more code directly after the code
generated by Cranelift.
To make such separation possible, we record more information about the
emitted bytes: the sizes of each of the sections of code, jump tables,
and read-only data, as well as the locations within the code that
reference (PC-relatively) the jump tables and read-only data.
- Both the `wasm` and `compile` commands get this new subcommand, and it defaults to false. This means that test runs with `wasm` can request disassembly (the main reason I am doing this) while test runs with `compile` now must request it, this changes current behavior.
- Switch to using context.compile_and_emit directly, and make the reloc and trap printers just accumulate output, not print it. This allows us to factor the printing code into the disasm module.
Remove some unneeded functions, and remove the `GlobalInit` special case
for data and elem initializer offsets; implementations that want that
information can provide it for themselves.
* initial cargo fix run
* Upgrade cranelift-entity crate
* Upgrade bforest crate
* Upgrade the codegen crate
* Upgrade the faerie crate
* Upgrade the filetests crate
* Upgrade the codegen-meta crate
* Upgrade the frontend crate
* Upgrade the cranelift-module crate
* Upgrade the cranelift-native crate
* Upgrade the cranelift-preopt crate
* Upgrade the cranelift-reader crate
* Upgrade the cranelift-serde crate
* Upgrade the cranelift-simplejit crate
* Upgrade the cranelift or cranelift-umbrella crate
* Upgrade the cranelift-wasm crate
* Upgrade cranelift-tools crate
* Use new import style on remaining files
* run format-all.sh
* run test-all.sh, update Readme and travis ci configuration
fixed an AssertionError also
* Remove deprecated functions
* Introduce a `TargetFrontendConfig` type.
`TargetFrontendConfig` is information specific to the target which is
provided to frontends to allow them to produce Cranelift IR for the
target. Currently this includes the pointer size and the default calling
convention.
The default calling convention is now inferred from the target, rather
than being a setting. cranelift-native is now just a provider of target
information, rather than also being a provider of settings, which gives
it a clearer role.
And instead of having cranelift-frontend routines require the whole
`TargetIsa`, just require the `TargetFrontendConfig`, and add a way to
get the `TargetFrontendConfig` from a `Module`.
Fixes#529.
Fixes#555.
* Move `return_at_end` out of Settings and into the wasm FuncEnvironment.
The `return_at_end` flag supports users that want to append a custom
epilogue to Cranelift-produced functions. It arranges for functions to
always return via a single return statement at the end, and users are
expected to remove this return to append their code.
This patch makes two changes:
- First, introduce a `fallthrough_return` instruction and use that
instead of adding a `return` at the end. That's simpler than having
users remove the `return` themselves.
- Second, move this setting out of the Settings and into the wasm
FuncEnvironment. This flag isn't something the code generator uses,
it's something that the wasm translator uses. The code generator
needs to preserve the property, however we can give the
`fallthrough_return` instruction properties to ensure this as needed,
such as marking it non-cloneable.
This switches from a custom list of architectures to use the
target-lexicon crate.
- "set is_64bit=1; isa x86" is replaced with "target x86_64", and
similar for other architectures, and the `is_64bit` flag is removed
entirely.
- The `is_compressed` flag is removed too; it's no longer being used to
control REX prefixes on x86-64, ARM and Thumb are separate
architectures in target-lexicon, and we can figure out how to
select RISC-V compressed encodings when we're ready.