- Added a filetest for the vcode output of lowering every handled FP opcode.
- Fixed two bugs that were discovered while going through the lowerings:
- Saturating FP->int operators would return `u{32,64}::MIN` rather than
`0` for a NaN input.
- `fcopysign` did not mask off the sign bit of the value whose sign is
overwritten.
These probably would have been caught by Wasm conformance tests soon
(and the validity of these lowerings will ultimately be tested this way)
but let's get them right by inspection, too!
- Undo temporary changes to default features (`all-arch`) and a
signal-handler test.
- Remove `SIGTRAP` handler: no longer needed now that we've found an
"undefined opcode" option on ARM64.
- Rename pp.rs to pretty_print.rs in machinst/.
- Only use empty stack-probe on non-x86. As per a comment in
rust-lang/compiler-builtins [1], LLVM only supports stack probes on
x86 and x86-64. Thus, on any other CPU architecture, we cannot refer
to `__rust_probestack`, because it does not exist.
- Rename arm64 to aarch64.
- Use `target` directive in vcode filetests.
- Run the flags verifier, but without encinfo, when using new backends.
- Clean up warning overrides.
- Fix up use of casts: use u32::from(x) and siblings when possible,
u32::try_from(x).unwrap() when not, to avoid silent truncation.
- Take immutable `Function` borrows as input; we don't actually
mutate the input IR.
- Lots of other miscellaneous cleanups.
[1] cae3e6ea23/src/probestack.rs (L39)
This patch ties together the new backend infrastructure with the
existing Cranelift codegen APIs.
With all patches in this series up to this patch applied, the ARM64
compiler is now functional and can be used. Two uses of this
functionality -- filecheck-based tests and integration into wasmtime --
will come in subsequent patches.
This patch adds the lowering implementation that translates Cranelift IR
(CLIF) function bodies to VCode<Inst>, i.e., ARM64 machine instructions.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
This patch also contains code written by Joey Gouly
<joey.gouly@arm.com> and contributed to the above branch. These
contributions are "Copyright (c) 2020, Arm Limited."
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
Co-authored-by: Joey Gouly <joey.gouly@arm.com>
This patch provides an ARM64 implementation of the ABI-related traits
required by the new backend infrasturcture. It will be used by the
lowering code, when that is in place in a subsequent patch.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
This patch also contains code written by Joey Gouly
<joey.gouly@arm.com> and contributed to the above branch. These
contributions are "Copyright (c) 2020, Arm Limited."
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
Co-authored-by: Joey Gouly <joey.gouly@arm.com>
This patch provides the bottom layer of the ARM64 backend: it defines
the `Inst` type, which represents a single machine instruction, and
defines emission routines to produce machine code from a `VCode`
container of `Insts`. The backend cannot produce `Inst`s with just this
patch; that will come with later parts.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
This patch also contains code written by Joey Gouly
<joey.gouly@arm.com> and contributed to the above branch. These
contributions are "Copyright (c) 2020, Arm Limited."
Finally, a contribution from Joey Gouly contains the following notice:
This is a port of VIXL's Assembler::IsImmLogical.
Arm has the original copyright on the VIXL code this was ported from
and is relicensing it under Apache 2 for Cranelift.
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
Co-authored-by: Joey Gouly <joey.gouly@arm.com>
This patch adds the MachInst, or Machine Instruction, infrastructure.
This is the machine-independent portion of the new backend design. It
contains the implementation of the "vcode" (virtual-registerized code)
container, the top-level lowering algorithm and compilation pipeline,
and the trait definitions that the machine backends will fill in.
This backend infrastructure is included in the compilation of the
`codegen` crate, but it is not yet tied into the public APIs; that patch
will come last, after all the other pieces are filled in.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
This removes the old ARM64 backend completely, leaving only an empty
`arm64` module. The tree at this state will not build with the `arm64`
feature enabled, but that feature has to be enabled explicitly (it is
not default). Subsequent patches will fill in the new backend.
- Add a `simple_legalize()` function that invokes a predetermined set of
legalizations, without depending on the details of the current
backend design. This will be used by the new backend pipeline.
- Separate out `has_side_effect()` from the DCE pass. This will be used
by the new backends' lowering code.
- Add documentation for the `Arm64Call` relocation type.
Preserve FPRs as required by the Windows fastcall calling convention.
This exposes an implementation limit due to Cranelift's approach to stack layout, which conflicts with expectations Windows makes in SEH layout - functions where the Cranelift user desires fastcall unwind information, that require preservation of an ABI-reserved FPR, that have a stack frame 240 bytes or larger, now produce an error when compiled. Several wasm spectests were disabled because they would trip this limit. This is a temporary constraint that should be fixed promptly.
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
This exposes the functionality of `fde::map_reg` on the `TargetIsa` trait, avoiding compilation errors on architectures where register mapping is not yet supported. The change is conditially compiled under the `unwind` feature.
* wasmtime: Pass around more contexts instead of fields
This commit refactors some wasmtime internals to pass around more
context-style structures rather than individual fields of each
structure. The intention here is to make the addition of fields to a
structure easier to plumb throughout the internals of wasmtime.
Currently you need to edit lots of functions to pass lots of parameters,
but ideally after this you'll only need to edit one or two struct fields
and then relevant locations have access to the information already.
Updates in this commit are:
* `debug_info` configuration is now folded into `Tunables`. Additionally
a `wasmtime::Config` now holds a `Tunables` directly and is passed
into an internal `Compiler`. Eventually this should allow for direct
configuration of the `Tunables` attributes from the `wasmtime` API,
but no new configuration is exposed at this time.
* `ModuleTranslation` is now passed around as a whole rather than
passing individual components to allow access to all the fields,
including `Tunables`.
This was motivated by investigating what it would take to optionally
allow loops and such to get interrupted, but that sort of codegen
setting was currently relatively difficult to plumb all the way through
and now it's hoped to be largely just an addition to `Tunables`.
* Fix lightbeam compile
* Wasmtime 0.15.0 and Cranelift 0.62.0. (#1398)
* Bump more ad-hoc versions.
* Add build.rs to wasi-common's Cargo.toml.
* Update the env var name in more places.
* Remove a redundant echo.
As explained in the added documentation and #1342, if we prevent `infer_rex()` and `w()` from being used together then we don't need to check whether the W bit is set when calculating the size of a recipe. This should improve compile time for x86 very slightly since all `infer_rex()` instructions will no longer need this check.
As explained in the added documentation and #1342, if we prevent `infer_rex()` and `w()` from being used together then we don't need to check whether the W bit is set when figuring out if a REX prefix is needed in `needs_rex()`. This should improve compile time for x86 very slightly since all `infer_rex()` instructions will no longer need this check.
In cranelift x86 encodings, it seemed unintuitive to specialize Templates with both `infer_rex()`` and `w()`: if `w()` is specified, the REX.W bit must be set so a REX prefix is alway required--no need to infer it. This change forces us to write `rex().w()``--it's more explicit and shows more clearly what cranelift will emit. This change also modifies the tests that expected DynRex recipes.
Both cranelift-codegen and wasmtime-debug need to map Cranelift registers to Gimli registers. Previously both crates had an almost-identical `map_reg` implementation. This change:
- removes the wasmtime-debug implementation
- improves the cranelift-codegen implementation with custom errors
- exposes map_reg in `cranelift_codegen::isa::fde::map_reg` and subsequently `wasmtime_environ::isa::fde::map_reg`
- Convert recipes to have necessary size calculator
- Add a missing binemit function, `put_dynrexmp3`
- Modify the meta-encodings of x86 SIMD instructions to use `infer_rex()`, mostly through the `enc_both_inferred()` helper
- Fix up tests that previously always emitted a REX prefix
Until #1306 is resolved (some spilling/regalloc issue with larger FPR register banks), this removes FPR32 support. Only Wasm's `i64x2.mul` was using this register class and that instruction is predicated on AVX512 support; for the time being, that instruction will have to make do with the 16 FPR registers.
This patch updates or removes all references to the Cranelift repository. It affects links in README documents, issues that were transferred to the Wasmtime repository, CI badges, and a small bunch of sundry items.
The EVEX encoding format (e.g. in AVX-512) allows addressing 32 registers instead of 16. The FPR register class currently defines 16 registers, `%xmm0`-`%xmm15`; that class is kept as-is with this change. A larger class, FPR32, is added as a super-class of FPR using a larger bank of registers, `%xmm0`-`%xmm31`.