This patch provides an ARM64 implementation of the ABI-related traits
required by the new backend infrasturcture. It will be used by the
lowering code, when that is in place in a subsequent patch.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
This patch also contains code written by Joey Gouly
<joey.gouly@arm.com> and contributed to the above branch. These
contributions are "Copyright (c) 2020, Arm Limited."
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
Co-authored-by: Joey Gouly <joey.gouly@arm.com>
This patch provides the bottom layer of the ARM64 backend: it defines
the `Inst` type, which represents a single machine instruction, and
defines emission routines to produce machine code from a `VCode`
container of `Insts`. The backend cannot produce `Inst`s with just this
patch; that will come with later parts.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
This patch also contains code written by Joey Gouly
<joey.gouly@arm.com> and contributed to the above branch. These
contributions are "Copyright (c) 2020, Arm Limited."
Finally, a contribution from Joey Gouly contains the following notice:
This is a port of VIXL's Assembler::IsImmLogical.
Arm has the original copyright on the VIXL code this was ported from
and is relicensing it under Apache 2 for Cranelift.
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
Co-authored-by: Joey Gouly <joey.gouly@arm.com>
This patch adds the MachInst, or Machine Instruction, infrastructure.
This is the machine-independent portion of the new backend design. It
contains the implementation of the "vcode" (virtual-registerized code)
container, the top-level lowering algorithm and compilation pipeline,
and the trait definitions that the machine backends will fill in.
This backend infrastructure is included in the compilation of the
`codegen` crate, but it is not yet tied into the public APIs; that patch
will come last, after all the other pieces are filled in.
This patch contains code written by Julian Seward <jseward@acm.org> and
Benjamin Bouvier <public@benj.me>, originally developed on a side-branch
before rebasing and condensing into this patch series. See the `arm64`
branch at `https://github.com/cfallin/wasmtime` for original development
history.
Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Benjamin Bouvier <public@benj.me>
This removes the old ARM64 backend completely, leaving only an empty
`arm64` module. The tree at this state will not build with the `arm64`
feature enabled, but that feature has to be enabled explicitly (it is
not default). Subsequent patches will fill in the new backend.
- Add a `simple_legalize()` function that invokes a predetermined set of
legalizations, without depending on the details of the current
backend design. This will be used by the new backend pipeline.
- Separate out `has_side_effect()` from the DCE pass. This will be used
by the new backends' lowering code.
- Add documentation for the `Arm64Call` relocation type.
Preserve FPRs as required by the Windows fastcall calling convention.
This exposes an implementation limit due to Cranelift's approach to stack layout, which conflicts with expectations Windows makes in SEH layout - functions where the Cranelift user desires fastcall unwind information, that require preservation of an ABI-reserved FPR, that have a stack frame 240 bytes or larger, now produce an error when compiled. Several wasm spectests were disabled because they would trip this limit. This is a temporary constraint that should be fixed promptly.
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
This exposes the functionality of `fde::map_reg` on the `TargetIsa` trait, avoiding compilation errors on architectures where register mapping is not yet supported. The change is conditially compiled under the `unwind` feature.
* wasmtime: Pass around more contexts instead of fields
This commit refactors some wasmtime internals to pass around more
context-style structures rather than individual fields of each
structure. The intention here is to make the addition of fields to a
structure easier to plumb throughout the internals of wasmtime.
Currently you need to edit lots of functions to pass lots of parameters,
but ideally after this you'll only need to edit one or two struct fields
and then relevant locations have access to the information already.
Updates in this commit are:
* `debug_info` configuration is now folded into `Tunables`. Additionally
a `wasmtime::Config` now holds a `Tunables` directly and is passed
into an internal `Compiler`. Eventually this should allow for direct
configuration of the `Tunables` attributes from the `wasmtime` API,
but no new configuration is exposed at this time.
* `ModuleTranslation` is now passed around as a whole rather than
passing individual components to allow access to all the fields,
including `Tunables`.
This was motivated by investigating what it would take to optionally
allow loops and such to get interrupted, but that sort of codegen
setting was currently relatively difficult to plumb all the way through
and now it's hoped to be largely just an addition to `Tunables`.
* Fix lightbeam compile
As explained in the added documentation and #1342, if we prevent `infer_rex()` and `w()` from being used together then we don't need to check whether the W bit is set when calculating the size of a recipe. This should improve compile time for x86 very slightly since all `infer_rex()` instructions will no longer need this check.
As explained in the added documentation and #1342, if we prevent `infer_rex()` and `w()` from being used together then we don't need to check whether the W bit is set when figuring out if a REX prefix is needed in `needs_rex()`. This should improve compile time for x86 very slightly since all `infer_rex()` instructions will no longer need this check.
Both cranelift-codegen and wasmtime-debug need to map Cranelift registers to Gimli registers. Previously both crates had an almost-identical `map_reg` implementation. This change:
- removes the wasmtime-debug implementation
- improves the cranelift-codegen implementation with custom errors
- exposes map_reg in `cranelift_codegen::isa::fde::map_reg` and subsequently `wasmtime_environ::isa::fde::map_reg`
- Convert recipes to have necessary size calculator
- Add a missing binemit function, `put_dynrexmp3`
- Modify the meta-encodings of x86 SIMD instructions to use `infer_rex()`, mostly through the `enc_both_inferred()` helper
- Fix up tests that previously always emitted a REX prefix
Until #1306 is resolved (some spilling/regalloc issue with larger FPR register banks), this removes FPR32 support. Only Wasm's `i64x2.mul` was using this register class and that instruction is predicated on AVX512 support; for the time being, that instruction will have to make do with the 16 FPR registers.
The EVEX encoding format (e.g. in AVX-512) allows addressing 32 registers instead of 16. The FPR register class currently defines 16 registers, `%xmm0`-`%xmm15`; that class is kept as-is with this change. A larger class, FPR32, is added as a super-class of FPR using a larger bank of registers, `%xmm0`-`%xmm31`.
With this change, register banks can now be re-ordered and other components (e.g. unwinding, regalloc) will no longer break. The previous behavior assumed that GPR registers always started at `RegUnit` 0.
Update the documentation for the merger, and also for various changes in
Cranelift. Remove some old obsolete documentation, and convert the remaining
Sphinx files to Markdown. Some of the remaining content is still out of
date, but this is a step forward.
This is a rebase of [1]. In the long term, we'll want to simplify these
analysis passes. For now, this is simple and will reduce the number of
instructions processed in certain cases.
[1] https://github.com/bytecodealliance/cranelift/pull/866
* Manually rename BasicBlock to BlockPredecessor
BasicBlock is a pair of (Ebb, Inst) that is used to represent the
basic block subcomponent of an Ebb that is a predecessor to an Ebb.
Eventually we will be able to remove this struct, but for now it
makes sense to give it a non-conflicting name so that we can start
to transition Ebb to represent a basic block.
I have not updated any comments that refer to BasicBlock, as
eventually we will remove BlockPredecessor and replace with Block,
which is a basic block, so the comments will become correct.
* Manually rename SSABuilder block types to avoid conflict
SSABuilder has its own Block and BlockData types. These along with
associated identifier will cause conflicts in a later commit, so
they are renamed to be more verbose here.
* Automatically rename 'Ebb' to 'Block' in *.rs
* Automatically rename 'EBB' to 'block' in *.rs
* Automatically rename 'ebb' to 'block' in *.rs
* Automatically rename 'extended basic block' to 'basic block' in *.rs
* Automatically rename 'an basic block' to 'a basic block' in *.rs
* Manually update comment for `Block`
`Block`'s wikipedia article required an update.
* Automatically rename 'an `Block`' to 'a `Block`' in *.rs
* Automatically rename 'extended_basic_block' to 'basic_block' in *.rs
* Automatically rename 'ebb' to 'block' in *.clif
* Manually rename clif constant that contains 'ebb' as substring to avoid conflict
* Automatically rename filecheck uses of 'EBB' to 'BB'
'regex: EBB' -> 'regex: BB'
'$EBB' -> '$BB'
* Automatically rename 'EBB' 'Ebb' to 'block' in *.clif
* Automatically rename 'an block' to 'a block' in *.clif
* Fix broken testcase when function name length increases
Test function names are limited to 16 characters. This causes
the new longer name to be truncated and fail a filecheck test. An
outdated comment was also fixed.
* All: Drop 'basic-blocks' feature
This makes it so that 'basic-blocks' cannot be disabled and we can
start assuming it everywhere.
* Tests: Replace non-bb filetests with bb version
* Tests: Adapt solver-fixedconflict filetests to use basic blocks
This commit aligns the representation of stackmaps to be the same
as Spidermonkey's by:
* Reversing the order of the bitmap from low addresses to high addresses
* Including incoming stack arguments
* Excluding outgoing stack arguments
Additionally, some accessor functions were added to allow Spidermonkey
to access the internals of the bitmap.
Previously, the assertion checked for `lane > 0` when it should have been `lane >= 0`; since lane is unsigned, this half of the assertion can be entirely removed.
* Use `is_wasm_parameter` in translating wasm calls
Added in #1329 it's now possible for multiple parameters to be non-wasm
parameters, so the previous `param_types` method is no longer suitable
for acquiring all wasm-related parameters, rather then `FuncEnvironment`
must be consulted. This removes usage of `param_types()` as a method
from the wasm translation and instead adds a custom method inline for
filtering the parameters based on `is_wasm_parameter`.
* Apply feedback
* Run rustfmt
* Don't require `mut`
* Run rustfmt
* Correctly count the number of wasm parameters.
Following up on #1329, this further replaces `num_normal_params` with a function
which calls `is_wasm_parameter` to correctly count the number of wasm
parameters a function has.
* Move is_wasm_parameter's implementation into the trait.
This is a breaking API change: the following settings have been renamed:
- jump_tables_enabled -> enable_jump_tables
- colocated_libcalls -> use_colocated_libcalls
- probestack_enabled -> enable_probestack
- allones_funcaddrs -> emit_all_ones_funcaddrs