* Fix default architecture for winch
This updates the `winch/codegen/build.rs` script to default to the
target architecture being compiled for as opposed to the host
architecture that's performing the compile.
Closes#6241
* Auto-enable other future architectures
This commit improves ABI support in Winch's trampolines mainly by:
* Adding support for the `fastcall` calling convention.
* By storing/restoring callee-saved registers.
One of the explicit goals of this change is to make tests available in the x86_64 target
as a whole and remove the need exclude the windows target.
This commit also introduces a `CallingConvention` enum, to better
reflect the subset of calling conventions that are supported by Winch.
* Adding in trampoline compiling method for ISA
* Adding support for indirect call to memory address
* Refactoring frame to externalize defined locals, so it removes WASM depedencies in trampoline case
* Adding initial version of trampoline for testing
* Refactoring trampoline to be re-used by other architectures
* Initial wiring for winch with wasmtime
* Add a Wasmtime CLI option to select `winch`
This is effectively an option to select the `Strategy` enumeration.
* Implement `Compiler::compile_function` for Winch
Hook this into the `TargetIsa::compile_function` hook as well. Currently
this doesn't take into account `Tunables`, but that's left as a TODO for
later.
* Filling out Winch append_code method
* Adding back in changes from previous branch
Most of these are a WIP. It's missing trampolines for x64, but a basic
one exists for aarch64. It's missing the handling of arguments that
exist on the stack.
It currently imports `cranelift_wasm::WasmFuncType` since it's what's
passed to the `Compiler` trait. It's a bit awkward to use in the
`winch_codegen` crate since it mostly operates on `wasmparser` types.
I've had to hack in a conversion to get things working. Long term, I'm
not sure it's wise to rely on this type but it seems like it's easier on
the Cranelift side when creating the stub IR.
* Small API changes to make integration easier
* Adding in new FuncEnv, only a stub for now
* Removing unneeded parts of the old PoC, and refactoring trampoline code
* Moving FuncEnv into a separate file
* More comments for trampolines
* Adding in winch integration tests for first pass
* Using new addressing method to fix stack pointer error
* Adding test for stack arguments
* Only run tests on x86 for now, it's more complete for winch
* Add in missing documentation after rebase
* Updating based on feedback in draft PR
* Fixing formatting on doc comment for argv register
* Running formatting
* Lock updates, and turning on winch feature flags during tests
* Updating configuration with comments to no longer gate Strategy enum
* Using the winch-environ FuncEnv, but it required changing the sig
* Proper comment formatting
* Removing wasmtime-winch from dev-dependencies, adding the winch feature makes this not necessary
* Update doc attr to include winch check
* Adding winch feature to doc generation, which seems to fix the feature error in CI
* Add the `component-model` feature to the cargo doc invocation in CI
To match the metadata used by the docs.rs invocation when building docs.
* Add a comment clarifying the usage of `component-model` for docs.rs
* Correctly order wasmtime-winch and winch-environ in the publish script
* Ensure x86 test dependencies are included in cfg(target_arch)
* Further constrain Winch tests to x86_64 _and_ unix
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>
* winch(x64): Initial implementation for function calls
This change adds the main building blocks for calling locally defined
functions. Support for function imports will be added iteratively after this
change lands and once trampolines are supported.
To support function calls, this change introduces the following functionality to
the MacroAssembler:
* `pop` to pop the machine stack into a given register, which in the case of
this change, translates to the x64 pop instruction.
* `call` to a emit a call to locally defined functions.
* `address_from_sp` to construct memory addresses with the SP as a base.
* `free_stack` to emit the necessary instrunctions to claim stack space.
The heavy lifting of setting up and emitting the function call is done through
the implementation of `FnCall`.
* Fix spill behaviour in function calls and add more documentation
This commits adds a more detailed documentation to the `call.rs` module.
It also fixes a couple of bugs, mainly:
* The previous commit didn't account for memory addresses used as arguments for
the function call, any memory entry in the value stack used as a function
argument should be tracked and then used to claim that memory when the function
call ends. We could `pop` and do this implicitly, but we can also track this
down and emit a single instruction to decrement the stack pointer, which will
result in better code.
* Introduce a differentiator between addresses relative or absolute to the stack
pointer. When passing arguments in the stack -- assuming that SP at that point
is aligned for the function call -- we should store the arguments relative to
the absolute position of the stack pointer and when addressing a memory entry in
the Wasm value stack, we should use an address relative to the offset and the
position of the stack pointer.
* Simplify tracking of the stack space needed for emitting a function call
* Add a `MachBuffer::defer_trap` method
This commit adds a new method to `MachBuffer` to defer trap opcodes to
the end of a function in a similar manner to how constants are deferred
to the end of the function. This is useful for backends which frequently
use `TrapIf`-style opcodes. Currently a jump is emitted which skips the
next instruction, a trap, and then execution continues normally. While
there isn't any pressing problem with this construction the trap opcode
is in the middle of the instruction stream as opposed to "off on the
side" despite rarely being taken.
With this method in place all the backends (except riscv64 since I
couldn't figure it out easily enough) have a new lowering of their
`TrapIf` opcode. Now a trap is deferred, which returns a label, and then
that label is jumped to when executing the trap. A fixup is then
recorded in `MachBuffer` to get patched later on during emission, or at
the end of the function. Subsequently all `TrapIf` instructions
translate to a single branch plus a single trap at the end of the
function.
I've additionally further updated some more lowerings in the x64 backend
which were explicitly using traps to instead use `TrapIf` where
applicable to avoid jumping over traps mid-function. Other backends
didn't appear to have many jump-over-the-next-trap patterns.
Lots of tests have had their expectations updated here which should
reflect all the traps being sunk to the end of functions.
* Print trap code on all platforms
* Emit traps before constants
* Preserve source location information for traps
* Fix test expectations
* Attempt to fix s390x
The MachBuffer was registering trap codes with the first byte of the
trap, but the SIGILL handler was expecting it to be registered with the
last byte of the trap. Exploit that SIGILL is always represented with a
2-byte instruction and always march 2-backwards for SIGILL, continuing
to march backwards 1 byte for SIGFPE-generating instructions.
* Back out s390x changes
* Back out more s390x bits
* Review comments
* x64: Take SIGFPE signals for divide traps
Prior to this commit Wasmtime would configure `avoid_div_traps=true`
unconditionally for Cranelift. This, for the division-based
instructions, would change emitted code to explicitly trap on trap
conditions instead of letting the `div` x86 instruction trap.
There's no specific reason for Wasmtime, however, to specifically avoid
traps in the `div` instruction. This means that the extra generated
branches on x86 aren't necessary since the `div` and `idiv` instructions
already trap for similar conditions as wasm requires.
This commit instead disables the `avoid_div_traps` setting for
Wasmtime's usage of Cranelift. Subsequently the codegen rules were
updated slightly:
* When `avoid_div_traps=true`, traps are no longer emitted for `div`
instructions.
* The `udiv`/`urem` instructions now list their trap as divide-by-zero
instead of integer overflow.
* The lowering for `sdiv` was updated to still explicitly check for zero
but the integer overflow case is deferred to the instruction itself.
* The lowering of `srem` no longer checks for zero and the listed trap
for the `div` instruction is a divide-by-zero.
This means that the codegen for `udiv` and `urem` no longer have any
branches. The codegen for `sdiv` removes one branch but keeps the
zero-check to differentiate the two kinds of traps. The codegen for
`srem` removes one branch but keeps the -1 check since the semantics of
`srem` mismatch with the semantics of `idiv` with a -1 divisor
(specifically for INT_MIN).
This is unlikely to have really all that much of a speedup but was
something I noticed during #6008 which seemed like it'd be good to clean
up. Plus Wasmtime's signal handling was already set up to catch
`SIGFPE`, it was just never firing.
* Remove the `avoid_div_traps` cranelift setting
With no known users currently removing this should be possible and helps
simplify the x64 backend.
* x64: GC more support for avoid_div_traps
Remove the `validate_sdiv_divisor*` pseudo-instructions and clean up
some of the ISLE rules now that `div` is allowed to itself trap
unconditionally.
* x64: Store div trap code in instruction itself
* Keep divisors in registers, not in memory
Don't accidentally fold multiple traps together
* Handle EXC_ARITHMETIC on macos
* Update emit tests
* Update winch and tests
This commit introduces the `winch-environ` crate. This crate's responsibility is
to provide a shared implementatation of the `winch_codegen::FuncEnv` trait,
which is Winch's function compilation environment, used to resolve module and
runtime specific information needed by the code generation, such as resolving
all the details about a callee in a WebAssembly module, or resolving specific
information from the `VMContext`.
As of this change, the implementation only includes the necessary pieces to
resolve a function callee in a WebAssembly module. The idea is to evolve the
`winch_codegen::FuncEnv` trait as we evolve Winch's code generation.
* x64: Add precise-output tests for div traps
This adds a suite of `*.clif` files which are intended to test the
`avoid_div_traps=true` compilation of the `{s,u}{div,rem}` instructions.
* x64: Remove conditional regalloc in `Div` instruction
Move the 8-bit `Div` logic into a dedicated `Div8` instruction to avoid
having conditionally-used registers with respect to regalloc.
* x64: Migrate non-trapping, `udiv`/`urem` to ISLE
* x64: Port checked `udiv` to ISLE
* x64: Migrate urem entirely to ISLE
* x64: Use `test` instead of `cmp` to compare-to-zero
* x64: Port `sdiv` lowering to ISLE
* x64: Port `srem` lowering to ISLE
* Tidy up regalloc behavior and fix tests
* Update docs and winch
* Review comments
* Reword again
* More refactoring test fixes
* More test fixes
* Enable the native target by default in winch
Match cranelift-codegen's build script where if no architecture is
explicitly enabled then the host architecture is implicitly enabled.
* Refactor Cranelift's ISA builder to share more with Winch
This commit refactors the `Builder` type to have a type parameter
representing the finished ISA with Cranelift and Winch having their own
typedefs for `Builder` to represent their own builders. The intention is
to use this shared functionality to produce more shared code between the
two codegen backends.
* Moving compiler shared components to a separate crate
* Restore native flag inference in compiler building
This fixes an oversight from the previous commits to use
`cranelift-native` to infer flags for the native host when using default
settings with Wasmtime.
* Move `Compiler::page_size_align` into wasmtime-environ
The `cranelift-codegen` crate doesn't need this and winch wants the same
implementation, so shuffle it around so everyone has access to it.
* Fill out `Compiler::{flags, isa_flags}` for Winch
These are easy enough to plumb through with some shared code for
Wasmtime.
* Plumb the `is_branch_protection_enabled` flag for Winch
Just forwarding an isa-specific setting accessor.
* Moving executable creation to shared compiler crate
* Adding builder back in and removing from shared crate
* Refactoring the shared pieces for the `CompilerBuilder`
I decided to move a couple things around from Alex's initial changes.
Instead of having the shared builder do everything, I went back to
having each compiler have a distinct builder implementation. I
refactored most of the flag setting logic into a single shared location,
so we can still reduce the amount of code duplication.
With them being separate, we don't need to maintain things like
`LinkOpts` which Winch doesn't currently use. We also have an avenue to
error when certain flags are sent to Winch if we don't support them. I'm
hoping this will make things more maintainable as we build out Winch.
I'm still unsure about keeping everything shared in a single crate
(`cranelift_shared`). It's starting to feel like this crate is doing too
much, which makes it difficult to name. There does seem to be a need for
two distinct abstraction: creating the final executable and the handling
of shared/ISA flags when building the compiler. I could make them into
two separate crates, but there doesn't seem to be enough there yet to
justify it.
* Documentation updates, and renaming the finish method
* Adding back in a default temporarily to pass tests, and removing some unused imports
* Fixing winch tests with wrong method name
* Removing unused imports from codegen shared crate
* Apply documentation formatting updates
Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>
* Adding back in cranelift_native flag inferring
* Adding new shared crate to publish list
* Adding write feature to pass cargo check
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>
* Refactor the structure and responsibilities of `CodeGenContext`
This commit refactors how the `CodeGenContext` is used throughout the code
generation process, making it easier to pass it around when more flexibility is
desired in the MacroAssembler to perform the lowering of certain instructions.
As of this change, the responsibility of the `CodeGenContext` is to provide an
interface for operations that require an orchestration between the register
allocator, the value stack and function's frame. The MacroAssembler is removed
from the CodeGenContext as is passed as a dependency where needed, effectly
using it as an independent code generation interface only.
By giving more responsibilities to the `CodeGenContext` we can clearly separate
the concerns of the register allocator, which previously did more than it
should (e.g. popping values and spilling).
This change ultimately allows passing in the `CodeGenContext` to the
`MacroAssembler` when a given instruction cannot be generically described
through a common interface. Allowing each implementation to decide the best way
to lower a particular instruction.
* winch: Add support for the WebAssembly `<i32|i64>.div_*` instructions
Given that some architectures have very specific requirements on how to handle
division, this change uses `CodeGenContext` as a dependency to the `div`
MacroAssembler instruction to ensure that each implementation can decide on how to lower the
division. This approach also allows -- in architectures where division can be
expressed as an ordinary binary operation -- to rely on the
`CodeGenContext::i32_binop` or `CodeGenContext::i64_binop` helpers.
This patch adds complete support for the `sub` and `add` WebAssembly instructions
for x64, and complete support for the `add` WebAssembly instruction for aarch64.
This patch also refactors how the binary operations get constructed within the
`VisitOperator` trait implementation. The refactor adds methods in the
`CodeGenContext` to abstract all the common steps to emit binary operations,
making this process less repetitive and less brittle (e.g. omitting to push the resulting value
to the stack, or omitting to free registers after used).
This patch also improves test coverage and refactors the filetests directory to make it
easier to add tests for other instructions.
This commit fixes an incorrect usage of `func_type_at` to retrieve a defined
function signature and instead uses `function_at` to retrieve the signature.
Additionally it enhances `winch-tools` `compile` and `test` commands to handle
modules with multiple functions correctly.
This commit adds some missing conversions between Winch's x64 `Reg` type and
Cranelift's `Gpr`, `WritableGpr` and `GprMemImm`. This results in less
boilerplate. This is also a bit of groundwork in the assembler to support
the rest of the integer binary instructions.
This patch introduces basic aarch64 code generation by using
`cranelift-codegen`'s backend.
This commit *does not*:
* Change the semantics of the code generation
* Adds support for other Wasm instructions
The most notable change in this patch is how addressing modes are handled at the
MacroAssembler layer: instead of having a canonical address representation, this
patch introduces the addressing mode as an associated type in the
MacroAssembler trait. This approach has the advantage that gives each ISA enough
flexiblity to describe the addressing modes and their constraints in isolation
without having to worry on how a particular addressing mode is going to affect
other ISAs. In the case of Aarch64 this becomes useful to describe indexed
addressing modes (particularly from the stack pointer).
This patch uses the concept of a shadow stack pointer (x28) as a workaround to
Aarch64's stack pointer 16-byte alignment. This constraint is enforced by:
* Introducing specialized addressing modes when using the real stack pointer; this
enables auditing when the real stack pointer is used. As of this change, the
real stack pointer is only used in the function's prologue and epilogue.
* Asserting that the real stack pointer is not used as a base for addressing
modes.
* Ensuring that at any point during the code generation process where the stack
pointer changes (e.g. when stack space is allocated / deallocated) the value of
the real stack pointer is copied into the shadow stack pointer.
This commit contains a small set of clean up items for x64.
Notably:
* Adds filetests
* Documents why 16 for the arg base offset abi implementation, for clarity.
* Fixes a bug in the spill implementation caught while anlyzing the
filetests results. The fix consists of emitting a load instead of a store into
the scratch register before spiiling its value.
* Remove dead code for pretty printing registers which is not needed anymore
since we now have proper disassembly.
* Adding in the foundations for Winch `filetests`
This commit adds two new crates into the Winch workspace:
`filetests` and `test-macros`. The intent is to mimic the
structure of Cranelift `filetests`, but in a simpler way.
* Updates to documentation
This commits adds a high level document to outline how to test Winch
through the `winch-tools` utility. It also updates some inline
documentation which gets propagated to the CLI.
* Updating test-macro to use a glob instead of only a flat directory
* Add release notes for 3.0.1
* Update some version directives for crates in Wasmtime
* Mark anything with `publish = false` as version 0.0.0
* Mark the icache coherence crate with the same version as Wasmtime
* Fix manifest directives
This commit prepares the `winch` crate for updating `wasm-tools`,
notably changing a bit about how the visitation of operators works. This
moves the function body and wasm validator out of the `CodeGen`
structure and into parameters threaded into the emission of the actual
function.
Additionally the `VisitOperator` implementation was updated to remove
the explicit calls to the validator, favoring instead a macro-generated
solution to guarantee that all validation happens before any translation
proceeds. This means that the `VisitOperator for CodeGen` impl is now
infallible and the various methods have been inlined into the trait
methods as well as removing the `Result<_>`.
Finally this commit updates translation to call `validator.finish(..)`
which is required to perform the final validation steps of the function
body.
* Pull `Module` out of `ModuleTextBuilder`
This commit is the first in what will likely be a number towards
preparing for serializing a compiled component to bytes, a precompiled
artifact. To that end my rough plan is to merge all of the compiled
artifacts for a component into one large object file instead of having
lots of separate object files and lots of separate mmaps to manage. To
that end I plan on eventually using `ModuleTextBuilder` to build one
large text section for all core wasm modules and trampolines, meaning
that `ModuleTextBuilder` is no longer specific to one module. I've
extracted out functionality such as function name calculation as well as
relocation resolving (now a closure passed in) in preparation for this.
For now this just keeps tests passing, and the trajectory for this
should become more clear over the following commits.
* Remove component-specific object emission
This commit removes the `ComponentCompiler::emit_obj` function in favor
of `Compiler::emit_obj`, now renamed `append_code`. This involved
significantly refactoring code emission to take a flat list of functions
into `append_code` and the caller is responsible for weaving together
various "families" of functions and un-weaving them afterwards.
* Consolidate ELF parsing in `CodeMemory`
This commit moves the ELF file parsing and section iteration from
`CompiledModule` into `CodeMemory` so one location keeps track of
section ranges and such. This is in preparation for sharing much of this
code with components which needs all the same sections to get tracked
but won't be using `CompiledModule`. A small side benefit from this is
that the section parsing done in `CodeMemory` and `CompiledModule` is no
longer duplicated.
* Remove separately tracked traps in components
Previously components would generate an "always trapping" function
and the metadata around which pc was allowed to trap was handled
manually for components. With recent refactorings the Wasmtime-standard
trap section in object files is now being generated for components as
well which means that can be reused instead of custom-tracking this
metadata. This commit removes the manual tracking for the `always_trap`
functions and plumbs the necessary bits around to make components look
more like modules.
* Remove a now-unnecessary `Arc` in `Module`
Not expected to have any measurable impact on performance, but
complexity-wise this should make it a bit easier to understand the
internals since there's no longer any need to store this somewhere else
than its owner's location.
* Merge compilation artifacts of components
This commit is a large refactoring of the component compilation process
to produce a single artifact instead of multiple binary artifacts. The
core wasm compilation process is refactored as well to share as much
code as necessary with the component compilation process.
This method of representing a compiled component necessitated a few
medium-sized changes internally within Wasmtime:
* A new data structure was created, `CodeObject`, which represents
metadata about a single compiled artifact. This is then stored as an
`Arc` within a component and a module. For `Module` this is always
uniquely owned and represents a shuffling around of data from one
owner to another. For a `Component`, however, this is shared amongst
all loaded modules and the top-level component.
* The "module registry" which is used for symbolicating backtraces and
for trap information has been updated to account for a single region
of loaded code holding possibly multiple modules. This involved adding
a second-level `BTreeMap` for now. This will likely slow down
instantiation slightly but if it poses an issue in the future this
should be able to be represented with a more clever data structure.
This commit additionally solves a number of longstanding issues with
components such as compiling only one host-to-wasm trampoline per
signature instead of possibly once-per-module. Additionally the
`SignatureCollection` registration now happens once-per-component
instead of once-per-module-within-a-component.
* Fix compile errors from prior commits
* Support AOT-compiling components
This commit adds support for AOT-compiled components in the same manner
as `Module`, specifically adding:
* `Engine::precompile_component`
* `Component::serialize`
* `Component::deserialize`
* `Component::deserialize_file`
Internally the support for components looks quite similar to `Module`.
All the prior commits to this made adding the support here
(unsurprisingly) easy. Components are represented as a single object
file as are modules, and the functions for each module are all piled
into the same object file next to each other (as are areas such as data
sections). Support was also added here to quickly differentiate compiled
components vs compiled modules via the `e_flags` field in the ELF
header.
* Prevent serializing exported modules on components
The current representation of a module within a component means that the
implementation of `Module::serialize` will not work if the module is
exported from a component. The reason for this is that `serialize`
doesn't actually do anything and simply returns the underlying mmap as a
list of bytes. The mmap, however, has `.wasmtime.info` describing
component metadata as opposed to this module's metadata. While rewriting
this section could be implemented it's not so easy to do so and is
otherwise seen as not super important of a feature right now anyway.
* Fix windows build
* Fix an unused function warning
* Update crates/environ/src/compilation.rs
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Initial skeleton for Winch
This commit introduces the initial skeleton for Winch, the "baseline"
compiler.
This skeleton contains mostly setup code for the ISA, ABI, registers,
and compilation environment abstractions. It also includes the
calculation of function local slots.
As of this commit, the structure of these abstractions looks like the
following:
+------------------------+
| v
+----------+ +-----+ +-----------+-----+-----------------+
| Compiler | --> | ISA | --> | Registers | ABI | Compilation Env |
+----------+ +-----+ +-----------+-----+-----------------+
| ^
+------------------------------+
* Compilation environment will hold a reference to the function data
* Add basic documentation to the ABI trait
* Enable x86 and arm64 in cranelift-codegen
* Add reg_name function for x64
* Introduce the concept of a MacroAssembler and Assembler
This commit introduces the concept of a MacroAsesembler and
Assembler. The MacroAssembler trait will provide a high enough
interface across architectures so that each ISA implementation can use their own low-level
Assembler implementation to fulfill the interface. Each Assembler will
provide a 1-1 mapping to each ISA instruction.
As of this commit, only a partial debug implementation is provided for
the x64 Assembler.
* Add a newtype over PReg
Adds a newtype `Reg` over regalloc2::PReg; this ensures that Winch
will operate only on the concept of `Reg`. This change is temporary
until we have the necessary machinery to share a common Reg
abstraction via `cranelift_asm`
* Improvements to local calcuation
- Add `LocalSlot::addressed_from_sp`
- Use `u32` for local slot and local sizes calculation
* Add helper methods to ABIArg
Adds helper methods to retrieve register and type information from the argument
* Make locals_size public in frame
* Improve x64 register naming depending on size
* Add new methods to the masm interface
This commit introduces the ability for the MacroAssembler to reserve
stack space, get the address of a given local and perform a stack
store based on the concept of `Operand`s.
There are several motivating factors to introduce the concept of an
Operand:
- Make the translation between Winch and Cranelift easier;
- Make dispatching from the MacroAssembler to the underlying Assembler
- easier by minimizing the amount of functions that we need to define
- in order to satisfy the store/load combinations
This commit also introduces the concept of a memory address, which
essentially describes the addressing modes; as of this commit only one
addressing mode is supported. We'll also need to verify that this
structure will play nicely with arm64.
* Blank masm implementation for arm64
* Implementation of reserve_stack, local_address, store and fp_offset
for x64
* Implement function prologue and argument register spilling
* Add structopt and wat
* Fix debug instruction formatting
* Make TargetISA trait publicly accessible
* Modify the MacroAssembler finalize siganture to return a slice of strings
* Introduce a simple CLI for Winch
To be able to compile Wasm programs with Winch independently. Mostly
meant for testing / debugging
* Fix bug in x64 assembler mov_rm
* Remove unused import
* Move the stack slot calculation to the Frame
This commit moves the calculation of the stack slots to the frame
handler abstraction and also includes the calculation of the limits
for the function defined locals, which will be used to zero the locals
that are not associated to function arguments
* Add i32 and i64 constructors to local slots
* Introduce the concept of DefinedLocalsRange
This commit introduces `DefinedLocalsRange` to track the stack offset
at which the function-defined locals start and end; this is later used
to zero-out that stack region
* Add constructors for int and float registers
* Add a placeholder stack implementation
* Add a regset abstraction to track register availability
Adds a bit set abstraction to track register availability for register
allocation.
The bit set has no specific knowledge about physical registers, it
works on the register's hardware encoding as the source of truth.
Each RegSet is expected to be created with the universe of allocatable
registers per ISA when starting the compilation of a particular function.
* Add an abstraction over register and immediate
This is meant to be used as the source for stores.
* Add a way to zero local slots and an initial skeletion of regalloc
This commit introduces `zero_local_slots` to the MacroAssembler; which
ensures that function defined locals are zeroed out when starting the
function body.
The algorithm divides the defined function locals stack range
into 8 byte slots and stores a zero at each address. This process
relies on register allocation if the amount of slots that need to be
initialized is greater than 1. In such case, the next available
register is requested to the register set and it's used to store a 0,
which is then stored at every local slot
* Update to wasmparser 0.92
* Correctly track if the regset has registers available
* Add a result entry to the ABI signature
This commuit introduces ABIResult as part of the ABISignature;
this struct will track how function results are stored; initially it
will consiste of a single register that will be requested to the
register allocator at the end of the function; potentially causing a spill
* Move zero local slots and add more granular methods to the masm
This commit removes zeroing local slots from the MacroAssembler and
instead adds more granular methods to it (e.g `zero`, `add`).
This allows for better code sharing since most of the work done by the
algorithm for zeroing slots will be the same in all targets, except
for the binary emissions pieces, which is what gets delegated to the masm
* Use wasmparser's visitor API and add initial support for const and add
This commit adds initial support for the I32Const and I32
instructions; this involves adding a minimum for register
allocation. Note that some regalloc pieces are still incomplete, since
for the current set of supported instructions they are not needed.
* Make the ty field public in Local
* Add scratch_reg to the abi
* Add a method to get a particular local from the Frame
* Split the compilation environment abstraction
This commit splits the compilation environment into two more concise
abstractions:
1. CodeGen: the main abstraction for code generation
2. CodeGenContext: abstraction that shares the common pieces for
compilation; these pieces are shared between the code generator and
the register allocator
* Add `push` and `load` to the MacroAssembler
* Remove dead code warnings for unused paths
* Map ISA features to cranelift-codegen ISA features
* Apply formatting
* Fix Cargo.toml after a bad rebase
* Add component-compiler feature
* Use clap instead of structopt
* Add winch to publish.rs script
* Minor formatting
* Add tests to RegSet and fix two bugs when freeing and checking for
register availability
* Add tests to Stack
* Free source register after a non-constant i32 add
* Improve comments
- Remove unneeded comments
- And improve some of the TODO items
* Update default features
* Drop the ABI generic param and pass the word_size information directly
To avoid dealing with dead code warnings this commit passes the word
size information directly, since it's the only piece of information
needed from the ABI by Codegen until now
* Remove dead code
This piece of code will be put back once we start integrating Winch
with Wasmtime
* Remove unused enum variant
This variant doesn't get constructed; it should be added back once a
backend is added and not enabled by default or when Winch gets
integrated into Wasmtime
* Fix unused code in regset tests
* Update spec testsuite
* Switch the visitor pattern for a simpler operator match
This commit removes the usage of wasmparser's visitor pattern and
instead defaults to a simpler operator matching approach. This removes
the complexity of having to define all the visitor trait functions at once.
* Use wasmparser's Visitor trait with a different macro strategy
This commit puts back wasmparser's Visitor trait, with a sigle;
simpler macro, only used for unsupported operators.
* Restructure Winch
This commit restuructures Winch's parts. It divides the initial
approach into three main crates: `winch-codegen`,`wasmtime-winch` and `winch-tools`.
`wasmtime-winch` is reponsible for the Wasmtime-Winch integration.
`winch-codegen` is solely responsible for code generation.
`winch-tools` is CLI tool to compile Wasm programs, mainly for testing purposes.
* Refactor zero local slots
This commit moves the logic of zeroing local slots from the codegen
module into a method with a default implementation in the
MacroAssembler trait: `zero_mem_range`.
The refactored implementation is very similar to the previous
implementation with the only difference
that it doesn't allocates a general-purpose register; it instead uses
the register allocator to retrieve the scratch register and uses this
register to unroll the series of zero stores.
* Tie the codegen creation to the ISA ABI
This commit makes the relationship between the ISA ABI and the codegen
explicit. This allows us to pass down ABI-specific bit and pieces to
the codegeneration. In this case the only concrete piece that we need
is the ABI word size.
* Mark winch as publishable directory
* Revamp winch docs
This commit ensures that all the code comments in Winch are compliant
with the syle used in the rest of Wasmtime's codebase.
It also imptoves, generally the quality of the comments in some modules.
* Panic when using multi-value when the target is aarch64
Similar to x64, this commit ensures that the abi signature of the
current function doesn't use multi-value returns
* Document the usage of directives
* Use endianness instead of endianess in the ISA trait
* Introduce a three-argument form in the MacroAssembler
This commit introduces the usage of three-argument form for the
MacroAssembler interface. This allows for a natural mapping for
architectures like aarch64. In the case of x64, the implementation can
simply restrict the implementation asserting for equality in two of
the arguments of defaulting to a differnt set of instructions.
As of this commit, the implementation of `add` panics if the
destination and the first source arguments are not equal; internally
the x64 assembler implementation will ensure that all the allowed
combinations of `add` are satisfied. The reason for panicking and not
emitting a `mov` followed by an `add` for example is simply because register
allocation happens right before calling `add`, which ensures any
register-to-register moves, if needed.
This implementation will evolve in the future and this panic will be
lifted if needed.
* Improve the documentation for the MacroAssembler.
Documents the usage of three-arg form and the intention around the
high-level interface.
* Format comments in remaining modules
* Clean up Cargo.toml for winch pieces
This commit adds missing fields to each of Winch's Cargo.toml.
* Use `ModuleTranslation::get_types()` to derive the function type
* Assert that start range is always word-size aligned