wasmtime

Author	SHA1	Message	Date
Saúl Cabrera	9dd0b59c2a	winch(x64): Improve ABI support in trampolines (#6204 ) This commit improves ABI support in Winch's trampolines mainly by: * Adding support for the `fastcall` calling convention. * Storing/restoring callee-saved registers. One of the explicit goals of this change is to make tests available in the x86_64 target as a whole and remove the need to exclude the Windows target. This commit also introduces a `CallingConvention` enum, to better reflect the subset of calling conventions that are supported by Winch.	2023-04-14 21:13:23 +00:00
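As a rough illustration of the `CallingConvention` idea in the commit above, the following is a minimal sketch, assuming an enum with one variant per supported convention. The variant and method names are hypothetical and are not taken from Winch's source; the register lists follow the published System V AMD64 and Windows x64 conventions.

```rust
/// Hypothetical subset of calling conventions. The variant names are
/// illustrative only and are not copied from Winch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallingConvention {
    SystemV,
    WindowsFastcall,
}

impl CallingConvention {
    /// Integer argument registers, in the order they are assigned.
    pub fn int_arg_regs(self) -> &'static [&'static str] {
        match self {
            CallingConvention::SystemV => &["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
            CallingConvention::WindowsFastcall => &["rcx", "rdx", "r8", "r9"],
        }
    }

    /// General-purpose callee-saved registers (excluding rsp) that a
    /// trampoline would have to store on entry and restore on exit.
    pub fn callee_saved_gprs(self) -> &'static [&'static str] {
        match self {
            CallingConvention::SystemV => &["rbx", "rbp", "r12", "r13", "r14", "r15"],
            CallingConvention::WindowsFastcall => {
                &["rbx", "rbp", "rdi", "rsi", "r12", "r13", "r14", "r15"]
            }
        }
    }
}
```

Note that `rdi` and `rsi` are argument registers under System V but callee-saved under the Windows convention, which is one reason a trampoline cannot treat the two conventions interchangeably.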
Kevin Rizzo	3a92aa3d0a	winch: Initial integration with wasmtime (#6119 ) * Adding in trampoline compiling method for ISA * Adding support for indirect call to memory address * Refactoring frame to externalize defined locals, so it removes WASM depedencies in trampoline case * Adding initial version of trampoline for testing * Refactoring trampoline to be re-used by other architectures * Initial wiring for winch with wasmtime * Add a Wasmtime CLI option to select `winch` This is effectively an option to select the `Strategy` enumeration. * Implement `Compiler::compile_function` for Winch Hook this into the `TargetIsa::compile_function` hook as well. Currently this doesn't take into account `Tunables`, but that's left as a TODO for later. * Filling out Winch append_code method * Adding back in changes from previous branch Most of these are a WIP. It's missing trampolines for x64, but a basic one exists for aarch64. It's missing the handling of arguments that exist on the stack. It currently imports `cranelift_wasm::WasmFuncType` since it's what's passed to the `Compiler` trait. It's a bit awkward to use in the `winch_codegen` crate since it mostly operates on `wasmparser` types. I've had to hack in a conversion to get things working. Long term, I'm not sure it's wise to rely on this type but it seems like it's easier on the Cranelift side when creating the stub IR. 
* Small API changes to make integration easier * Adding in new FuncEnv, only a stub for now * Removing unneeded parts of the old PoC, and refactoring trampoline code * Moving FuncEnv into a separate file * More comments for trampolines * Adding in winch integration tests for first pass * Using new addressing method to fix stack pointer error * Adding test for stack arguments * Only run tests on x86 for now, it's more complete for winch * Add in missing documentation after rebase * Updating based on feedback in draft PR * Fixing formatting on doc comment for argv register * Running formatting * Lock updates, and turning on winch feature flags during tests * Updating configuration with comments to no longer gate Strategy enum * Using the winch-environ FuncEnv, but it required changing the sig * Proper comment formatting * Removing wasmtime-winch from dev-dependencies, adding the winch feature makes this not necessary * Update doc attr to include winch check * Adding winch feature to doc generation, which seems to fix the feature error in CI * Add the `component-model` feature to the cargo doc invocation in CI To match the metadata used by the docs.rs invocation when building docs. * Add a comment clarifying the usage of `component-model` for docs.rs * Correctly order wasmtime-winch and winch-environ in the publish script * Ensure x86 test dependencies are included in cfg(target_arch) * Further constrain Winch tests to x86_64 _and_ unix --------- Co-authored-by: Alex Crichton <alex@alexcrichton.com> Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>	2023-04-05 00:32:40 +00:00
Saúl Cabrera	af4d94c85a	winch(x64): Initial implementation for function calls (#6067 ) * winch(x64): Initial implementation for function calls This change adds the main building blocks for calling locally defined functions. Support for function imports will be added iteratively after this change lands and once trampolines are supported. To support function calls, this change introduces the following functionality to the MacroAssembler: * `pop` to pop the machine stack into a given register, which in the case of this change, translates to the x64 pop instruction. * `call` to emit a call to locally defined functions. * `address_from_sp` to construct memory addresses with the SP as a base. * `free_stack` to emit the necessary instructions to claim stack space. The heavy lifting of setting up and emitting the function call is done through the implementation of `FnCall`. * Fix spill behaviour in function calls and add more documentation This commit adds more detailed documentation to the `call.rs` module. It also fixes a couple of bugs, mainly: * The previous commit didn't account for memory addresses used as arguments for the function call; any memory entry in the value stack used as a function argument should be tracked and then used to claim that memory when the function call ends. We could `pop` and do this implicitly, but we can also track this down and emit a single instruction to decrement the stack pointer, which will result in better code. * Introduce a differentiator between addresses relative or absolute to the stack pointer. When passing arguments in the stack -- assuming that SP at that point is aligned for the function call -- we should store the arguments relative to the absolute position of the stack pointer and when addressing a memory entry in the Wasm value stack, we should use an address relative to the offset and the position of the stack pointer.
* Simplify tracking of the stack space needed for emitting a function call	2023-03-28 18:30:31 +00:00
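The distinction this commit draws between SP-relative and SP-absolute addresses can be sketched as follows. This is a simplified model under stated assumptions, not Winch's implementation: `StackTracker`, `address_at_sp` and `sp_offset` are hypothetical names, and only `address_from_sp` and `free_stack` echo names mentioned in the commit message.

```rust
/// Hypothetical model of a macro-assembler that tracks how far the stack
/// pointer has moved since function entry. Names are illustrative only.
#[derive(Debug, Default)]
pub struct StackTracker {
    /// Bytes currently pushed since function entry.
    sp_offset: u32,
}

impl StackTracker {
    /// Push `size` bytes and return the SP offset that identifies the slot.
    pub fn push(&mut self, size: u32) -> u32 {
        self.sp_offset += size;
        self.sp_offset
    }

    /// Release `bytes` of stack space in one adjustment, instead of
    /// popping each memory entry individually.
    pub fn free_stack(&mut self, bytes: u32) {
        assert!(bytes <= self.sp_offset, "cannot free more than was pushed");
        self.sp_offset -= bytes;
    }

    /// Displacement from the current SP to a value-stack memory entry that
    /// was created when the SP offset was `slot_offset` (relative address).
    pub fn address_from_sp(&self, slot_offset: u32) -> u32 {
        assert!(slot_offset <= self.sp_offset, "slot is no longer live");
        self.sp_offset - slot_offset
    }

    /// Displacement for an outgoing stack argument: used as-is from the
    /// current SP, regardless of how much has been pushed (absolute address).
    pub fn address_at_sp(&self, offset: u32) -> u32 {
        offset
    }

    /// Current distance between SP and its position at function entry.
    pub fn sp_offset(&self) -> u32 {
        self.sp_offset
    }
}
```

In this model, two 8-byte pushes leave the first entry at displacement 8 and the second at displacement 0 from SP, and a single `free_stack(16)` releases both.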
Kevin Rizzo	013b35ff32	winch: Refactoring wasmtime compiler integration pieces to share more between Cranelift and Winch (#5944 ) * Enable the native target by default in winch Match cranelift-codegen's build script where if no architecture is explicitly enabled then the host architecture is implicitly enabled. * Refactor Cranelift's ISA builder to share more with Winch This commit refactors the `Builder` type to have a type parameter representing the finished ISA with Cranelift and Winch having their own typedefs for `Builder` to represent their own builders. The intention is to use this shared functionality to produce more shared code between the two codegen backends. * Moving compiler shared components to a separate crate * Restore native flag inference in compiler building This fixes an oversight from the previous commits to use `cranelift-native` to infer flags for the native host when using default settings with Wasmtime. * Move `Compiler::page_size_align` into wasmtime-environ The `cranelift-codegen` crate doesn't need this and winch wants the same implementation, so shuffle it around so everyone has access to it. * Fill out `Compiler::{flags, isa_flags}` for Winch These are easy enough to plumb through with some shared code for Wasmtime. * Plumb the `is_branch_protection_enabled` flag for Winch Just forwarding an isa-specific setting accessor. * Moving executable creation to shared compiler crate * Adding builder back in and removing from shared crate * Refactoring the shared pieces for the `CompilerBuilder` I decided to move a couple things around from Alex's initial changes. Instead of having the shared builder do everything, I went back to having each compiler have a distinct builder implementation. I refactored most of the flag setting logic into a single shared location, so we can still reduce the amount of code duplication. With them being separate, we don't need to maintain things like `LinkOpts` which Winch doesn't currently use. 
We also have an avenue to error when certain flags are sent to Winch if we don't support them. I'm hoping this will make things more maintainable as we build out Winch. I'm still unsure about keeping everything shared in a single crate (`cranelift_shared`). It's starting to feel like this crate is doing too much, which makes it difficult to name. There does seem to be a need for two distinct abstraction: creating the final executable and the handling of shared/ISA flags when building the compiler. I could make them into two separate crates, but there doesn't seem to be enough there yet to justify it. * Documentation updates, and renaming the finish method * Adding back in a default temporarily to pass tests, and removing some unused imports * Fixing winch tests with wrong method name * Removing unused imports from codegen shared crate * Apply documentation formatting updates Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com> * Adding back in cranelift_native flag inferring * Adding new shared crate to publish list * Adding write feature to pass cargo check --------- Co-authored-by: Alex Crichton <alex@alexcrichton.com> Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>	2023-03-08 15:07:13 +00:00
Saúl Cabrera	7ec925122d	winch: Add support for the `<i32|i64>.div_` instructions (#5807 ) Refactor the structure and responsibilities of `CodeGenContext` This commit refactors how the `CodeGenContext` is used throughout the code generation process, making it easier to pass it around when more flexibility is desired in the MacroAssembler to perform the lowering of certain instructions. As of this change, the responsibility of the `CodeGenContext` is to provide an interface for operations that require an orchestration between the register allocator, the value stack and the function's frame. The MacroAssembler is removed from the CodeGenContext and is passed as a dependency where needed, effectively using it as an independent code generation interface only. By giving more responsibilities to the `CodeGenContext` we can clearly separate the concerns of the register allocator, which previously did more than it should (e.g. popping values and spilling). This change ultimately allows passing in the `CodeGenContext` to the `MacroAssembler` when a given instruction cannot be generically described through a common interface, allowing each implementation to decide the best way to lower a particular instruction. * winch: Add support for the WebAssembly `<i32|i64>.div_*` instructions Given that some architectures have very specific requirements on how to handle division, this change uses `CodeGenContext` as a dependency to the `div` MacroAssembler instruction to ensure that each implementation can decide on how to lower the division. This approach also allows -- in architectures where division can be expressed as an ordinary binary operation -- to rely on the `CodeGenContext::i32_binop` or `CodeGenContext::i64_binop` helpers.	2023-02-17 22:42:03 +00:00
Saúl Cabrera	426c49b8e3	winch: Use aarch64 backend for code emission. (#5652 ) This patch introduces basic aarch64 code generation by using `cranelift-codegen`'s backend. This commit does not: * Change the semantics of the code generation * Add support for other Wasm instructions The most notable change in this patch is how addressing modes are handled at the MacroAssembler layer: instead of having a canonical address representation, this patch introduces the addressing mode as an associated type in the MacroAssembler trait. This approach has the advantage that it gives each ISA enough flexibility to describe the addressing modes and their constraints in isolation without having to worry about how a particular addressing mode is going to affect other ISAs. In the case of Aarch64 this becomes useful to describe indexed addressing modes (particularly from the stack pointer). This patch uses the concept of a shadow stack pointer (x28) as a workaround to Aarch64's stack pointer 16-byte alignment. This constraint is enforced by: * Introducing specialized addressing modes when using the real stack pointer; this enables auditing when the real stack pointer is used. As of this change, the real stack pointer is only used in the function's prologue and epilogue. * Asserting that the real stack pointer is not used as a base for addressing modes. * Ensuring that at any point during the code generation process where the stack pointer changes (e.g. when stack space is allocated / deallocated) the value of the real stack pointer is copied into the shadow stack pointer.	2023-02-02 14:24:11 -08:00
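The "addressing mode as an associated type" design described in the commit above might look roughly like the sketch below. All type, variant and method names here are assumptions made for illustration; they are not Winch's actual definitions.

```rust
use std::fmt::Debug;

/// Hypothetical, heavily reduced version of a MacroAssembler trait in which
/// each ISA supplies its own address representation.
pub trait MacroAssembler {
    /// ISA-specific addressing mode.
    type Address: Debug + PartialEq;

    /// Address of a local stored `offset` bytes from the stack pointer.
    fn local_address(&self, offset: u32) -> Self::Address;
}

/// x64 only needs a base register plus displacement in this sketch.
#[derive(Debug, PartialEq)]
pub struct X64Address {
    pub base: &'static str,
    pub offset: u32,
}

/// aarch64 distinguishes the shadow stack pointer from the real one, so
/// that uses of the real SP are explicit and easy to audit.
#[derive(Debug, PartialEq)]
pub enum Aarch64Address {
    /// Offset from the shadow stack pointer (x28); used for ordinary accesses.
    FromShadowSp { offset: u32 },
    /// Access through the real SP; reserved for prologue and epilogue here.
    FromRealSp { offset: u32 },
}

pub struct X64Masm;
pub struct Aarch64Masm;

impl MacroAssembler for X64Masm {
    type Address = X64Address;

    fn local_address(&self, offset: u32) -> X64Address {
        X64Address { base: "rsp", offset }
    }
}

impl MacroAssembler for Aarch64Masm {
    type Address = Aarch64Address;

    fn local_address(&self, offset: u32) -> Aarch64Address {
        // Locals are never addressed through the real SP in this sketch.
        Aarch64Address::FromShadowSp { offset }
    }
}
```

Because `Address` is chosen per implementation, adding an aarch64-only variant does not force any change on the x64 side.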
Saúl Cabrera	94b51cdb17	winch: Use cranelift-codegen x64 backend for emission. (#5581 ) This change substitutes the string based emission mechanism with cranelift-codegen's x64 backend. This change _does not_: * Introduce new functionality in terms of supported instructions. * Change the semantics of the assembler/macroassembler in terms of the logic to emit instructions. The most notable differences between this change and the previous version are: * Handling of shared flags and ISA-specific flags, which for now are left with the default value. * Simplification of instruction emission per operand size: previously the assembler defined different methods depending on the operand size (e.g. `mov` for 64 bits, and `movl` for 32 bits). This change updates such approach so that each assembler method takes an operand size as a parameter, reducing duplication and making the code more concise and easier to integrate with the x64's `Inst` enum. * Introduction of a disassembler for testing purposes. 
As of this change, Winch generates the following code for the following test programs:

```wat
(module
  (export "main" (func $main))
  (func $main (result i32)
    (i32.const 10)
    (i32.const 20)
    i32.add
  ))
```

```asm
 0: 55                       push rbp
 1: 48 89 e5                 mov rbp, rsp
 4: b8 0a 00 00 00           mov eax, 0xa
 9: 83 c0 14                 add eax, 0x14
 c: 5d                       pop rbp
 d: c3                       ret
```

```wat
(module
  (export "main" (func $main))
  (func $main (result i32)
    (local $foo i32)
    (local $bar i32)
    (i32.const 10)
    (local.set $foo)
    (i32.const 20)
    (local.set $bar)
    (local.get $foo)
    (local.get $bar)
    i32.add
  ))
```

```asm
 0: 55                       push rbp
 1: 48 89 e5                 mov rbp, rsp
 4: 48 83 ec 08              sub rsp, 8
 8: 48 c7 04 24 00 00 00 00  mov qword ptr [rsp], 0
10: b8 0a 00 00 00           mov eax, 0xa
15: 89 44 24 04              mov dword ptr [rsp + 4], eax
19: b8 14 00 00 00           mov eax, 0x14
1e: 89 04 24                 mov dword ptr [rsp], eax
21: 8b 04 24                 mov eax, dword ptr [rsp]
24: 8b 4c 24 04              mov ecx, dword ptr [rsp + 4]
28: 01 c1                    add ecx, eax
2a: 48 89 c8                 mov rax, rcx
2d: 48 83 c4 08              add rsp, 8
31: 5d                       pop rbp
32: c3                       ret
```

```wat
(module
  (export "main" (func $main))
  (func $main (param i32) (param i32) (result i32)
    (local.get 0)
    (local.get 1)
    i32.add
  ))
```

```asm
 0: 55                       push rbp
 1: 48 89 e5                 mov rbp, rsp
 4: 48 83 ec 08              sub rsp, 8
 8: 89 7c 24 04              mov dword ptr [rsp + 4], edi
 c: 89 34 24                 mov dword ptr [rsp], esi
 f: 8b 04 24                 mov eax, dword ptr [rsp]
12: 8b 4c 24 04              mov ecx, dword ptr [rsp + 4]
16: 01 c1                    add ecx, eax
18: 48 89 c8                 mov rax, rcx
1b: 48 83 c4 08              add rsp, 8
1f: 5d                       pop rbp
20: c3                       ret
```
	2023-01-18 06:58:13 -05:00
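The "one assembler method, parameterized by operand size" simplification described in this commit can be sketched as below. This is an illustrative encoder written from the x64 instruction encoding rules, not Winch's or Cranelift's code; `OperandSize` and `mov_imm` are hypothetical names, and only the eight low registers are handled.

```rust
/// Operand size passed as a parameter instead of being baked into the
/// method name (e.g. separate `mov` and `movl` methods).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OperandSize {
    S32,
    S64,
}

/// Encode `mov reg, imm32` for one of the eight low registers
/// (0 = rax/eax, 1 = rcx/ecx, ...). Hypothetical helper for illustration.
pub fn mov_imm(reg: u8, imm: i32, size: OperandSize) -> Vec<u8> {
    assert!(reg < 8, "this sketch only handles registers that need no REX.B bit");
    let mut bytes = match size {
        // B8+rd id: mov r32, imm32
        OperandSize::S32 => vec![0xb8 + reg],
        // REX.W C7 /0 id: mov r/m64, sign-extended imm32
        OperandSize::S64 => vec![0x48, 0xc7, 0xc0 + reg],
    };
    // Immediates are stored little-endian.
    bytes.extend_from_slice(&imm.to_le_bytes());
    bytes
}
```

For the 32-bit case with `reg = 0` and `imm = 10`, this yields `b8 0a 00 00 00`, matching the `mov eax, 0xa` bytes in the disassembly quoted in the commit message.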
Alex Crichton	3b9668558f	winch: Prepare for an update to the `wasm-tools` crates (#5238 ) This commit prepares the `winch` crate for updating `wasm-tools`, notably changing a bit about how the visitation of operators works. This moves the function body and wasm validator out of the `CodeGen` structure and into parameters threaded into the emission of the actual function. Additionally the `VisitOperator` implementation was updated to remove the explicit calls to the validator, favoring instead a macro-generated solution to guarantee that all validation happens before any translation proceeds. This means that the `VisitOperator for CodeGen` impl is now infallible and the various methods have been inlined into the trait methods as well as removing the `Result<_>`. Finally this commit updates translation to call `validator.finish(..)` which is required to perform the final validation steps of the function body.	2022-11-10 14:01:42 -06:00
Saúl Cabrera	835abbcd11	Initial skeleton for Winch (#4907 ) * Initial skeleton for Winch This commit introduces the initial skeleton for Winch, the "baseline" compiler. This skeleton contains mostly setup code for the ISA, ABI, registers, and compilation environment abstractions. It also includes the calculation of function local slots. As of this commit, the structure of these abstractions looks like the following: +------------------------+ \| v +----------+ +-----+ +-----------+-----+-----------------+ \| Compiler \| --> \| ISA \| --> \| Registers \| ABI \| Compilation Env \| +----------+ +-----+ +-----------+-----+-----------------+ \| ^ +------------------------------+ * Compilation environment will hold a reference to the function data * Add basic documentation to the ABI trait * Enable x86 and arm64 in cranelift-codegen * Add reg_name function for x64 * Introduce the concept of a MacroAssembler and Assembler This commit introduces the concept of a MacroAsesembler and Assembler. The MacroAssembler trait will provide a high enough interface across architectures so that each ISA implementation can use their own low-level Assembler implementation to fulfill the interface. Each Assembler will provide a 1-1 mapping to each ISA instruction. As of this commit, only a partial debug implementation is provided for the x64 Assembler. * Add a newtype over PReg Adds a newtype `Reg` over regalloc2::PReg; this ensures that Winch will operate only on the concept of `Reg`. 
This change is temporary until we have the necessary machinery to share a common Reg abstraction via `cranelift_asm` * Improvements to local calculation - Add `LocalSlot::addressed_from_sp` - Use `u32` for local slot and local sizes calculation * Add helper methods to ABIArg Adds helper methods to retrieve register and type information from the argument * Make locals_size public in frame * Improve x64 register naming depending on size * Add new methods to the masm interface This commit introduces the ability for the MacroAssembler to reserve stack space, get the address of a given local and perform a stack store based on the concept of `Operand`s. There are several motivating factors to introduce the concept of an Operand: - Make the translation between Winch and Cranelift easier; - Make dispatching from the MacroAssembler to the underlying Assembler easier by minimizing the number of functions that we need to define in order to satisfy the store/load combinations This commit also introduces the concept of a memory address, which essentially describes the addressing modes; as of this commit only one addressing mode is supported. We'll also need to verify that this structure will play nicely with arm64. * Blank masm implementation for arm64 * Implementation of reserve_stack, local_address, store and fp_offset for x64 * Implement function prologue and argument register spilling * Add structopt and wat * Fix debug instruction formatting * Make TargetISA trait publicly accessible * Modify the MacroAssembler finalize signature to return a slice of strings * Introduce a simple CLI for Winch To be able to compile Wasm programs with Winch independently. 
Mostly meant for testing / debugging * Fix bug in x64 assembler mov_rm * Remove unused import * Move the stack slot calculation to the Frame This commit moves the calculation of the stack slots to the frame handler abstraction and also includes the calculation of the limits for the function defined locals, which will be used to zero the locals that are not associated to function arguments * Add i32 and i64 constructors to local slots * Introduce the concept of DefinedLocalsRange This commit introduces `DefinedLocalsRange` to track the stack offset at which the function-defined locals start and end; this is later used to zero-out that stack region * Add constructors for int and float registers * Add a placeholder stack implementation * Add a regset abstraction to track register availability Adds a bit set abstraction to track register availability for register allocation. The bit set has no specific knowledge about physical registers, it works on the register's hardware encoding as the source of truth. Each RegSet is expected to be created with the universe of allocatable registers per ISA when starting the compilation of a particular function. * Add an abstraction over register and immediate This is meant to be used as the source for stores. * Add a way to zero local slots and an initial skeletion of regalloc This commit introduces `zero_local_slots` to the MacroAssembler; which ensures that function defined locals are zeroed out when starting the function body. The algorithm divides the defined function locals stack range into 8 byte slots and stores a zero at each address. This process relies on register allocation if the amount of slots that need to be initialized is greater than 1. 
In such a case, the next available register is requested from the register set and it's used to store a 0, which is then stored at every local slot * Update to wasmparser 0.92 * Correctly track if the regset has registers available * Add a result entry to the ABI signature This commit introduces ABIResult as part of the ABISignature; this struct will track how function results are stored; initially it will consist of a single register that will be requested from the register allocator at the end of the function, potentially causing a spill * Move zero local slots and add more granular methods to the masm This commit removes zeroing local slots from the MacroAssembler and instead adds more granular methods to it (e.g. `zero`, `add`). This allows for better code sharing since most of the work done by the algorithm for zeroing slots will be the same in all targets, except for the binary emission pieces, which is what gets delegated to the masm * Use wasmparser's visitor API and add initial support for const and add This commit adds initial support for the I32Const and I32 instructions; this involves adding a minimum for register allocation. Note that some regalloc pieces are still incomplete, since for the current set of supported instructions they are not needed. * Make the ty field public in Local * Add scratch_reg to the abi * Add a method to get a particular local from the Frame * Split the compilation environment abstraction This commit splits the compilation environment into two more concise abstractions: 1. CodeGen: the main abstraction for code generation 2. 
CodeGenContext: abstraction that shares the common pieces for compilation; these pieces are shared between the code generator and the register allocator * Add `push` and `load` to the MacroAssembler * Remove dead code warnings for unused paths * Map ISA features to cranelift-codegen ISA features * Apply formatting * Fix Cargo.toml after a bad rebase * Add component-compiler feature * Use clap instead of structopt * Add winch to publish.rs script * Minor formatting * Add tests to RegSet and fix two bugs when freeing and checking for register availability * Add tests to Stack * Free source register after a non-constant i32 add * Improve comments - Remove unneeded comments - And improve some of the TODO items * Update default features * Drop the ABI generic param and pass the word_size information directly To avoid dealing with dead code warnings this commit passes the word size information directly, since it's the only piece of information needed from the ABI by Codegen until now * Remove dead code This piece of code will be put back once we start integrating Winch with Wasmtime * Remove unused enum variant This variant doesn't get constructed; it should be added back once a backend is added and not enabled by default or when Winch gets integrated into Wasmtime * Fix unused code in regset tests * Update spec testsuite * Switch the visitor pattern for a simpler operator match This commit removes the usage of wasmparser's visitor pattern and instead defaults to a simpler operator matching approach. This removes the complexity of having to define all the visitor trait functions at once. * Use wasmparser's Visitor trait with a different macro strategy This commit puts back wasmparser's Visitor trait, with a single, simpler macro, only used for unsupported operators. * Restructure Winch This commit restructures Winch's parts. It divides the initial approach into three main crates: `winch-codegen`, `wasmtime-winch` and `winch-tools`. 
`wasmtime-winch` is responsible for the Wasmtime-Winch integration. `winch-codegen` is solely responsible for code generation. `winch-tools` is a CLI tool to compile Wasm programs, mainly for testing purposes. * Refactor zero local slots This commit moves the logic of zeroing local slots from the codegen module into a method with a default implementation in the MacroAssembler trait: `zero_mem_range`. The refactored implementation is very similar to the previous implementation with the only difference that it doesn't allocate a general-purpose register; it instead uses the register allocator to retrieve the scratch register and uses this register to unroll the series of zero stores. * Tie the codegen creation to the ISA ABI This commit makes the relationship between the ISA ABI and the codegen explicit. This allows us to pass down ABI-specific bits and pieces to the code generation. In this case the only concrete piece that we need is the ABI word size. * Mark winch as publishable directory * Revamp winch docs This commit ensures that all the code comments in Winch are compliant with the style used in the rest of Wasmtime's codebase. It also generally improves the quality of the comments in some modules. * Panic when using multi-value when the target is aarch64 Similar to x64, this commit ensures that the abi signature of the current function doesn't use multi-value returns * Document the usage of directives * Use endianness instead of endianess in the ISA trait * Introduce a three-argument form in the MacroAssembler This commit introduces the usage of three-argument form for the MacroAssembler interface. This allows for a natural mapping for architectures like aarch64. In the case of x64, the implementation can simply restrict the implementation by asserting equality of two of the arguments or by defaulting to a different set of instructions. 
As of this commit, the implementation of `add` panics if the destination and the first source arguments are not equal; internally the x64 assembler implementation will ensure that all the allowed combinations of `add` are satisfied. The reason for panicking and not emitting a `mov` followed by an `add` for example is simply because register allocation happens right before calling `add`, which ensures any register-to-register moves, if needed. This implementation will evolve in the future and this panic will be lifted if needed. * Improve the documentation for the MacroAssembler. Documents the usage of three-arg form and the intention around the high-level interface. * Format comments in remaining modules * Clean up Cargo.toml for winch pieces This commit adds missing fields to each of Winch's Cargo.toml. * Use `ModuleTranslation::get_types()` to derive the function type * Assert that start range is always word-size aligned	2022-10-28 14:19:34 -07:00
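The register-availability bit set that the initial skeleton commit describes (a set keyed on hardware encodings, created with the universe of allocatable registers) can be sketched as follows. This is a minimal model under stated assumptions: the method names and the 32-register limit are hypothetical choices, not Winch's actual `RegSet` API.

```rust
/// Hypothetical bit set tracking which registers are free. Bit `n` set
/// means the register with hardware encoding `n` is available.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RegSet {
    bits: u32,
}

impl RegSet {
    /// Create a set from a bitmask of allocatable register encodings.
    pub fn new(allocatable: u32) -> Self {
        Self { bits: allocatable }
    }

    /// Take the lowest-numbered available register, if any.
    pub fn any(&mut self) -> Option<u8> {
        if self.bits == 0 {
            return None;
        }
        let enc = self.bits.trailing_zeros() as u8;
        self.bits &= !(1u32 << enc);
        Some(enc)
    }

    /// Mark a register as available again.
    pub fn free(&mut self, enc: u8) {
        assert!(enc < 32, "encoding out of range for this sketch");
        self.bits |= 1u32 << enc;
    }

    /// Check whether a specific register is currently available.
    pub fn is_available(&self, enc: u8) -> bool {
        enc < 32 && self.bits & (1u32 << enc) != 0
    }
}
```

The set knows nothing about register names or classes; it operates purely on encodings, which is the property the commit message calls out.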

9 Commits