wasmtime/winch/codegen/src/masm.rs
Kevin Rizzo 3a92aa3d0a winch: Initial integration with wasmtime (#6119)
* Adding in trampoline compiling method for ISA

* Adding support for indirect call to memory address

* Refactoring frame to externalize defined locals, removing WASM dependencies in the trampoline case

* Adding initial version of trampoline for testing

* Refactoring trampoline to be re-used by other architectures

* Initial wiring for winch with wasmtime

* Add a Wasmtime CLI option to select `winch`

This is effectively an option to select the `Strategy` enumeration.

* Implement `Compiler::compile_function` for Winch

Hook this into the `TargetIsa::compile_function` hook as well. Currently
this doesn't take into account `Tunables`, but that's left as a TODO for
later.

* Filling out Winch append_code method

* Adding back in changes from previous branch

Most of these are a WIP. It's missing trampolines for x64, but a basic
one exists for aarch64. It's missing the handling of arguments that
exist on the stack.

It currently imports `cranelift_wasm::WasmFuncType` since it's what's
passed to the `Compiler` trait. It's a bit awkward to use in the
`winch_codegen` crate since it mostly operates on `wasmparser` types.
I've had to hack in a conversion to get things working. Long term, I'm
not sure it's wise to rely on this type but it seems like it's easier on
the Cranelift side when creating the stub IR.

* Small API changes to make integration easier

* Adding in new FuncEnv, only a stub for now

* Removing unneeded parts of the old PoC, and refactoring trampoline code

* Moving FuncEnv into a separate file

* More comments for trampolines

* Adding in winch integration tests for first pass

* Using new addressing method to fix stack pointer error

* Adding test for stack arguments

* Only run tests on x86 for now, it's more complete for winch

* Add in missing documentation after rebase

* Updating based on feedback in draft PR

* Fixing formatting on doc comment for argv register

* Running formatting

* Lock updates, and turning on winch feature flags during tests

* Updating configuration with comments to no longer gate Strategy enum

* Using the winch-environ FuncEnv, but it required changing the sig

* Proper comment formatting

* Removing wasmtime-winch from dev-dependencies, adding the winch feature makes this not necessary

* Update doc attr to include winch check

* Adding winch feature to doc generation, which seems to fix the feature error in CI

* Add the `component-model` feature to the cargo doc invocation in CI

To match the metadata used by the docs.rs invocation when building docs.

* Add a comment clarifying the usage of `component-model` for docs.rs

* Correctly order wasmtime-winch and winch-environ in the publish script

* Ensure x86 test dependencies are included in cfg(target_arch)

* Further constrain Winch tests to x86_64 _and_ unix

---------

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>
2023-04-05 00:32:40 +00:00

214 lines
7.1 KiB
Rust

use crate::abi::{align_to, LocalSlot};
use crate::codegen::CodeGenContext;
use crate::isa::reg::Reg;
use crate::regalloc::RegAlloc;
use cranelift_codegen::{Final, MachBufferFinalized};
use std::{fmt::Debug, ops::Range};
/// Division kind.
#[derive(Eq, PartialEq)]
pub(crate) enum DivKind {
    /// Signed division.
    Signed,
    /// Unsigned division.
    Unsigned,
}
/// Remainder kind.
pub(crate) enum RemKind {
    /// Signed remainder.
    Signed,
    /// Unsigned remainder.
    Unsigned,
}
/// Operand size, in bits.
#[derive(Copy, Debug, Clone, Eq, PartialEq)]
pub(crate) enum OperandSize {
    /// 32 bits.
    S32,
    /// 64 bits.
    S64,
}
/// An abstraction over a register or immediate.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub(crate) enum RegImm {
    /// A register.
    Reg(Reg),
    /// 64-bit signed immediate.
    Imm(i64),
}
/// The kind of callee targeted by a function call.
pub(crate) enum CalleeKind {
    /// A function call to a raw address.
    Indirect(Reg),
    /// A function call to a local function.
    Direct(u32),
}
impl RegImm {
    /// Register constructor.
    pub fn reg(r: Reg) -> Self {
        RegImm::Reg(r)
    }

    /// Immediate constructor.
    pub fn imm(imm: i64) -> Self {
        RegImm::Imm(imm)
    }
}
impl From<Reg> for RegImm {
    fn from(r: Reg) -> Self {
        Self::Reg(r)
    }
}
/// Generic MacroAssembler interface used by the code generation.
///
/// The MacroAssembler trait aims to expose an interface high-level
/// enough that each ISA can provide its own lowering to machine code.
/// For example, for WebAssembly operators that don't map directly to a
/// machine instruction, the interface defines a signature matching the
/// WebAssembly operator, allowing each implementation to lower the
/// operator entirely. This approach gives more responsibility to the
/// MacroAssembler, but frees the caller from having to assemble the
/// right sequence of instructions at the operator callsite.
///
/// The interface defaults to a three-argument form for binary
/// operations; this maps naturally to instructions on RISC
/// architectures, which use the three-argument form. It also keeps the
/// interface general, so that it can be restricted where needed, as in
/// the case of architectures that use a two-argument form.
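///
/// As a hypothetical sketch (not part of this trait's API), a
/// two-argument ISA such as x64 could lower the three-argument `add`
/// by first moving `lhs` into `dst` and then operating in place, where
/// `emit_add` stands in for an ISA-specific helper:
///
/// ```ignore
/// fn add(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize) {
///     if dst != lhs {
///         // Materialize `lhs` into `dst` first: `dst <- lhs`.
///         self.mov(lhs, dst, size);
///     }
///     // Then apply the two-argument form: `dst <- dst + rhs`.
///     self.emit_add(dst, rhs, size);
/// }
/// ```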
pub(crate) trait MacroAssembler {
    /// The addressing mode.
    type Address;

    /// Emit the function prologue.
    fn prologue(&mut self);

    /// Emit the function epilogue.
    fn epilogue(&mut self, locals_size: u32);

    /// Reserve stack space.
    fn reserve_stack(&mut self, bytes: u32);

    /// Free stack space.
    fn free_stack(&mut self, bytes: u32);

    /// Get the address of a local slot.
    fn local_address(&mut self, local: &LocalSlot) -> Self::Address;

    /// Constructs an address with an offset that is relative to the
    /// current position of the stack pointer (e.g. `[sp + (sp_offset - offset)]`).
    fn address_from_sp(&self, offset: u32) -> Self::Address;

    /// Constructs an address with an offset that is absolute to the
    /// current position of the stack pointer (e.g. `[sp + offset]`).
    fn address_at_sp(&self, offset: u32) -> Self::Address;

    /// Construct an address that is relative to the given register.
    fn address_from_reg(&self, reg: Reg, offset: u32) -> Self::Address;

    /// Emit a function call to either a local or external function.
    fn call(&mut self, callee: CalleeKind);
    /// Get stack pointer offset.
    fn sp_offset(&self) -> u32;

    /// Perform a stack store.
    fn store(&mut self, src: RegImm, dst: Self::Address, size: OperandSize);

    /// Perform a stack load.
    fn load(&mut self, src: Self::Address, dst: Reg, size: OperandSize);

    /// Pop a value from the machine stack into the given register.
    fn pop(&mut self, dst: Reg);

    /// Perform a move.
    fn mov(&mut self, src: RegImm, dst: RegImm, size: OperandSize);

    /// Perform add operation.
    fn add(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize);

    /// Perform subtraction operation.
    fn sub(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize);

    /// Perform multiplication operation.
    fn mul(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize);
    /// Perform division operation.
    ///
    /// Division is special in that some architectures have specific
    /// expectations regarding the location of the instruction
    /// arguments and of the quotient / remainder. To free the caller
    /// from having to deal with the architecture-specific constraints,
    /// this function is given access to the code generation context,
    /// allowing each implementation to decide the lowering path. For
    /// cases in which division is an unconstrained binary operation,
    /// the caller can decide to use the `CodeGenContext::i32_binop` or
    /// `CodeGenContext::i64_binop` functions.
    fn div(&mut self, context: &mut CodeGenContext, kind: DivKind, size: OperandSize);

    /// Calculate remainder.
    fn rem(&mut self, context: &mut CodeGenContext, kind: RemKind, size: OperandSize);
    /// Push the register to the stack, returning the offset.
    fn push(&mut self, src: Reg) -> u32;

    /// Finalize the assembly and return the result.
    fn finalize(self) -> MachBufferFinalized<Final>;

    /// Zero a particular register.
    fn zero(&mut self, reg: Reg);

    /// Zero a given memory range.
    ///
    /// The default implementation divides the given memory range
    /// into word-sized slots. Then it unrolls a series of store
    /// instructions, effectively assigning zero to each slot.
    fn zero_mem_range(&mut self, mem: &Range<u32>, word_size: u32, regalloc: &mut RegAlloc) {
        if mem.is_empty() {
            return;
        }

        let start = if mem.start % word_size == 0 {
            mem.start
        } else {
            // Ensure that the start of the range is at least 4-byte aligned.
            assert!(mem.start % 4 == 0);
            let start = align_to(mem.start, word_size);
            let addr: Self::Address = self.local_address(&LocalSlot::i32(start));
            self.store(RegImm::imm(0), addr, OperandSize::S32);
            // Ensure that the new start of the range is word-size aligned.
            assert!(start % word_size == 0);
            start
        };

        let end = align_to(mem.end, word_size);
        let slots = (end - start) / word_size;

        if slots == 1 {
            let slot = LocalSlot::i64(start + word_size);
            let addr: Self::Address = self.local_address(&slot);
            self.store(RegImm::imm(0), addr, OperandSize::S64);
        } else {
            // TODO
            // Add an upper bound to this generation;
            // given a considerably large amount of slots
            // this will be inefficient.
            let zero = regalloc.scratch;
            self.zero(zero);
            let zero = RegImm::reg(zero);
            for step in (start..end).into_iter().step_by(word_size as usize) {
                let slot = LocalSlot::i64(step + word_size);
                let addr: Self::Address = self.local_address(&slot);
                self.store(zero, addr, OperandSize::S64);
            }
        }
    }
}
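
The slot arithmetic in `zero_mem_range` can be illustrated in isolation. The following is a minimal standalone sketch, independent of this crate: `align_to` is re-implemented here (assuming the usual round-up-to-power-of-two definition) rather than imported from `crate::abi`.

```rust
/// Round `value` up to the nearest multiple of `alignment`
/// (assumes `alignment` is a power of two).
fn align_to(value: u32, alignment: u32) -> u32 {
    (value + (alignment - 1)) & !(alignment - 1)
}

fn main() {
    let word_size = 8u32;
    // A range whose start is 4-byte aligned but not word-aligned.
    let (start, end) = (4u32, 28u32);

    // Head: a single 32-bit store would cover bytes 4..8, after which
    // the start of the range is realigned to the word size.
    let aligned_start = align_to(start, word_size);
    // Tail: the end is rounded up to a whole word.
    let aligned_end = align_to(end, word_size);
    // Each remaining word-sized slot gets one unrolled store.
    let slots = (aligned_end - aligned_start) / word_size;

    println!("start={} end={} slots={}", aligned_start, aligned_end, slots);
    // → start=8 end=32 slots=3
}
```

This mirrors the head/tail alignment handling in the default implementation above; the real method additionally emits the stores through the `MacroAssembler` rather than just computing the slot count.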