wasmtime/cranelift/codegen/src/isa/unwind.rs
Chris Fallin 2d5db92a9e Rework/simplify unwind infrastructure and implement Windows unwind.
Our previous implementation of unwind infrastructure was somewhat
complex and brittle: it parsed generated instructions in order to
reverse-engineer unwind info from prologues. It also relied on some
fragile linkage to communicate instruction-layout information that VCode
was not designed to provide.

A much simpler, more reliable, and easier-to-reason-about approach is to
embed unwind directives as pseudo-instructions in the prologue as we
generate it. That way, we can say what we mean and just emit it
directly.

The usual reasoning that leads to the reverse-engineering approach is
that metadata is hard to keep in sync across optimization passes; but
here, (i) prologues are generated at the very end of the pipeline, and
(ii) if we ever do a post-prologue-gen optimization, we can treat unwind
directives as black boxes with unknown side-effects, just as we do for
some other pseudo-instructions today.

It turns out that it was easier to just build this for both x64 and
aarch64 (since they share a factored-out ABI implementation), and wire
up the platform-specific unwind-info generation for Windows and SystemV.
Now we have simpler unwind on all platforms and we can delete the old
unwind infra as soon as we remove the old backend.

There were a few consequences to supporting Fastcall unwind in
particular that led to a refactor of the common ABI. Windows only
supports naming clobbered-register save locations within 240 bytes of
the frame-pointer register, whatever one chooses that to be (RSP or
RBP). We had previously saved clobbers below the fixed frame (and below
nominal-SP). The 240-byte range has to include the old RBP too, so we're
forced to place clobbers at the top of the frame, just below saved
RBP/RIP. This is fine; we always keep a frame pointer anyway because we
use it to refer to stack args. It does mean that offsets of fixed-frame
slots (spillslots, stackslots) from RBP are no longer known before we do
regalloc, so if we ever want to index these off of RBP rather than
nominal-SP because we add support for `alloca` (dynamic frame growth),
then we'll need a "nominal-BP" mode that is resolved after regalloc and
clobber-save code is generated. I added a comment to this effect in
`abi_impl.rs`.
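
As a rough illustration of the 240-byte limit described above, the sketch
below checks whether a distance from the frame pointer down to the clobber
area can be encoded. It assumes the limit comes from the `UNWIND_INFO`
frame-register-offset field being 4 bits wide and scaled by 16; the helper
name `encode_fp_offset` is hypothetical and is not part of Cranelift.

```rust
/// Largest frame-pointer offset that fits the assumed encoding:
/// a 4-bit field scaled by 16, i.e. 15 * 16 = 240 bytes.
const MAX_WIN_FP_OFFSET: u32 = 15 * 16;

/// Hypothetical helper: returns the 4-bit encoded field for the distance
/// from FP down to the start of the clobber area, or `None` if that
/// distance is not a multiple of 16 or exceeds 240 bytes.
fn encode_fp_offset(offset_downward_to_clobbers: u32) -> Option<u8> {
    if offset_downward_to_clobbers % 16 != 0 {
        return None;
    }
    if offset_downward_to_clobbers > MAX_WIN_FP_OFFSET {
        return None;
    }
    Some((offset_downward_to_clobbers / 16) as u8)
}
```

Under these assumptions a 240-byte clobber area is still encodable, while a
256-byte one is not, which is why the clobbers have to sit directly below
the saved RBP/RIP.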

The above refactor touched both x64 and aarch64 because of shared code.
This had a further effect in that the old aarch64 prologue generation
subtracted from `sp` once to allocate space, then used stores to `[sp,
offset]` to save clobbers. Unfortunately the offset only has 7-bit
range, so if there are enough clobbered registers (and there can be --
aarch64 has 384 bytes of registers; at least one unit test hits this)
the stores/loads will be out-of-range. I really don't want to synthesize
large-offset sequences here; better to go back to the simpler
pre-index/post-index `stp r1, r2, [sp, #-16]!` form that works just like
a "push". It's likely not much worse microarchitecturally (dependence
chain on SP, but oh well) and it actually saves an instruction if
there's no other frame to allocate. As a further advantage, it's much
simpler to understand; simpler is usually better.
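
To make the range problem concrete, here is a minimal sketch of the
encodability check, assuming the 64-bit `stp`/`ldp` immediate is a signed
7-bit field scaled by 8 (so -512 through 504 in steps of 8). The function
names are hypothetical and are not taken from the aarch64 backend; the
pairing rule in `pushed_bytes` is likewise an assumption for illustration.

```rust
/// Hypothetical check: can `offset` (in bytes) be encoded directly in a
/// 64-bit `stp`/`ldp`? Assumes a signed 7-bit immediate scaled by 8.
fn fits_stp_imm7(offset: i64) -> bool {
    offset % 8 == 0 && (-512..=504).contains(&offset)
}

/// Bytes of stack consumed when `n_regs` registers are pushed in pairs.
/// Assumption: an odd register out still takes a full 16-byte slot so
/// that SP stays 16-byte aligned.
fn pushed_bytes(n_regs: u32) -> u32 {
    (n_regs + 1) / 2 * 16
}
```

With the push-style form every store uses the same `#-16` immediate, which
always passes `fits_stp_imm7`, so the range question never arises no matter
how many registers are saved.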

This PR adds the new backend on Windows to CI as well.
2021-03-11 20:03:52 -08:00


//! Represents information relating to function unwinding.
use regalloc::RealReg;
#[cfg(feature = "enable-serde")]
use serde::{Deserialize, Serialize};
#[cfg(feature = "unwind")]
pub mod systemv;
#[cfg(feature = "unwind")]
pub mod winx64;
/// Represents unwind information for a single function.
#[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
#[non_exhaustive]
pub enum UnwindInfo {
    /// Windows x64 ABI unwind information.
    #[cfg(feature = "unwind")]
    WindowsX64(winx64::UnwindInfo),
    /// System V ABI unwind information.
    #[cfg(feature = "unwind")]
    SystemV(systemv::UnwindInfo),
}
/// Intermediate representation for the unwind information
/// generated by a backend.
pub mod input {
    use crate::binemit::CodeOffset;
    use alloc::vec::Vec;
    #[cfg(feature = "enable-serde")]
    use serde::{Deserialize, Serialize};

    /// Elementary operation in the unwind operations.
    #[derive(Clone, Debug, PartialEq, Eq)]
    #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
    pub enum UnwindCode<Reg> {
        /// Defines that a register is saved at the specified offset.
        SaveRegister {
            /// The saved register.
            reg: Reg,
            /// The specified offset relative to the stack pointer.
            stack_offset: u32,
        },
        /// Defines that a register is as defined before call.
        RestoreRegister {
            /// The restored register.
            reg: Reg,
        },
        /// The stack pointer was adjusted to allocate the stack.
        StackAlloc {
            /// Size to allocate.
            size: u32,
        },
        /// The stack pointer was adjusted to free the stack.
        StackDealloc {
            /// Size to deallocate.
            size: u32,
        },
        /// The alternative register was assigned as frame pointer base.
        SetFramePointer {
            /// The specified register.
            reg: Reg,
        },
        /// Restores a frame pointer base to default register.
        RestoreFramePointer,
        /// Saves the state.
        RememberState,
        /// Restores the state.
        RestoreState,
    }

    /// Unwind information as generated by a backend.
    #[derive(Clone, Debug, PartialEq, Eq)]
    #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
    pub struct UnwindInfo<Reg> {
        /// Size of the prologue.
        pub prologue_size: CodeOffset,
        /// Unwind codes for prologue.
        pub prologue_unwind_codes: Vec<(CodeOffset, UnwindCode<Reg>)>,
        /// Unwind codes for epilogues.
        pub epilogues_unwind_codes: Vec<Vec<(CodeOffset, UnwindCode<Reg>)>>,
        /// Entire function size.
        pub function_size: CodeOffset,
        /// Platform word size in bytes.
        pub word_size: u8,
        /// Initial stack pointer offset.
        pub initial_sp_offset: u8,
    }
}
/// Unwind pseudoinstruction used in VCode backends: represents that
/// at the present location, an action has just been taken.
///
/// VCode backends always emit unwind info that is relative to a frame
/// pointer, because we are planning to allow for dynamic frame allocation,
/// and because it makes the design quite a lot simpler in general: we don't
/// have to be precise about SP adjustments throughout the body of the function.
///
/// We include only unwind info for prologues at this time. Note that unwind
/// info for epilogues is only necessary if one expects to unwind while within
/// the last few instructions of the function (after FP has been restored) or
/// if one wishes to instruction-step through the epilogue and see a backtrace
/// at every point. This is not necessary for correct operation otherwise and so
/// we simplify the world a bit by omitting epilogue information. (Note that
/// some platforms also don't require or have a way to describe unwind
/// information for epilogues at all: for example, on Windows, the `UNWIND_INFO`
/// format only stores information for the function prologue.)
///
/// Because we are defining an abstraction over multiple unwind formats (at
/// least Windows/fastcall and System V) and multiple architectures (at least
/// x86-64 and aarch64), we have to be a little bit flexible in how we describe
/// the frame. However, it turns out that a least-common-denominator prologue
/// works for all of the cases we have to worry about today!
///
/// We assume the stack looks something like this:
///
///
/// ```plain
///                +----------------------------------------------+
///                | stack arg area, etc (according to ABI)       |
///                | ...                                          |
/// SP at call --> +----------------------------------------------+
///                | return address (pushed by HW or SW)          |
///                +----------------------------------------------+
///                | old frame pointer (FP)                       |
/// FP in this --> +----------------------------------------------+
/// function       | clobbered callee-save registers              |
///                | ...                                          |
/// start of   --> +----------------------------------------------+
/// clobbers       | (rest of function's frame, irrelevant here)  |
///                | ...                                          |
/// SP in this --> +----------------------------------------------+
/// function
/// ```
///
/// We assume that the prologue consists of:
///
/// * `PushFrameRegs`: A push operation that adds the old FP to the stack (and
/// maybe the link register, on architectures that do not push return addresses
/// in hardware)
/// * `DefineNewFrame`: An update that sets FP to SP to establish a new frame
/// * `SaveReg`: A number of stores or pushes to the stack to save clobbered registers
///
/// Each of these steps has a corresponding pseudo-instruction. At each step,
/// we need some information to determine where the current stack frame is
/// relative to SP or FP. When the `PushFrameRegs` occurs, we need to know how
/// much SP was decremented by, so we can allow the unwinder to continue to find
/// the caller's frame. When we define the new frame, we need to know where FP
/// is in relation to "SP at call" and also "start of clobbers", because
/// different unwind formats define one or the other of those as the anchor by
/// which we define the frame. Finally, when registers are saved, we need to
/// know which ones, and where.
///
/// Different unwind formats work differently; here is a whirlwind tour of how
/// they define frames to help understanding:
///
/// - Windows unwind information defines a frame that must start below the
/// clobber area, because all clobber-save offsets are non-negative. We set it
/// at the "start of clobbers" in the figure above. The `UNWIND_INFO` contains
/// a "frame pointer offset" field; when we define the new frame, the frame is
/// understood to be the value of FP (`RBP`) *minus* this offset. In other
/// words, the FP is *at the frame pointer offset* relative to the
/// start-of-clobber-frame. We use the "FP offset down to clobber area" offset
/// to generate this info.
///
/// - System V unwind information defines a frame in terms of the CFA
/// (call-frame address), which is equal to the "SP at call" above. SysV
/// allows negative offsets, so there is no issue defining clobber-save
/// locations in terms of CFA. The format allows us to define CFA flexibly in
/// terms of any register plus an offset; we define it in terms of FP plus
/// the FP-to-caller-SP offset (`offset_upward_to_caller_sp`) once FP is
/// established.
///
/// Note that certain unwind formats impose limits on offsets: for example, on
/// Windows, the base of the clobber area must not be more than 240 bytes below
/// FP.
///
/// Unwind pseudoinstructions are emitted inline by ABI code as it generates
/// a prologue. Thus, for the usual case, a prologue might look like (using x64
/// as an example):
///
/// ```plain
///     push rbp
///     unwind UnwindInst::PushFrameRegs { offset_upward_to_caller_sp: 16 }
///     mov rbp, rsp
///     unwind UnwindInst::DefineNewFrame { offset_upward_to_caller_sp: 16,
///                                         offset_downward_to_clobbers: 16 }
///     sub rsp, 32
///     mov [rsp+16], r12
///     unwind UnwindInst::SaveReg { reg: R12, clobber_offset: 0 }
///     mov [rsp+24], r13
///     unwind UnwindInst::SaveReg { reg: R13, clobber_offset: 8 }
///     ...
/// ```
#[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub enum UnwindInst {
    /// The frame-pointer register for this architecture has just been pushed to
    /// the stack (and on architectures where return-addresses are not pushed by
    /// hardware, the link register as well). The FP has not been set to this
    /// frame yet. The current location of SP is such that
    /// `offset_upward_to_caller_sp` is the distance to SP-at-callsite (our
    /// caller's frame).
    PushFrameRegs {
        /// The offset from the current SP (after push) to the SP at
        /// caller's callsite.
        offset_upward_to_caller_sp: u32,
    },
    /// The frame-pointer register for this architecture has just been
    /// set to the current stack location. We wish to define a new
    /// frame that is anchored on this new FP value. Offsets are provided
    /// upward to the caller's stack frame and downward toward the clobber
    /// area. We expect this pseudo-op to come after `PushFrameRegs`.
    DefineNewFrame {
        /// The offset from the current SP and FP value upward to the value of
        /// SP at the callsite that invoked us.
        offset_upward_to_caller_sp: u32,
        /// The offset from the current SP and FP value downward to the start of
        /// the clobber area.
        offset_downward_to_clobbers: u32,
    },
    /// The stack slot at the given offset from the clobber-area base has been
    /// used to save the given register.
    ///
    /// Given that `DefineNewFrame` has occurred first with some
    /// `offset_downward_to_clobbers`, `SaveReg` with `clobber_offset` indicates
    /// that the value of `reg` is saved on the stack at address `FP -
    /// offset_downward_to_clobbers + clobber_offset`.
    SaveReg {
        /// The offset from the start of the clobber area to this register's
        /// stack location.
        clobber_offset: u32,
        /// The saved register.
        reg: RealReg,
    },
}
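
The address rule documented on `SaveReg` can be worked through with a small
standalone sketch. It uses a simplified, hypothetical stand-in for
`UnwindInst` (the `Inst` enum below, with plain register numbers instead of
`RealReg`) and a hypothetical `save_locations` helper; neither is part of
Cranelift. For each saved register it reports the offset from FP and the
offset from the System V CFA, assuming the CFA equals FP plus
`offset_upward_to_caller_sp` once the frame is defined.

```rust
/// Simplified, hypothetical stand-in for `UnwindInst`.
#[derive(Clone, Copy, Debug)]
enum Inst {
    PushFrameRegs { offset_upward_to_caller_sp: u32 },
    DefineNewFrame { offset_upward_to_caller_sp: u32, offset_downward_to_clobbers: u32 },
    SaveReg { clobber_offset: u32, reg: u8 },
}

/// For each `SaveReg`, returns `(reg, offset from FP, offset from CFA)`.
/// Both offsets are negative because the clobber area lies below FP.
fn save_locations(insts: &[Inst]) -> Vec<(u8, i64, i64)> {
    let mut up = 0i64; // FP -> caller SP (the CFA)
    let mut down = 0i64; // FP -> start of clobber area
    let mut out = Vec::new();
    for inst in insts {
        match *inst {
            // FP is not yet established, so nothing to record here.
            Inst::PushFrameRegs { .. } => {}
            Inst::DefineNewFrame { offset_upward_to_caller_sp, offset_downward_to_clobbers } => {
                up = i64::from(offset_upward_to_caller_sp);
                down = i64::from(offset_downward_to_clobbers);
            }
            Inst::SaveReg { clobber_offset, reg } => {
                // address = FP - offset_downward_to_clobbers + clobber_offset
                let from_fp = i64::from(clobber_offset) - down;
                // CFA = FP + up, so the same slot is (from_fp - up) from CFA.
                out.push((reg, from_fp, from_fp - up));
            }
        }
    }
    out
}
```

Fed the x64 prologue from the doc comment above (both offsets 16, R12 at
clobber offset 0, R13 at clobber offset 8), this places R12 at FP-16 and R13
at FP-8. A Windows-style description would use the non-negative
`clobber_offset` values directly, since its frame base is the start of the
clobber area.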