Rework/simplify unwind infrastructure and implement Windows unwind.

Our previous implementation of unwind infrastructure was somewhat
complex and brittle: it parsed generated instructions in order to
reverse-engineer unwind info from prologues. It also relied on some
fragile linkage to communicate instruction-layout information that VCode
was not designed to provide.

A much simpler, more reliable, and easier-to-reason-about approach is to
embed unwind directives as pseudo-instructions in the prologue as we
generate it. That way, we can say what we mean and just emit it
directly.

The usual reasoning that leads to the reverse-engineering approach is
that metadata is hard to keep in sync across optimization passes; but
here, (i) prologues are generated at the very end of the pipeline, and
(ii) if we ever do a post-prologue-gen optimization, we can treat unwind
directives as black boxes with unknown side-effects, just as we do for
some other pseudo-instructions today.

It turns out that it was easier to just build this for both x64 and
aarch64 (since they share a factored-out ABI implementation), and wire
up the platform-specific unwind-info generation for Windows and SystemV.
Now we have simpler unwind on all platforms and we can delete the old
unwind infra as soon as we remove the old backend.

There were a few consequences to supporting Fastcall unwind in
particular that led to a refactor of the common ABI. Windows only
supports naming clobbered-register save locations within 240 bytes of
the frame-pointer register, whatever one chooses that to be (RSP or
RBP). We had previously saved clobbers below the fixed frame (and below
nominal-SP). The 240-byte range has to include the old RBP too, so we're
forced to place clobbers at the top of the frame, just below saved
RBP/RIP. This is fine; we always keep a frame pointer anyway because we
use it to refer to stack args. It does mean that offsets of fixed-frame
slots (spillslots, stackslots) from RBP are no longer known before we do
regalloc, so if we ever want to index these off of RBP rather than
nominal-SP because we add support for `alloca` (dynamic frame growth),
then we'll need a "nominal-BP" mode that is resolved after regalloc and
clobber-save code is generated. I added a comment to this effect in
`abi_impl.rs`.

The above refactor touched both x64 and aarch64 because of shared code.
This had a further effect in that the old aarch64 prologue generation
subtracted from `sp` once to allocate space, then used stores to `[sp,
offset]` to save clobbers. Unfortunately the offset only has 7-bit
range, so if there are enough clobbered registers (and there can be --
aarch64 has 384 bytes of registers; at least one unit test hits this)
the stores/loads will be out-of-range. I really don't want to synthesize
large-offset sequences here; better to go back to the simpler
pre-index/post-index `stp r1, r2, [sp, #-16]` form that works just like
a "push". It's likely not much worse microarchitecturally (dependence
chain on SP, but oh well) and it actually saves an instruction if
there's no other frame to allocate. As a further advantage, it's much
simpler to understand; simpler is usually better.

This PR adds the new backend on Windows to CI as well.
This commit is contained in:
Chris Fallin
2021-03-06 15:43:09 -08:00
parent 05688aa8f4
commit 2d5db92a9e
63 changed files with 905 additions and 1088 deletions

View File

@@ -3,7 +3,7 @@
use crate::ir::types::*;
use crate::ir::{self, types, ExternalName, LibCall, MemFlags, Opcode, TrapCode, Type};
use crate::isa;
use crate::isa::{x64::inst::*, CallConv};
use crate::isa::{unwind::UnwindInst, x64::inst::*, CallConv};
use crate::machinst::abi_impl::*;
use crate::machinst::*;
use crate::settings;
@@ -433,25 +433,38 @@ impl ABIMachineSpec for X64ABIMachineSpec {
}
}
fn gen_prologue_frame_setup() -> SmallInstVec<Self::I> {
fn gen_prologue_frame_setup(flags: &settings::Flags) -> SmallInstVec<Self::I> {
let r_rsp = regs::rsp();
let r_rbp = regs::rbp();
let w_rbp = Writable::from_reg(r_rbp);
let mut insts = SmallVec::new();
// `push %rbp`
// RSP before the call will be 0 % 16. So here, it is 8 % 16.
insts.push(Inst::push64(RegMemImm::reg(r_rbp)));
if flags.unwind_info() {
insts.push(Inst::Unwind {
inst: UnwindInst::PushFrameRegs {
offset_upward_to_caller_sp: 16, // RBP, return address
},
});
}
// `mov %rsp, %rbp`
// RSP is now 0 % 16
insts.push(Inst::mov_r_r(OperandSize::Size64, r_rsp, w_rbp));
insts
}
fn gen_epilogue_frame_restore() -> SmallInstVec<Self::I> {
fn gen_epilogue_frame_restore(_: &settings::Flags) -> SmallInstVec<Self::I> {
let mut insts = SmallVec::new();
// `mov %rbp, %rsp`
insts.push(Inst::mov_r_r(
OperandSize::Size64,
regs::rbp(),
Writable::from_reg(regs::rsp()),
));
// `pop %rbp`
insts.push(Inst::pop64(Writable::from_reg(regs::rbp())));
insts
}
@@ -474,22 +487,31 @@ impl ABIMachineSpec for X64ABIMachineSpec {
fn gen_clobber_save(
call_conv: isa::CallConv,
_: &settings::Flags,
flags: &settings::Flags,
clobbers: &Set<Writable<RealReg>>,
fixed_frame_storage_size: u32,
_outgoing_args_size: u32,
) -> (u64, SmallVec<[Self::I; 16]>) {
let mut insts = SmallVec::new();
// Find all clobbered registers that are callee-save. These are only I64
// registers (all XMM registers are caller-save) so we can compute the
// total size of the needed stack space easily.
// Find all clobbered registers that are callee-save.
let clobbered = get_callee_saves(&call_conv, clobbers);
let stack_size = compute_clobber_size(&clobbered) + fixed_frame_storage_size;
// Align to 16 bytes.
let stack_size = align_to(stack_size, 16);
let clobbered_size = stack_size - fixed_frame_storage_size;
// Adjust the stack pointer downward with one `sub rsp, IMM`
// instruction.
let clobbered_size = compute_clobber_size(&clobbered);
if flags.unwind_info() {
// Emit unwind info: start the frame. The frame (from unwind
// consumers' point of view) starts at clobbbers, just below
// the FP and return address. Spill slots and stack slots are
// part of our actual frame but do not concern the unwinder.
insts.push(Inst::Unwind {
inst: UnwindInst::DefineNewFrame {
offset_downward_to_clobbers: clobbered_size,
offset_upward_to_caller_sp: 16, // RBP, return address
},
});
}
// Adjust the stack pointer downward for clobbers and the function fixed
// frame (spillslots and storage slots).
let stack_size = fixed_frame_storage_size + clobbered_size;
if stack_size > 0 {
insts.push(Inst::alu_rmi_r(
OperandSize::Size64,
@@ -498,10 +520,12 @@ impl ABIMachineSpec for X64ABIMachineSpec {
Writable::from_reg(regs::rsp()),
));
}
// Store each clobbered register in order at offsets from RSP.
let mut cur_offset = 0;
// Store each clobbered register in order at offsets from RSP,
// placing them above the fixed frame slots.
let mut cur_offset = fixed_frame_storage_size;
for reg in &clobbered {
let r_reg = reg.to_reg();
let off = cur_offset;
match r_reg.get_class() {
RegClass::I64 => {
insts.push(Inst::store(
@@ -521,6 +545,14 @@ impl ABIMachineSpec for X64ABIMachineSpec {
cur_offset += 16;
}
_ => unreachable!(),
};
if flags.unwind_info() {
insts.push(Inst::Unwind {
inst: UnwindInst::SaveReg {
clobber_offset: off - fixed_frame_storage_size,
reg: r_reg,
},
});
}
}
@@ -531,17 +563,17 @@ impl ABIMachineSpec for X64ABIMachineSpec {
call_conv: isa::CallConv,
flags: &settings::Flags,
clobbers: &Set<Writable<RealReg>>,
_fixed_frame_storage_size: u32,
_outgoing_args_size: u32,
fixed_frame_storage_size: u32,
) -> SmallVec<[Self::I; 16]> {
let mut insts = SmallVec::new();
let clobbered = get_callee_saves(&call_conv, clobbers);
let stack_size = compute_clobber_size(&clobbered);
let stack_size = align_to(stack_size, 16);
let stack_size = fixed_frame_storage_size + compute_clobber_size(&clobbered);
// Restore regs by loading from offsets of RSP.
let mut cur_offset = 0;
// Restore regs by loading from offsets of RSP. RSP will be
// returned to nominal-RSP at this point, so we can use the
// same offsets that we used when saving clobbers above.
let mut cur_offset = fixed_frame_storage_size;
for reg in &clobbered {
let rreg = reg.to_reg();
match rreg.get_class() {
@@ -990,5 +1022,5 @@ fn compute_clobber_size(clobbers: &Vec<Writable<RealReg>>) -> u32 {
_ => unreachable!(),
}
}
clobbered_size
align_to(clobbered_size, 16)
}

View File

@@ -3050,6 +3050,10 @@ pub(crate) fn emit(
Inst::ValueLabelMarker { .. } => {
// Nothing; this is only used to compute debug info.
}
Inst::Unwind { ref inst } => {
sink.add_unwind(inst.clone());
}
}
state.clear_post_insn();

View File

@@ -2,6 +2,7 @@
use crate::binemit::{CodeOffset, StackMap};
use crate::ir::{types, ExternalName, Opcode, SourceLoc, TrapCode, Type, ValueLabel};
use crate::isa::unwind::UnwindInst;
use crate::isa::x64::abi::X64ABIMachineSpec;
use crate::isa::x64::settings as x64_settings;
use crate::isa::CallConv;
@@ -488,6 +489,10 @@ pub enum Inst {
/// A definition of a value label.
ValueLabelMarker { reg: Reg, label: ValueLabel },
/// An unwind pseudoinstruction describing the state of the
/// machine at this program point.
Unwind { inst: UnwindInst },
}
pub(crate) fn low32_will_sign_extend_to_64(x: u64) -> bool {
@@ -548,7 +553,8 @@ impl Inst {
| Inst::XmmUninitializedValue { .. }
| Inst::ElfTlsGetAddr { .. }
| Inst::MachOTlsGetAddr { .. }
| Inst::ValueLabelMarker { .. } => None,
| Inst::ValueLabelMarker { .. }
| Inst::Unwind { .. } => None,
Inst::UnaryRmR { op, .. } => op.available_from(),
@@ -1787,6 +1793,10 @@ impl PrettyPrint for Inst {
Inst::ValueLabelMarker { label, reg } => {
format!("value_label {:?}, {}", label, reg.show_rru(mb_rru))
}
Inst::Unwind { inst } => {
format!("unwind {:?}", inst)
}
}
}
}
@@ -2065,6 +2075,8 @@ fn x64_get_regs(inst: &Inst, collector: &mut RegUsageCollector) {
Inst::ValueLabelMarker { reg, .. } => {
collector.add_use(*reg);
}
Inst::Unwind { .. } => {}
}
}
@@ -2459,7 +2471,8 @@ fn x64_map_regs<RUM: RegUsageMapper>(inst: &mut Inst, mapper: &RUM) {
| Inst::AtomicRmwSeq { .. }
| Inst::ElfTlsGetAddr { .. }
| Inst::MachOTlsGetAddr { .. }
| Inst::Fence { .. } => {
| Inst::Fence { .. }
| Inst::Unwind { .. } => {
// Instruction doesn't explicitly mention any regs, so it can't have any virtual
// regs that we'd need to remap. Hence no action required.
}
@@ -2776,7 +2789,6 @@ impl MachInstEmitInfo for EmitInfo {
impl MachInstEmit for Inst {
type State = EmitState;
type Info = EmitInfo;
type UnwindInfo = unwind::X64UnwindInfo;
fn emit(&self, sink: &mut MachBuffer<Inst>, info: &Self::Info, state: &mut Self::State) {
emit::emit(self, sink, info, state);

View File

@@ -1,125 +1,5 @@
use crate::isa::unwind::input::UnwindInfo;
use crate::isa::x64::inst::{
args::{AluRmiROpcode, Amode, OperandSize, RegMemImm, SyntheticAmode},
regs, Inst,
};
use crate::machinst::{UnwindInfoContext, UnwindInfoGenerator};
use crate::result::CodegenResult;
use alloc::vec::Vec;
use regalloc::Reg;
#[cfg(feature = "unwind")]
pub(crate) mod systemv;
pub struct X64UnwindInfo;
impl UnwindInfoGenerator<Inst> for X64UnwindInfo {
fn create_unwind_info(
context: UnwindInfoContext<Inst>,
) -> CodegenResult<Option<UnwindInfo<Reg>>> {
use crate::isa::unwind::input::{self, UnwindCode};
let mut codes = Vec::new();
const WORD_SIZE: u8 = 8;
for i in context.prologue.clone() {
let i = i as usize;
let inst = &context.insts[i];
let offset = context.insts_layout[i];
match inst {
Inst::Push64 {
src: RegMemImm::Reg { reg },
} => {
codes.push((
offset,
UnwindCode::StackAlloc {
size: WORD_SIZE.into(),
},
));
codes.push((
offset,
UnwindCode::SaveRegister {
reg: *reg,
stack_offset: 0,
},
));
}
Inst::MovRR { src, dst, .. } => {
if *src == regs::rsp() {
codes.push((offset, UnwindCode::SetFramePointer { reg: dst.to_reg() }));
}
}
Inst::AluRmiR {
size: OperandSize::Size64,
op: AluRmiROpcode::Sub,
src: RegMemImm::Imm { simm32 },
dst,
..
} if dst.to_reg() == regs::rsp() => {
let imm = *simm32;
codes.push((offset, UnwindCode::StackAlloc { size: imm }));
}
Inst::MovRM {
src,
dst: SyntheticAmode::Real(Amode::ImmReg { simm32, base, .. }),
..
} if *base == regs::rsp() => {
// `mov reg, imm(rsp)`
let imm = *simm32;
codes.push((
offset,
UnwindCode::SaveRegister {
reg: *src,
stack_offset: imm,
},
));
}
Inst::AluRmiR {
size: OperandSize::Size64,
op: AluRmiROpcode::Add,
src: RegMemImm::Imm { simm32 },
dst,
..
} if dst.to_reg() == regs::rsp() => {
let imm = *simm32;
codes.push((offset, UnwindCode::StackDealloc { size: imm }));
}
_ => {}
}
}
let last_epilogue_end = context.len;
let epilogues_unwind_codes = context
.epilogues
.iter()
.map(|epilogue| {
// TODO add logic to process epilogue instruction instead of
// returning empty array.
let end = epilogue.end as usize - 1;
let end_offset = context.insts_layout[end];
if end_offset == last_epilogue_end {
// Do not remember/restore for very last epilogue.
return vec![];
}
let start = epilogue.start as usize;
let offset = context.insts_layout[start];
vec![
(offset, UnwindCode::RememberState),
// TODO epilogue instructions
(end_offset, UnwindCode::RestoreState),
]
})
.collect();
let prologue_size = context.insts_layout[context.prologue.end as usize];
Ok(Some(input::UnwindInfo {
prologue_size,
prologue_unwind_codes: codes,
epilogues_unwind_codes,
function_size: context.len,
word_size: WORD_SIZE,
initial_sp_offset: WORD_SIZE,
}))
}
}
#[cfg(feature = "unwind")]
pub(crate) mod winx64;

View File

@@ -1,8 +1,6 @@
//! Unwind information for System V ABI (x86-64).
use crate::isa::unwind::input;
use crate::isa::unwind::systemv::{RegisterMappingError, UnwindInfo};
use crate::result::CodegenResult;
use crate::isa::unwind::systemv::RegisterMappingError;
use gimli::{write::CommonInformationEntry, Encoding, Format, Register, X86_64};
use regalloc::{Reg, RegClass};
@@ -82,21 +80,18 @@ pub fn map_reg(reg: Reg) -> Result<Register, RegisterMappingError> {
}
}
pub(crate) fn create_unwind_info(
unwind: input::UnwindInfo<Reg>,
) -> CodegenResult<Option<UnwindInfo>> {
struct RegisterMapper;
impl crate::isa::unwind::systemv::RegisterMapper<Reg> for RegisterMapper {
fn map(&self, reg: Reg) -> Result<u16, RegisterMappingError> {
Ok(map_reg(reg)?.0)
}
fn sp(&self) -> u16 {
X86_64::RSP.0
}
}
let map = RegisterMapper;
pub(crate) struct RegisterMapper;
Ok(Some(UnwindInfo::build(unwind, &map)?))
impl crate::isa::unwind::systemv::RegisterMapper<Reg> for RegisterMapper {
fn map(&self, reg: Reg) -> Result<u16, RegisterMappingError> {
Ok(map_reg(reg)?.0)
}
fn sp(&self) -> u16 {
X86_64::RSP.0
}
fn fp(&self) -> u16 {
X86_64::RBP.0
}
}
#[cfg(test)]
@@ -136,7 +131,7 @@ mod tests {
_ => panic!("expected unwind information"),
};
assert_eq!(format!("{:?}", fde), "FrameDescriptionEntry { address: Constant(1234), length: 13, lsda: None, instructions: [(1, CfaOffset(16)), (1, Offset(Register(6), -16)), (4, CfaRegister(Register(6)))] }");
assert_eq!(format!("{:?}", fde), "FrameDescriptionEntry { address: Constant(1234), length: 17, lsda: None, instructions: [(1, CfaOffset(16)), (1, Offset(Register(6), -16)), (4, CfaRegister(Register(6)))] }");
}
fn create_function(call_conv: CallConv, stack_slot: Option<StackSlotData>) -> Function {
@@ -175,7 +170,7 @@ mod tests {
_ => panic!("expected unwind information"),
};
assert_eq!(format!("{:?}", fde), "FrameDescriptionEntry { address: Constant(4321), length: 22, lsda: None, instructions: [(1, CfaOffset(16)), (1, Offset(Register(6), -16)), (4, CfaRegister(Register(6))), (15, RememberState), (17, RestoreState)] }");
assert_eq!(format!("{:?}", fde), "FrameDescriptionEntry { address: Constant(4321), length: 22, lsda: None, instructions: [(1, CfaOffset(16)), (1, Offset(Register(6), -16)), (4, CfaRegister(Register(6)))] }");
}
fn create_multi_return_function(call_conv: CallConv) -> Function {

View File

@@ -0,0 +1,16 @@
//! Unwind information for Windows x64 ABI.
use regalloc::{Reg, RegClass};
pub(crate) struct RegisterMapper;
impl crate::isa::unwind::winx64::RegisterMapper<Reg> for RegisterMapper {
fn map(reg: Reg) -> crate::isa::unwind::winx64::MappedRegister {
use crate::isa::unwind::winx64::MappedRegister;
match reg.get_class() {
RegClass::I64 => MappedRegister::Int(reg.get_hw_encoding()),
RegClass::V128 => MappedRegister::Xmm(reg.get_hw_encoding()),
_ => unreachable!(),
}
}
}

View File

@@ -4,7 +4,6 @@ use self::inst::EmitInfo;
use super::TargetIsa;
use crate::ir::{condcodes::IntCC, Function};
use crate::isa::unwind::systemv::RegisterMappingError;
use crate::isa::x64::{inst::regs::create_reg_universe_systemv, settings as x64_settings};
use crate::isa::Builder as IsaBuilder;
use crate::machinst::{compile, MachBackend, MachCompileResult, TargetIsaAdapter, VCode};
@@ -15,6 +14,9 @@ use core::hash::{Hash, Hasher};
use regalloc::{PrettyPrint, RealRegUniverse, Reg};
use target_lexicon::Triple;
#[cfg(feature = "unwind")]
use crate::isa::unwind::systemv;
mod abi;
mod inst;
mod lower;
@@ -61,7 +63,6 @@ impl MachBackend for X64Backend {
let buffer = vcode.emit();
let buffer = buffer.finish();
let frame_size = vcode.frame_size();
let unwind_info = vcode.unwind_info()?;
let value_labels_ranges = vcode.value_labels_ranges();
let stackslot_offsets = vcode.stackslot_offsets().clone();
@@ -75,7 +76,6 @@ impl MachBackend for X64Backend {
buffer,
frame_size,
disasm,
unwind_info,
value_labels_ranges,
stackslot_offsets,
})
@@ -122,14 +122,22 @@ impl MachBackend for X64Backend {
) -> CodegenResult<Option<crate::isa::unwind::UnwindInfo>> {
use crate::isa::unwind::UnwindInfo;
use crate::machinst::UnwindInfoKind;
Ok(match (result.unwind_info.as_ref(), kind) {
(Some(info), UnwindInfoKind::SystemV) => {
inst::unwind::systemv::create_unwind_info(info.clone())?.map(UnwindInfo::SystemV)
}
(Some(_info), UnwindInfoKind::Windows) => {
//TODO inst::unwind::winx64::create_unwind_info(info.clone())?.map(|u| UnwindInfo::WindowsX64(u))
None
Ok(match kind {
UnwindInfoKind::SystemV => {
let mapper = self::inst::unwind::systemv::RegisterMapper;
Some(UnwindInfo::SystemV(
crate::isa::unwind::systemv::create_unwind_info_from_insts(
&result.buffer.unwind_info[..],
result.buffer.data.len(),
&mapper,
)?,
))
}
UnwindInfoKind::Windows => Some(UnwindInfo::WindowsX64(
crate::isa::unwind::winx64::create_unwind_info_from_insts::<
self::inst::unwind::winx64::RegisterMapper,
>(&result.buffer.unwind_info[..])?,
)),
_ => None,
})
}
@@ -140,7 +148,7 @@ impl MachBackend for X64Backend {
}
#[cfg(feature = "unwind")]
fn map_reg_to_dwarf(&self, reg: Reg) -> Result<u16, RegisterMappingError> {
fn map_reg_to_dwarf(&self, reg: Reg) -> Result<u16, systemv::RegisterMappingError> {
inst::unwind::systemv::map_reg(reg).map(|reg| reg.0)
}
}