Implement interrupting wasm code, reimplement stack overflow (#1490)

* Implement interrupting wasm code, reimplement stack overflow

This commit is a relatively large change for wasmtime with two main
goals:

* Primarily this enables interrupting executing wasm code with a trap,
  preventing infinite loops in wasm code. Note that resumption of the
  wasm code is not a goal of this commit.

* Additionally this commit reimplements how we handle stack overflow to
  ensure that host functions always have a reasonable amount of stack to
  run on. This fixes an issue where we might longjmp out of a host
  function, skipping destructors.

Various odds and ends fall out of this commit once the two goals above are
implemented. The strategy for implementing this was also lifted from
SpiderMonkey and existing functionality inside Cranelift. I've tried to write
up thorough documentation of how this all works in
`crates/environ/src/cranelift.rs`, where the gnarly bits live.

A brief summary of how this works: each function header and each loop header
now checks whether execution has been interrupted. The interrupt check and the
stack overflow check are actually folded into one, where function headers
check whether they've run out of stack, and the sentinel value used to
indicate an interrupt, checked in loop headers, tricks functions into thinking
they're out of stack. Delivering an interrupt is then just a matter of writing
that sentinel value to a location which is read by JIT code.

When interrupts are delivered, and what triggers them, is left up to embedders
of the `wasmtime` crate. The `wasmtime::Store` type has a method to acquire an
`InterruptHandle`, where `InterruptHandle` is a `Send` and `Sync` type which
can travel to other threads (or perhaps even a signal handler) and request an
interrupt from there. The intent is to provide a good degree of flexibility
when interrupting wasm code. One large caveat, though, is that interrupts are
only witnessed by running wasm code, so if a host import blocks for a long
time the interrupt won't actually be received until the wasm starts running
again.

Some of the fallout from this change:

* Unix signal handlers are no longer registered with `SA_ONSTACK`.
  Instead they run on the native stack the thread was already using.
  This is possible since stack overflow isn't handled by hitting the
  guard page, but rather it's explicitly checked for in wasm now. Native
  stack overflow will continue to abort the process as usual.

* Unix sigaltstack management is no longer necessary since we no longer use
  an alternate signal stack.

* Windows no longer has any need to reset guard pages since we no longer
  try to recover from faults on guard pages.

* On all targets probestack intrinsics are disabled since we use a
  different mechanism for catching stack overflow.

* The C API has been updated with interrupt handles. An example has also been
  added which shows how to interrupt a module.

Closes #139
Closes #860
Closes #900

* Update comment about magical interrupt value

* Store stack limit as a global value, not a closure

* Run rustfmt

* Handle review comments

* Add a comment about SA_ONSTACK

* Use `usize` for type of `INTERRUPTED`

* Parse human-readable durations

* Bring back sigaltstack handling

Allows libstd to print out stack overflow on failure still.

* Add parsing and emission of stack limit-via-preamble

* Fix new example for new apis

* Fix host segfault test in release mode

* Fix new doc example
Author: Alex Crichton
Date: 2020-04-21 13:03:28 -05:00 (committed by GitHub)
Parent: 4a63a4d86e
Commit: c9a0ba81a0
45 changed files with 1361 additions and 143 deletions


@@ -685,21 +685,32 @@ fn insert_common_prologue(
     fpr_slot: Option<&StackSlot>,
     isa: &dyn TargetIsa,
 ) {
-    if stack_size > 0 {
-        // Check if there is a special stack limit parameter. If so insert stack check.
-        if let Some(stack_limit_arg) = pos.func.special_param(ArgumentPurpose::StackLimit) {
-            // Total stack size is the size of all stack area used by the function, including
-            // pushed CSRs, frame pointer.
-            // Also, the size of a return address, implicitly pushed by a x86 `call` instruction,
-            // also should be accounted for.
-            // If any FPR are present, count them as well as necessary alignment space.
-            // TODO: Check if the function body actually contains a `call` instruction.
-            let mut total_stack_size =
-                (csrs.iter(GPR).len() + 1 + 1) as i64 * (isa.pointer_bytes() as isize) as i64;
-            total_stack_size += csrs.iter(FPR).len() as i64 * types::F64X2.bytes() as i64;
-            insert_stack_check(pos, total_stack_size, stack_limit_arg);
+    // If this is a leaf function with zero stack, then there's no need to
+    // insert a stack check since it can't overflow anything and
+    // forward-progress is guaranteed so long as loops are handled anyway.
+    //
+    // If this has a stack size it could stack overflow, or if it isn't a leaf
+    // it could be part of a long call chain which we need to check anyway.
+    //
+    // First we look for the stack limit as a special argument to the function,
+    // and failing that we see if a custom stack limit has been provided, which
+    // will likely be used to calculate the stack limit from the arguments or
+    // perhaps constants.
+    if stack_size > 0 || !pos.func.is_leaf() {
+        let scratch = ir::ValueLoc::Reg(RU::rax as RegUnit);
+        let stack_limit_arg = match pos.func.special_param(ArgumentPurpose::StackLimit) {
+            Some(arg) => {
+                let copy = pos.ins().copy(arg);
+                pos.func.locations[copy] = scratch;
+                Some(copy)
+            }
+            None => pos
+                .func
+                .stack_limit
+                .map(|gv| interpret_gv(pos, gv, scratch)),
+        };
+        if let Some(stack_limit_arg) = stack_limit_arg {
+            insert_stack_check(pos, stack_size, stack_limit_arg);
         }
     }
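The `None` arm above falls back to `pos.func.stack_limit`, a global value
recorded in the function preamble (the "stack limit-via-preamble" parsing and
emission added later in this commit's history). In CLIF text that might look
something like the following sketch, where the `0x60` offset into the vmctx
is made up for illustration:

```
;; Illustrative CLIF only: the stack limit is loaded from some field of the
;; vmctx pointer; the exact offset is determined by the embedder's layout.
function %f(i64 vmctx) {
    gv0 = vmctx
    gv1 = load.i64 notrap aligned gv0+0x60
    stack_limit = gv1
    ...
}
```

`interpret_gv` below exists precisely to evaluate such a `vmctx`/`load` chain
this late in the backend, where normal legalization is no longer available.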
@@ -811,16 +822,76 @@ fn insert_common_prologue(
         );
     }
 }

+/// Inserts code necessary to calculate `gv`.
+///
+/// Note that this is typically done with `ins().global_value(...)` but that
+/// requires legalization to run to encode it, and we're running super late
+/// here in the backend where legalization isn't possible. To get around this
+/// we manually interpret the `gv` specified and do register allocation for
+/// intermediate values.
+///
+/// This is an incomplete implementation of loading `GlobalValue` values to get
+/// compared to the stack pointer, but currently it serves enough functionality
+/// to get this implemented in `wasmtime` itself. This'll likely get expanded a
+/// bit over time!
+fn interpret_gv(pos: &mut EncCursor, gv: ir::GlobalValue, scratch: ir::ValueLoc) -> ir::Value {
+    match pos.func.global_values[gv] {
+        ir::GlobalValueData::VMContext => pos
+            .func
+            .special_param(ir::ArgumentPurpose::VMContext)
+            .expect("no vmcontext parameter found"),
+        ir::GlobalValueData::Load {
+            base,
+            offset,
+            global_type,
+            readonly: _,
+        } => {
+            let base = interpret_gv(pos, base, scratch);
+            let ret = pos
+                .ins()
+                .load(global_type, ir::MemFlags::trusted(), base, offset);
+            pos.func.locations[ret] = scratch;
+            return ret;
+        }
+        ref other => panic!("global value for stack limit not supported: {}", other),
+    }
+}

 /// Insert a check that generates a trap if the stack pointer goes
 /// below a value in `stack_limit_arg`.
 fn insert_stack_check(pos: &mut EncCursor, stack_size: i64, stack_limit_arg: ir::Value) {
     use crate::ir::condcodes::IntCC;

+    // Our stack pointer, after subtracting `stack_size`, must not be below
+    // `stack_limit_arg`. To do this we're going to add `stack_size` to
+    // `stack_limit_arg` and see if the stack pointer is below that. The
+    // `stack_size + stack_limit_arg` computation might overflow, however, due
+    // to how stack limits may be loaded and set externally to trigger a trap.
+    //
+    // To handle this we'll need an extra comparison to see if the stack
+    // pointer is already below `stack_limit_arg`. Most of the time this
+    // isn't necessary though since the stack limit which triggers a trap is
+    // likely a sentinel somewhere around `usize::max_value()`. In that case
+    // we only conditionally emit this pre-flight check. That way most
+    // functions only have the one comparison, but are also guaranteed that if
+    // we add `stack_size` to `stack_limit_arg` it won't overflow.
+    //
+    // This does mean that code generators which use this stack check
+    // functionality need to ensure that values stored into the stack limit
+    // will never overflow if this threshold is added.
+    if stack_size >= 32 * 1024 {
+        let cflags = pos.ins().ifcmp_sp(stack_limit_arg);
+        pos.func.locations[cflags] = ir::ValueLoc::Reg(RU::rflags as RegUnit);
+        pos.ins().trapif(
+            IntCC::UnsignedGreaterThanOrEqual,
+            cflags,
+            ir::TrapCode::StackOverflow,
+        );
+    }

-    // Copy `stack_limit_arg` into a %rax and use it for calculating
-    // a SP threshold.
-    let stack_limit_copy = pos.ins().copy(stack_limit_arg);
-    pos.func.locations[stack_limit_copy] = ir::ValueLoc::Reg(RU::rax as RegUnit);
-    let sp_threshold = pos.ins().iadd_imm(stack_limit_copy, stack_size);
+    let sp_threshold = pos.ins().iadd_imm(stack_limit_arg, stack_size);
     pos.func.locations[sp_threshold] = ir::ValueLoc::Reg(RU::rax as RegUnit);

     // If the stack pointer currently reaches the SP threshold or below it then after opening