Commit Graph

23 Commits

Author SHA1 Message Date
Alex Crichton
b0939f6626 Remove explicit S type parameters (#5275)
* Remove explicit `S` type parameters

This commit removes the explicit `S` type parameter on `Func::typed` and
`Instance::get_typed_func`. Historical versions of Rust required that
this be a type parameter, but recent versions of rustc support a mixture
of explicit type parameters and `impl Trait`. This removes, at callsites,
a superfluous `, _` argument which otherwise never needs to be specified.
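
As a rough sketch of the resulting callsite shape (assuming a post-change wasmtime build with the `wat` feature; the tiny module and its `add` export are invented for illustration):

```rust
use wasmtime::{Engine, Instance, Module, Store};

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    let module = Module::new(
        &engine,
        r#"(module (func (export "add") (param i32 i32) (result i32)
               local.get 0 local.get 1 i32.add))"#,
    )?;
    let mut store = Store::new(&engine, ());
    let instance = Instance::new(&mut store, &module, &[])?;

    // Previously this call needed a trailing `, _` for the store type parameter:
    //     instance.get_typed_func::<(i32, i32), i32, _>(&mut store, "add")?;
    // With this change only the parameter and result types are spelled out:
    let add = instance.get_typed_func::<(i32, i32), i32>(&mut store, "add")?;
    assert_eq!(add.call(&mut store, (1, 2))?, 3);
    Ok(())
}
```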

* Fix mdbook examples
2022-11-16 05:04:26 +00:00
Alex Crichton
000bd98ae5 Merge pull request from GHSA-44mr-8vmm-wjhg
Ensure that this support is not regressed and keeps working.
2022-11-10 11:34:59 -06:00
Alex Crichton
3535acbf3b Merge pull request from GHSA-wh6w-3828-g9qf
* Unconditionally use `MemoryImageSlot`

This commit removes the internal branching within the pooling instance
allocator that would sometimes use a `MemoryImageSlot` and sometimes not.
Instead this is now unconditionally used in all situations on all
platforms. This fixes an issue where the state of a slot could get
corrupted if modules being instantiated switched from having images to
not having an image or vice versa.

The bulk of this commit is the removal of the `memory-init-cow`
compile-time feature in addition to adding Windows support to the
`cow.rs` file.

* Fix compile on Unix

* Add a stricter assertion for static memory bounds

Double-check that when a memory is allocated the configuration required
is satisfied by the pooling allocator.
2022-11-10 11:34:38 -06:00
Alex Crichton
50cffad0d3 Implement support for dynamic memories in the pooling allocator (#5208)
* Implement support for dynamic memories in the pooling allocator

This is a continuation of the thrust in #5207 for reducing page faults
and lock contention when using the pooling allocator. To that end this
commit implements support for efficient memory management in the pooling
allocator when using wasm that is instrumented with bounds checks.

The `MemoryImageSlot` type now avoids unconditionally shrinking memory
back to its initial size during the `clear_and_remain_ready` operation,
instead deferring optional resizing of memory to the subsequent call to
`instantiate` when the slot is reused. The instantiation portion then
takes the "memory style" as an argument which dictates whether the
accessible memory must be precisely fit or whether it's allowed to
exceed the maximum. This in effect enables skipping a call to `mprotect`
to shrink the heap when dynamic memory checks are enabled.

In terms of page fault and contention this should improve the situation
by:

* Fewer calls to `mprotect` since once a heap grows it stays grown and
  it never shrinks. This means that a write lock is taken within the
  kernel much more rarely than before (only asymptotically now, not
  N-times-per-instance).

* Accessed memory after a heap growth operation will not fault if it was
  previously paged in by a prior instance and set to zero with `memset`.
  Unlike #5207, which requires a 6.0 kernel to see this optimization, this
  commit enables the optimization for any kernel.

The major cost of choosing this strategy is naturally the performance
hit of the wasm itself. This is being looked at in PRs such as #5190 to
improve Wasmtime's story here.

This commit does not implement any new configuration options for
Wasmtime but instead reinterprets existing configuration options. The
pooling allocator no longer unconditionally sets
`static_memory_bound_is_maximum` and then implements support necessary
for this memory type. The other change in this commit is that the
`Tunables::static_memory_bound` configuration option no longer gates
the creation of a `MemoryPool`, which will now appropriately size itself to
`instance_limits.memory_pages` if the `static_memory_bound` is too small.
This is done to accommodate fuzzing more easily, where the
`static_memory_bound` will become small during fuzzing and otherwise the
configuration would be rejected and require manual handling. The spirit
of the `MemoryPool` is one of large virtual address space reservations
anyway so it seemed reasonable to interpret the configuration this way.

* Skip zero memory_size cases

These are causing errors to happen when fuzzing and otherwise in theory
shouldn't be too interesting to optimize for anyway since they likely
aren't used in practice.
2022-11-08 14:43:08 -06:00
Alex Crichton
b14551d7ca Refactor configuration for the pooling allocator (#5205)
This commit changes the APIs in the `wasmtime` crate for configuring the
pooling allocator. I plan on adding a few more configuration options in
the near future and the current structure was feeling unwieldy for
adding these new abstractions.

The previous `struct`-based API has been replaced with a builder-style
API in a similar shape as to `Config`. This is done to help make it
easier to add more configuration options in the future through adding
more methods, as opposed to adding more fields which could break prior
initializations.
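
A minimal sketch of the builder-style shape described here, assuming a recent wasmtime release; `PoolingAllocationConfig` and `InstanceAllocationStrategy::Pooling` reflect the post-refactor API, and the individual limit-setter methods (not shown) vary between versions:

```rust
use wasmtime::{Config, Engine, InstanceAllocationStrategy, PoolingAllocationConfig};

fn main() -> anyhow::Result<()> {
    // Start from the defaults and chain setter methods on the builder,
    // mirroring how `Config` itself is configured.
    let pooling = PoolingAllocationConfig::default();

    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pooling));
    let _engine = Engine::new(&config)?;
    Ok(())
}
```
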
2022-11-04 20:06:45 +00:00
Alex Crichton
2afaac5181 Return anyhow::Error from host functions instead of Trap, redesign Trap (#5149)
* Return `anyhow::Error` from host functions instead of `Trap`

This commit refactors how errors are modeled when returned from host
functions and additionally refactors how custom errors work with `Trap`.
At a high level functions in Wasmtime that previously worked with
`Result<T, Trap>` now work with `Result<T>` instead where the error is
`anyhow::Error`. This includes functions such as:

* Host-defined functions in a `Linker<T>`
* `TypedFunc::call`
* Host-related callbacks like call hooks

Errors are now modeled primarily as `anyhow::Error` throughout Wasmtime.
This subsequently removes the need for `Trap` to have the ability to
represent all host-defined errors as it previously did. Consequently the
`From` implementations for any error into a `Trap` have been removed
here and the only embedder-defined way to create a `Trap` is to use
`Trap::new` with a custom string.

After this commit the distinction between a `Trap` and a host error is
the wasm backtrace that it contains. Previously all errors in host
functions would flow through a `Trap` and get a wasm backtrace attached
to them, but now this only happens if a `Trap` itself is created meaning
that arbitrary host-defined errors flowing from a host import to the
other side won't get backtraces attached. Some internals of Wasmtime
itself were updated or preserved to use `Trap::new` to capture a
backtrace where it seemed useful, such as when fuel runs out.

The main motivation for this commit is that it now enables hosts to
thread a concrete error type from a host function all the way through to
where a wasm function was invoked. Previously this could not be done
since the host error was wrapped in a `Trap` that didn't provide the
ability to get at the internals.

A consequence of this commit is that when a host error is returned that
isn't a `Trap` we'll capture a backtrace and then won't have a `Trap` to
attach it to. To avoid losing the contextual information this commit
uses the `Error::context` method to attach the backtrace as contextual
information to ensure that the backtrace is itself not lost.

This is a breaking change for likely all users of Wasmtime, but it's
hoped to be a relatively minor change to work around. Most use cases can
likely change `-> Result<T, Trap>` to `-> Result<T>` and otherwise
explicit creation of a `Trap` is largely no longer necessary.
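
A minimal sketch of what a host function looks like under the new model (the `env`/`validate` names are invented for illustration):

```rust
use anyhow::{bail, Result};
use wasmtime::{Caller, Engine, Linker};

fn main() -> Result<()> {
    let engine = Engine::default();
    let mut linker: Linker<()> = Linker::new(&engine);

    // Previously this closure would have returned `Result<(), Trap>`; now a
    // plain `anyhow::Result` is returned and any concrete error type can be
    // threaded back to whoever invoked the wasm that called this import.
    linker.func_wrap("env", "validate", |_caller: Caller<'_, ()>, input: i32| -> Result<()> {
        if input < 0 {
            bail!("negative input {input} is not allowed");
        }
        Ok(())
    })?;
    Ok(())
}
```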

* Fix some doc links

* add some tests and make a backtrace type public (#55)

* Trap: avoid a trailing newline in the Display impl

which in turn ends up with three newlines between the end of the
backtrace and the `Caused by` in the anyhow Debug impl

* make BacktraceContext pub, and add tests showing downcasting behavior of anyhow::Error to traps or backtraces

* Remove now-unnecessary `Trap` downcasts in `Linker::module`

* Fix test output expectations

* Remove `Trap::i32_exit`

This commit removes special-handling in the `wasmtime::Trap` type for
the i32 exit code required by WASI. This is now instead modeled as a
specific `I32Exit` error type in the `wasmtime-wasi` crate which is
returned by the `proc_exit` hostcall. Embedders which previously tested
for i32 exits now downcast to the `I32Exit` value.
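
For illustration, a hedged sketch of the downcast an embedder might perform (assuming `I32Exit` exposes its exit code as a public tuple field, as in `wasmtime-wasi` at the time):

```rust
use wasmtime_wasi::I32Exit;

/// Extract a WASI exit code, if any, from the result of invoking a wasm
/// function that may have called `proc_exit`.
fn exit_code(result: anyhow::Result<()>) -> Option<i32> {
    match result {
        Ok(()) => Some(0),
        Err(err) => err.downcast_ref::<I32Exit>().map(|exit| exit.0),
    }
}
```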

* Remove the `Trap::new` constructor

This commit removes the ability to create a trap with an arbitrary error
message. The purpose of this commit is to continue the prior trend of
leaning into the `anyhow::Error` type instead of trying to recreate it
with `Trap`. A subsequent simplification to `Trap` after this commit is
that `Trap` will simply be an `enum` of trap codes with no extra
information. This commit is doubly-motivated by the desire to always use
the new `BacktraceContext` type instead of sometimes using that and
sometimes using `Trap`.

Most of the changes here were around updating `Trap::new` calls to
`bail!` calls instead. Tests which assert particular error messages
additionally often needed to use the `:?` formatter instead of the `{}`
formatter because the prior formats the whole `anyhow::Error` and the
latter only formats the top-most error, which now contains the
backtrace.
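
A small standalone illustration of that formatter difference (the error strings are made up; anyhow's formatting behavior is the relevant part):

```rust
use anyhow::{anyhow, Context};

fn main() {
    let err = anyhow!("wasm trap: unreachable")
        .context("wasm backtrace:\n    0: f\n    1: main");
    // `{}` formats only the top-most error, which is now the backtrace context.
    println!("{err}");
    // `{:?}` formats the whole `anyhow::Error`, including the underlying trap.
    println!("{err:?}");
}
```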

* Merge `Trap` and `TrapCode`

With prior refactorings there's no more need for `Trap` to be opaque or
otherwise contain a backtrace. This commit pares down `Trap` to simply
an `enum` which was the old `TrapCode`. All various tests and such were
updated to handle this.

The main consequence of this commit is that all errors have a
`BacktraceContext` context attached to them. This unfortunately means
that the backtrace is printed first before the error message or trap
code, but given all the prior simplifications that seems worth it at
this time.

* Rename `BacktraceContext` to `WasmBacktrace`

This feels like a better name given how this has turned out, and
additionally this commit removes having both `WasmBacktrace` and
`BacktraceContext`.

* Soup up documentation for errors and traps

* Fix build of the C API

Co-authored-by: Pat Hickey <pat@moreproductive.org>
2022-11-02 16:29:31 +00:00
Nick Fitzgerald
46782b18c2 wasmtime: Implement fast Wasm stack walking (#4431)
* Always preserve frame pointers in Wasmtime

This allows us to efficiently and simply capture Wasm stacks without maintaining
and synchronizing any safety-critical side tables between the compiler and the
runtime.

* wasmtime: Implement fast Wasm stack walking

Why do we want Wasm stack walking to be fast? Because we capture stacks whenever
there is a trap and traps actually happen fairly frequently with short-lived
programs and WASI's `exit`.

Previously, we would rely on generating the system unwind info (e.g.
`.eh_frame`) and using the system unwinder (via the `backtrace` crate) to walk
the full stack and filter out any non-Wasm stack frames. This can,
unfortunately, be slow for two primary reasons:

1. The system unwinder is doing `O(all-kinds-of-frames)` work rather than
`O(wasm-frames)` work.

2. System unwind info and the system unwinder need to be much more general than
a purpose-built stack walker for Wasm needs to be. It has to handle any kind of
stack frame that any compiler might emit, whereas our Wasm frames are emitted by
Cranelift and always have frame pointers. This translates into implementation
complexity and general overhead. There can also be unnecessary-for-our-use-cases
global synchronization and locks involved, further slowing down stack walking in
the presence of multiple threads trying to capture stacks in parallel.

This commit introduces a purpose-built stack walker for traversing just our Wasm
frames. To find all the sequences of Wasm-to-Wasm stack frames, and ignore
non-Wasm stack frames, we keep a linked list of `(entry stack pointer, exit
frame pointer)` pairs. This linked list is maintained via Wasm-to-host and
host-to-Wasm trampolines. Within a sequence of Wasm-to-Wasm calls, we can use
frame pointers (which Cranelift preserves) to find the next older Wasm frame on
the stack, and we keep doing this until we reach the entry stack pointer,
meaning that the next older frame will be a host frame.
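
A rough, hypothetical sketch of that walk; the types, fields, and x86-64 frame layout below are assumptions for illustration only, not Wasmtime's actual internals:

```rust
// One entry per host-to-Wasm transition, forming the linked list described above.
struct WasmActivation {
    entry_sp: usize,                   // stack pointer when the host entered Wasm
    exit_fp: usize,                    // frame pointer of the youngest Wasm frame on exit
    prev: Option<Box<WasmActivation>>, // older host-to-Wasm entries
}

/// Visit the return address of every Wasm frame, skipping host frames entirely.
unsafe fn walk_wasm_frames(activations: &WasmActivation, mut visit: impl FnMut(usize)) {
    let mut act = Some(activations);
    while let Some(a) = act {
        let mut fp = a.exit_fp;
        // Follow saved frame pointers until we cross the entry stack pointer,
        // at which point the next older frame belongs to the host.
        while fp != 0 && fp < a.entry_sp {
            let return_addr = *((fp + 8) as *const usize); // x86-64 layout assumption
            visit(return_addr);
            fp = *(fp as *const usize); // the caller's saved FP sits at [fp]
        }
        act = a.prev.as_deref();
    }
}
```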

The trampolines need to avoid a couple stumbling blocks. First, they need to be
compiled ahead of time, since we may not have access to a compiler at
runtime (e.g. if the `cranelift` feature is disabled) but still want to be able
to call functions that have already been compiled and get stack traces for those
functions. Usually this means we would compile the appropriate trampolines
inside `Module::new` and the compiled module object would hold the
trampolines. However, we *also* need to support calling host functions that are
wrapped into `wasmtime::Func`s and there doesn't exist *any* ahead-of-time
compiled module object to hold the appropriate trampolines:

```rust
// Define a host function.
let func_type = wasmtime::FuncType::new(
    vec![wasmtime::ValType::I32],
    vec![wasmtime::ValType::I32],
);
let func = Func::new(&mut store, func_type, |_, params, results| {
    // ...
    Ok(())
});

// Call that host function.
let mut results = vec![wasmtime::Val::I32(0)];
func.call(&[wasmtime::Val::I32(0)], &mut results)?;
```

Therefore, we define one host-to-Wasm trampoline and one Wasm-to-host trampoline
in assembly that work for all Wasm and host function signatures. These
trampolines are careful to only use volatile registers, avoid touching any
register that is an argument in the calling convention ABI, and tail call to the
target callee function. This allows forwarding any set of arguments and any
returns to and from the callee, while also allowing us to maintain our linked
list of Wasm stack and frame pointers before transferring control to the
callee. These trampolines are not used in Wasm-to-Wasm calls, only when crossing
the host-Wasm boundary, so they do not impose overhead on regular calls. (And if
using one trampoline for all host-Wasm boundary crossing ever breaks branch
prediction enough in the CPU to become any kind of bottleneck, we can do fun
things like have multiple copies of the same trampoline and choose a random copy
for each function, sharding the functions across branch predictor entries.)

Finally, this commit also ends the use of a synthetic `Module` and allocating a
stubbed out `VMContext` for host functions. Instead, we define a
`VMHostFuncContext` with its own magic value, similar to `VMComponentContext`,
specifically for host functions.

## Benchmarks

### Traps and Stack Traces

Large improvements to taking stack traces on traps, ranging from shaving off 64%
to 99.95% of the time it used to take.

```
multi-threaded-traps/0  time:   [2.5686 us 2.5808 us 2.5934 us]
                        thrpt:  [0.0000  elem/s 0.0000  elem/s 0.0000  elem/s]
                 change:
                        time:   [-85.419% -85.153% -84.869%] (p = 0.00 < 0.05)
                        thrpt:  [+560.90% +573.56% +585.84%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
multi-threaded-traps/1  time:   [2.9021 us 2.9167 us 2.9322 us]
                        thrpt:  [341.04 Kelem/s 342.86 Kelem/s 344.58 Kelem/s]
                 change:
                        time:   [-91.455% -91.294% -91.096%] (p = 0.00 < 0.05)
                        thrpt:  [+1023.1% +1048.6% +1070.3%]
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe
multi-threaded-traps/2  time:   [2.9996 us 3.0145 us 3.0295 us]
                        thrpt:  [660.18 Kelem/s 663.47 Kelem/s 666.76 Kelem/s]
                 change:
                        time:   [-94.040% -93.910% -93.762%] (p = 0.00 < 0.05)
                        thrpt:  [+1503.1% +1542.0% +1578.0%]
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high severe
multi-threaded-traps/4  time:   [5.5768 us 5.6052 us 5.6364 us]
                        thrpt:  [709.68 Kelem/s 713.63 Kelem/s 717.25 Kelem/s]
                 change:
                        time:   [-93.193% -93.121% -93.052%] (p = 0.00 < 0.05)
                        thrpt:  [+1339.2% +1353.6% +1369.1%]
                        Performance has improved.
multi-threaded-traps/8  time:   [8.6408 us 9.1212 us 9.5438 us]
                        thrpt:  [838.24 Kelem/s 877.08 Kelem/s 925.84 Kelem/s]
                 change:
                        time:   [-94.754% -94.473% -94.202%] (p = 0.00 < 0.05)
                        thrpt:  [+1624.7% +1709.2% +1806.1%]
                        Performance has improved.
multi-threaded-traps/16 time:   [10.152 us 10.840 us 11.545 us]
                        thrpt:  [1.3858 Melem/s 1.4760 Melem/s 1.5761 Melem/s]
                 change:
                        time:   [-97.042% -96.823% -96.577%] (p = 0.00 < 0.05)
                        thrpt:  [+2821.5% +3048.1% +3281.1%]
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

many-modules-registered-traps/1
                        time:   [2.6278 us 2.6361 us 2.6447 us]
                        thrpt:  [378.11 Kelem/s 379.35 Kelem/s 380.55 Kelem/s]
                 change:
                        time:   [-85.311% -85.108% -84.909%] (p = 0.00 < 0.05)
                        thrpt:  [+562.65% +571.51% +580.76%]
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
many-modules-registered-traps/8
                        time:   [2.6294 us 2.6460 us 2.6623 us]
                        thrpt:  [3.0049 Melem/s 3.0235 Melem/s 3.0425 Melem/s]
                 change:
                        time:   [-85.895% -85.485% -85.022%] (p = 0.00 < 0.05)
                        thrpt:  [+567.63% +588.95% +608.95%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
many-modules-registered-traps/64
                        time:   [2.6218 us 2.6329 us 2.6452 us]
                        thrpt:  [24.195 Melem/s 24.308 Melem/s 24.411 Melem/s]
                 change:
                        time:   [-93.629% -93.551% -93.470%] (p = 0.00 < 0.05)
                        thrpt:  [+1431.4% +1450.6% +1469.5%]
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
many-modules-registered-traps/512
                        time:   [2.6569 us 2.6737 us 2.6923 us]
                        thrpt:  [190.17 Melem/s 191.50 Melem/s 192.71 Melem/s]
                 change:
                        time:   [-99.277% -99.268% -99.260%] (p = 0.00 < 0.05)
                        thrpt:  [+13417% +13566% +13731%]
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
many-modules-registered-traps/4096
                        time:   [2.7258 us 2.7390 us 2.7535 us]
                        thrpt:  [1.4876 Gelem/s 1.4955 Gelem/s 1.5027 Gelem/s]
                 change:
                        time:   [-99.956% -99.955% -99.955%] (p = 0.00 < 0.05)
                        thrpt:  [+221417% +223380% +224881%]
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

many-stack-frames-traps/1
                        time:   [1.4658 us 1.4719 us 1.4784 us]
                        thrpt:  [676.39 Kelem/s 679.38 Kelem/s 682.21 Kelem/s]
                 change:
                        time:   [-90.368% -89.947% -89.586%] (p = 0.00 < 0.05)
                        thrpt:  [+860.23% +894.72% +938.21%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
many-stack-frames-traps/8
                        time:   [2.4772 us 2.4870 us 2.4973 us]
                        thrpt:  [3.2034 Melem/s 3.2167 Melem/s 3.2294 Melem/s]
                 change:
                        time:   [-85.550% -85.370% -85.199%] (p = 0.00 < 0.05)
                        thrpt:  [+575.65% +583.51% +592.03%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
many-stack-frames-traps/64
                        time:   [10.109 us 10.171 us 10.236 us]
                        thrpt:  [6.2525 Melem/s 6.2925 Melem/s 6.3309 Melem/s]
                 change:
                        time:   [-78.144% -77.797% -77.336%] (p = 0.00 < 0.05)
                        thrpt:  [+341.22% +350.38% +357.55%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe
many-stack-frames-traps/512
                        time:   [126.16 us 126.54 us 126.96 us]
                        thrpt:  [4.0329 Melem/s 4.0461 Melem/s 4.0583 Melem/s]
                 change:
                        time:   [-65.364% -64.933% -64.453%] (p = 0.00 < 0.05)
                        thrpt:  [+181.32% +185.17% +188.71%]
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high severe
```

### Calls

There is, however, a small regression in raw Wasm-to-host and host-to-Wasm call
performance due to the new trampolines. It seems to be on the order of about 2-10
nanoseconds per call, depending on the benchmark.

I believe this regression is ultimately acceptable because

1. this overhead will be vastly dominated by whatever work a non-nop callee
actually does,

2. we will need these trampolines, or something like them, when implementing the
Wasm exceptions proposal to do things like translate Wasm's exceptions into
Rust's `Result`s,

3. and because the performance improvements to trapping and capturing stack
traces are of a much larger magnitude than these call regressions.

```
sync/no-hook/host-to-wasm - typed - nop
                        time:   [28.683 ns 28.757 ns 28.844 ns]
                        change: [+16.472% +17.183% +17.904%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  5 (5.00%) high severe
sync/no-hook/host-to-wasm - untyped - nop
                        time:   [42.515 ns 42.652 ns 42.841 ns]
                        change: [+12.371% +14.614% +17.462%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) high mild
  10 (10.00%) high severe
sync/no-hook/host-to-wasm - unchecked - nop
                        time:   [33.936 ns 34.052 ns 34.179 ns]
                        change: [+25.478% +26.938% +28.369%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe
sync/no-hook/host-to-wasm - typed - nop-params-and-results
                        time:   [34.290 ns 34.388 ns 34.502 ns]
                        change: [+40.802% +42.706% +44.526%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe
sync/no-hook/host-to-wasm - untyped - nop-params-and-results
                        time:   [62.546 ns 62.721 ns 62.919 ns]
                        change: [+2.5014% +3.6319% +4.8078%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe
sync/no-hook/host-to-wasm - unchecked - nop-params-and-results
                        time:   [42.609 ns 42.710 ns 42.831 ns]
                        change: [+20.966% +22.282% +23.475%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe

sync/hook-sync/host-to-wasm - typed - nop
                        time:   [29.546 ns 29.675 ns 29.818 ns]
                        change: [+20.693% +21.794% +22.836%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
sync/hook-sync/host-to-wasm - untyped - nop
                        time:   [45.448 ns 45.699 ns 45.961 ns]
                        change: [+17.204% +18.514% +19.590%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe
sync/hook-sync/host-to-wasm - unchecked - nop
                        time:   [34.334 ns 34.437 ns 34.558 ns]
                        change: [+23.225% +24.477% +25.886%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
sync/hook-sync/host-to-wasm - typed - nop-params-and-results
                        time:   [36.594 ns 36.763 ns 36.974 ns]
                        change: [+41.967% +47.261% +52.086%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe
sync/hook-sync/host-to-wasm - untyped - nop-params-and-results
                        time:   [63.541 ns 63.831 ns 64.194 ns]
                        change: [-4.4337% -0.6855% +2.7134%] (p = 0.73 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe
sync/hook-sync/host-to-wasm - unchecked - nop-params-and-results
                        time:   [43.968 ns 44.169 ns 44.437 ns]
                        change: [+18.772% +21.802% +24.623%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) high mild
  12 (12.00%) high severe

async/no-hook/host-to-wasm - typed - nop
                        time:   [4.9612 us 4.9743 us 4.9889 us]
                        change: [+9.9493% +11.911% +13.502%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe
async/no-hook/host-to-wasm - untyped - nop
                        time:   [5.0030 us 5.0211 us 5.0439 us]
                        change: [+10.841% +11.873% +12.977%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe
async/no-hook/host-to-wasm - typed - nop-params-and-results
                        time:   [4.9273 us 4.9468 us 4.9700 us]
                        change: [+4.7381% +6.8445% +8.8238%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) high mild
  9 (9.00%) high severe
async/no-hook/host-to-wasm - untyped - nop-params-and-results
                        time:   [5.1151 us 5.1338 us 5.1555 us]
                        change: [+9.5335% +11.290% +13.044%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe

async/hook-sync/host-to-wasm - typed - nop
                        time:   [4.9330 us 4.9394 us 4.9467 us]
                        change: [+10.046% +11.038% +12.035%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
async/hook-sync/host-to-wasm - untyped - nop
                        time:   [5.0073 us 5.0183 us 5.0310 us]
                        change: [+9.3828% +10.565% +11.752%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
async/hook-sync/host-to-wasm - typed - nop-params-and-results
                        time:   [4.9610 us 4.9839 us 5.0097 us]
                        change: [+9.0857% +11.513% +14.359%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
async/hook-sync/host-to-wasm - untyped - nop-params-and-results
                        time:   [5.0995 us 5.1272 us 5.1617 us]
                        change: [+9.3600% +11.506% +13.809%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe

async-pool/no-hook/host-to-wasm - typed - nop
                        time:   [2.4242 us 2.4316 us 2.4396 us]
                        change: [+7.8756% +8.8803% +9.8346%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
async-pool/no-hook/host-to-wasm - untyped - nop
                        time:   [2.5102 us 2.5155 us 2.5210 us]
                        change: [+12.130% +13.194% +14.270%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe
async-pool/no-hook/host-to-wasm - typed - nop-params-and-results
                        time:   [2.4203 us 2.4310 us 2.4440 us]
                        change: [+4.0380% +6.3623% +8.7534%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) high mild
  9 (9.00%) high severe
async-pool/no-hook/host-to-wasm - untyped - nop-params-and-results
                        time:   [2.5501 us 2.5593 us 2.5700 us]
                        change: [+8.8802% +10.976% +12.937%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  5 (5.00%) high mild
  11 (11.00%) high severe

async-pool/hook-sync/host-to-wasm - typed - nop
                        time:   [2.4135 us 2.4190 us 2.4254 us]
                        change: [+8.3640% +9.3774% +10.435%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe
async-pool/hook-sync/host-to-wasm - untyped - nop
                        time:   [2.5172 us 2.5248 us 2.5357 us]
                        change: [+11.543% +12.750% +13.982%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) high mild
  7 (7.00%) high severe
async-pool/hook-sync/host-to-wasm - typed - nop-params-and-results
                        time:   [2.4214 us 2.4353 us 2.4532 us]
                        change: [+1.5158% +5.0872% +8.6765%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) high mild
  13 (13.00%) high severe
async-pool/hook-sync/host-to-wasm - untyped - nop-params-and-results
                        time:   [2.5499 us 2.5607 us 2.5748 us]
                        change: [+10.146% +12.459% +14.919%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  3 (3.00%) high mild
  15 (15.00%) high severe

sync/no-hook/wasm-to-host - nop - typed
                        time:   [6.6135 ns 6.6288 ns 6.6452 ns]
                        change: [+37.927% +38.837% +39.869%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
sync/no-hook/wasm-to-host - nop-params-and-results - typed
                        time:   [15.930 ns 15.993 ns 16.067 ns]
                        change: [+3.9583% +5.6286% +7.2430%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  11 (11.00%) high mild
  1 (1.00%) high severe
sync/no-hook/wasm-to-host - nop - untyped
                        time:   [20.596 ns 20.640 ns 20.690 ns]
                        change: [+4.3293% +5.2047% +6.0935%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe
sync/no-hook/wasm-to-host - nop-params-and-results - untyped
                        time:   [42.659 ns 42.882 ns 43.159 ns]
                        change: [-2.1466% -0.5079% +1.2554%] (p = 0.58 > 0.05)
                        No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
  1 (1.00%) high mild
  14 (14.00%) high severe
sync/no-hook/wasm-to-host - nop - unchecked
                        time:   [10.671 ns 10.691 ns 10.713 ns]
                        change: [+83.911% +87.620% +92.062%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe
sync/no-hook/wasm-to-host - nop-params-and-results - unchecked
                        time:   [11.136 ns 11.190 ns 11.263 ns]
                        change: [-29.719% -28.446% -27.029%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe

sync/hook-sync/wasm-to-host - nop - typed
                        time:   [6.7964 ns 6.8087 ns 6.8226 ns]
                        change: [+21.531% +24.206% +27.331%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe
sync/hook-sync/wasm-to-host - nop-params-and-results - typed
                        time:   [15.865 ns 15.921 ns 15.985 ns]
                        change: [+4.8466% +6.3330% +7.8317%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe
sync/hook-sync/wasm-to-host - nop - untyped
                        time:   [21.505 ns 21.587 ns 21.677 ns]
                        change: [+8.0908% +9.1943% +10.254%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
sync/hook-sync/wasm-to-host - nop-params-and-results - untyped
                        time:   [44.018 ns 44.128 ns 44.261 ns]
                        change: [-1.4671% -0.0458% +1.2443%] (p = 0.94 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) high mild
  9 (9.00%) high severe
sync/hook-sync/wasm-to-host - nop - unchecked
                        time:   [11.264 ns 11.326 ns 11.387 ns]
                        change: [+80.225% +81.659% +83.068%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
sync/hook-sync/wasm-to-host - nop-params-and-results - unchecked
                        time:   [11.816 ns 11.865 ns 11.920 ns]
                        change: [-29.152% -28.040% -26.957%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  8 (8.00%) high mild
  6 (6.00%) high severe

async/no-hook/wasm-to-host - nop - typed
                        time:   [6.6221 ns 6.6385 ns 6.6569 ns]
                        change: [+43.618% +44.755% +45.965%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) high mild
  7 (7.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - typed
                        time:   [15.884 ns 15.929 ns 15.983 ns]
                        change: [+3.5987% +5.2053% +6.7846%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe
async/no-hook/wasm-to-host - nop - untyped
                        time:   [20.615 ns 20.702 ns 20.821 ns]
                        change: [+6.9799% +8.1212% +9.2819%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) high mild
  8 (8.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - untyped
                        time:   [41.956 ns 42.207 ns 42.521 ns]
                        change: [-4.3057% -2.7730% -1.2428%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) high mild
  11 (11.00%) high severe
async/no-hook/wasm-to-host - nop - unchecked
                        time:   [10.440 ns 10.474 ns 10.513 ns]
                        change: [+83.959% +85.826% +87.541%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - unchecked
                        time:   [11.476 ns 11.512 ns 11.554 ns]
                        change: [-29.857% -28.383% -26.978%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  5 (5.00%) high severe
async/no-hook/wasm-to-host - nop - async-typed
                        time:   [26.427 ns 26.478 ns 26.532 ns]
                        change: [+6.5730% +7.4676% +8.3983%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - async-typed
                        time:   [28.557 ns 28.693 ns 28.880 ns]
                        change: [+1.9099% +3.7332% +5.9731%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  1 (1.00%) high mild
  14 (14.00%) high severe

async/hook-sync/wasm-to-host - nop - typed
                        time:   [6.7488 ns 6.7630 ns 6.7784 ns]
                        change: [+19.935% +22.080% +23.683%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - typed
                        time:   [15.928 ns 16.031 ns 16.149 ns]
                        change: [+5.5188% +6.9567% +8.3839%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  9 (9.00%) high mild
  2 (2.00%) high severe
async/hook-sync/wasm-to-host - nop - untyped
                        time:   [21.930 ns 22.114 ns 22.296 ns]
                        change: [+4.6674% +7.7588% +10.375%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - untyped
                        time:   [42.684 ns 42.858 ns 43.081 ns]
                        change: [-5.2957% -3.4693% -1.6217%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  2 (2.00%) high mild
  12 (12.00%) high severe
async/hook-sync/wasm-to-host - nop - unchecked
                        time:   [11.026 ns 11.053 ns 11.086 ns]
                        change: [+70.751% +72.378% +73.961%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - unchecked
                        time:   [11.840 ns 11.900 ns 11.982 ns]
                        change: [-27.977% -26.584% -24.887%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  3 (3.00%) high mild
  15 (15.00%) high severe
async/hook-sync/wasm-to-host - nop - async-typed
                        time:   [27.601 ns 27.709 ns 27.882 ns]
                        change: [+8.1781% +9.1102% +10.030%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  6 (6.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - async-typed
                        time:   [28.955 ns 29.174 ns 29.413 ns]
                        change: [+1.1226% +3.0366% +5.1126%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe

async-pool/no-hook/wasm-to-host - nop - typed
                        time:   [6.5626 ns 6.5733 ns 6.5851 ns]
                        change: [+40.561% +42.307% +44.514%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - typed
                        time:   [15.820 ns 15.886 ns 15.969 ns]
                        change: [+4.1044% +5.7928% +7.7122%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
  4 (4.00%) high mild
  13 (13.00%) high severe
async-pool/no-hook/wasm-to-host - nop - untyped
                        time:   [20.481 ns 20.521 ns 20.566 ns]
                        change: [+6.7962% +7.6950% +8.7612%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - untyped
                        time:   [41.834 ns 41.998 ns 42.189 ns]
                        change: [-3.8185% -2.2687% -0.7541%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe
async-pool/no-hook/wasm-to-host - nop - unchecked
                        time:   [10.353 ns 10.380 ns 10.414 ns]
                        change: [+82.042% +84.591% +87.205%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - unchecked
                        time:   [11.123 ns 11.168 ns 11.228 ns]
                        change: [-30.813% -29.285% -27.874%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  11 (11.00%) high mild
  1 (1.00%) high severe
async-pool/no-hook/wasm-to-host - nop - async-typed
                        time:   [27.442 ns 27.528 ns 27.638 ns]
                        change: [+7.5215% +9.9795% +12.266%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  3 (3.00%) high mild
  15 (15.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - async-typed
                        time:   [29.014 ns 29.148 ns 29.312 ns]
                        change: [+2.0227% +3.4722% +4.9047%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

async-pool/hook-sync/wasm-to-host - nop - typed
                        time:   [6.7916 ns 6.8116 ns 6.8325 ns]
                        change: [+20.937% +22.050% +23.281%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - typed
                        time:   [15.917 ns 15.975 ns 16.051 ns]
                        change: [+4.6404% +6.4217% +8.3075%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  5 (5.00%) high mild
  11 (11.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - untyped
                        time:   [21.558 ns 21.612 ns 21.679 ns]
                        change: [+8.1158% +9.1409% +10.217%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - untyped
                        time:   [42.475 ns 42.614 ns 42.775 ns]
                        change: [-6.3613% -4.4709% -2.7647%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  3 (3.00%) high mild
  15 (15.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - unchecked
                        time:   [11.150 ns 11.195 ns 11.247 ns]
                        change: [+74.424% +77.056% +79.811%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) high mild
  11 (11.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - unchecked
                        time:   [11.639 ns 11.695 ns 11.760 ns]
                        change: [-30.212% -29.023% -27.954%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  7 (7.00%) high mild
  8 (8.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - async-typed
                        time:   [27.480 ns 27.712 ns 27.984 ns]
                        change: [+2.9764% +6.5061% +9.8914%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - async-typed
                        time:   [29.218 ns 29.380 ns 29.600 ns]
                        change: [+5.2283% +7.7247% +10.822%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  2 (2.00%) high mild
  14 (14.00%) high severe
```

* Add s390x support for frame pointer-based stack walking

* wasmtime: Allow `Caller::get_export` to get all exports

* fuzzing: Add a fuzz target to check that our stack traces are correct

We generate Wasm modules that keep track of their own stack as they call and
return between functions, and then we periodically check that if the host
captures a backtrace, it matches what the Wasm module has recorded.

* Remove VM offsets for `VMHostFuncContext` since it isn't used by JIT code

* Add doc comment with stack walking implementation notes

* Document the extra state that can be passed to `wasmtime_runtime::Backtrace` methods

* Add extensive comments for stack walking function

* Factor architecture-specific bits of stack walking out into modules

* Initialize store-related fields in a vmctx to null when there is no store yet

Rather than leaving them as uninitialized data.

* Use `set_callee` instead of manually setting the vmctx field

* Use a more informative compile error message for unsupported architectures

* Document unsafety of `prepare_host_to_wasm_trampoline`

* Use `bti c` instead of `hint #34` in inline aarch64 assembly

* Remove outdated TODO comment

* Remove setting of `last_wasm_exit_fp` in `set_jit_trap`

This is no longer needed as the value is plumbed through to the backtrace code
directly now.

* Only set the stack limit once, in the face of re-entrancy into Wasm

* Add comments for s390x-specific stack walking bits

* Use the helper macro for all libcalls

If we forget to use it, and then trigger a GC from the libcall, that means we
could miss stack frames when walking the stack, fail to find live GC refs, and
then get use after free bugs. Much less risky to always use the helper macro
that takes care of all of that for us.

* Use the `asm_sym!` macro in Wasm-to-libcall trampolines

This macro handles the macOS-specific underscore prefix stuff for us.

* wasmtime: add size and align to `externref` assertion error message

* Extend the `stacks` fuzzer to have host frames in between Wasm frames

This way we get one or more contiguous sequences of Wasm frames on the stack,
instead of exactly one.

* Add documentation for aarch64-specific backtrace helpers

* Clarify that we only support little-endian aarch64 in trampoline comment

* Use `.machine z13` in s390x assembly file

Since apparently our CI machines have pretty old assemblers that don't have
`.machine z14`. This should be fine though since these trampolines don't make
use of anything that is introduced in z14.

* Fix aarch64 build

* Fix macOS build

* Document the `asm_sym!` macro

* Add windows support to the `wasmtime-asm-macros` crate

* Add windows support to host<--->Wasm trampolines

* Fix trap handler build on windows

* Run `rustfmt` on s390x trampoline source file

* Temporarily disable some assertions about a trap's backtrace in the component model tests

Follow up to re-enable this and fix the associated issue:
https://github.com/bytecodealliance/wasmtime/issues/4535

* Refactor libcall definitions with less macros

This refactors the `libcall!` macro to use the
`foreach_builtin_function!` macro to define all of the trampolines.
Additionally the macro surrounding each libcall itself is no longer
necessary, which helps avoid having too many macros.

* Use `VMOpaqueContext::from_vm_host_func_context` in `VMHostFuncContext::new`

* Move `backtrace` module to be submodule of `traphandlers`

This avoids making some things `pub(crate)` in `traphandlers` that really
shouldn't be.

* Fix macOS aarch64 build

* Use "i64" instead of "word" in aarch64-specific file

* Save/restore entry SP and exit FP/return pointer in the face of panicking imported host functions

Also clean up assertions surrounding our saved entry/exit registers.

* Put "typed" vs "untyped" in the same position of call benchmark names

Regardless if we are doing wasm-to-host or host-to-wasm

* Fix stacks test case generator build for new `wasm-encoder`

* Fix build for s390x

* Expand libcalls in s390x asm

* Disable more parts of component tests now that backtrace assertions are a bit tighter

* Remove assertion that can maybe fail on s390x

Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-07-28 15:46:14 -07:00
Andrew Brown
2b52f47b83 Add shared memories (#4187)
* Add shared memories

This change adds the ability to use shared memories in Wasmtime when the
[threads proposal] is enabled. Shared memories are annotated as `shared`
in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are
protected from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md

In order to implement this in Wasmtime, there are two main cases to
cover:
    - a program may simply create a shared memory and possibly export it;
    this means that Wasmtime itself must be able to create shared
    memories
    - a user may create a shared memory externally and pass it in as an
    import during instantiation; this is the case when the program
    contains code like `(import "env" "memory" (memory 1 1
    shared))`--this case is handled by a new Wasmtime API
    type--`SharedMemory`

Because of the first case, this change allows any of the current
memory-creation mechanisms to work as-is. Wasmtime can still create
either static or dynamic memories in either on-demand or pooling modes,
and any of these memories can be considered shared. When shared, the
`Memory` runtime container will lock appropriately during `memory.size`
and `memory.grow` operations; since all memories use this container, it
is an ideal place for implementing the locking once and once only.

The second case is covered by the new `SharedMemory` structure. It uses
the same `Mmap` allocation under the hood as non-shared memories, but
allows the user to perform the allocation externally to Wasmtime and
share the memory across threads (via an `Arc`). The pointer address to
the actual memory is carefully wired through and owned by the
`SharedMemory` structure itself. This means that there are differing
views of where to access the pointer (i.e., `VMMemoryDefinition`): for
owned memories (the default), the `VMMemoryDefinition` is stored
directly by the `VMContext`; in the `SharedMemory` case, however, this
`VMContext` must point to this separate structure.

To ensure that the `VMContext` can always point to the correct
`VMMemoryDefinition`, this change alters the `VMContext` structure.
Since a `SharedMemory` owns its own `VMMemoryDefinition`, the
`defined_memories` table in the `VMContext` becomes a sequence of
pointers--in the shared memory case, they point to the
`VMMemoryDefinition` owned by the `SharedMemory` and in the owned memory
case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a
new table, `owned_memories`.

This change adds an additional indirection (through the `*mut
VMMemoryDefinition` pointer) that could add overhead. Using an imported
memory as a proxy, we measured a 1-3% overhead of this approach on the
`pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will
special-case the owned memory access (i.e., load a pointer directly to
the `owned_memories` entry) for `memory.size` so that only
shared memories (and imported memories, as before) incur the indirection
cost.
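
A minimal sketch of the second (import) case from an embedder's perspective, assuming a wasmtime build with threads support enabled; exact constructor signatures may differ slightly across versions:

```rust
use wasmtime::{Config, Engine, Instance, MemoryType, Module, SharedMemory, Store};

fn main() -> anyhow::Result<()> {
    // The threads proposal must be enabled for shared memories.
    let mut config = Config::new();
    config.wasm_threads(true);
    let engine = Engine::new(&config)?;

    // Create the shared memory externally and pass it in as an import
    // during instantiation, as described above.
    let memory = SharedMemory::new(&engine, MemoryType::shared(1, 1))?;
    let module = Module::new(
        &engine,
        r#"(module (import "env" "memory" (memory 1 1 shared)))"#,
    )?;

    let mut store = Store::new(&engine, ());
    // Keep one handle on the host side (it can be cloned and shared across
    // threads) and hand another to the instance as its imported memory.
    let _instance = Instance::new(&mut store, &module, &[memory.clone().into()])?;
    assert!(memory.size() >= 1);
    Ok(())
}
```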

* review: remove thread feature check

* review: swap wasmtime-types dependency for existing wasmtime-environ use

* review: remove unused VMMemoryUnion

* review: reword cross-engine error message

* review: improve tests

* review: refactor to separate prevent Memory <-> SharedMemory conversion

* review: into_shared_memory -> as_shared_memory

* review: remove commented out code

* review: limit shared min/max to 32 bits

* review: skip imported memories

* review: imported memories are not owned

* review: remove TODO

* review: document unsafe send + sync

* review: add limiter assertion

* review: remove TODO

* review: improve tests

* review: fix doc test

* fix: fixes based on discussion with Alex

This changes several key parts:
 - adds memory indexes to imports and exports
 - makes `VMMemoryDefinition::current_length` an atomic usize

* review: add `Extern::SharedMemory`

* review: remove TODO

* review: atomically load from VMMemoryDefinition in JIT-generated code

* review: add test probing the last available memory slot across threads

* fix: move assertion to new location due to rebase

* fix: doc link

* fix: add TODOs to c-api

* fix: broken doc link

* fix: modify pooling allocator messages in tests

* review: make owned_memory_index panic instead of returning an option

* review: clarify calculation of num_owned_memories

* review: move 'use' to top of file

* review: change '*const [u8]' to '*mut [u8]'

* review: remove TODO

* review: avoid hard-coding memory index

* review: remove 'preallocation' parameter from 'Memory::_new'

* fix: component model memory length

* review: check that shared memory plans are static

* review: ignore growth limits for shared memory

* review: improve atomic store comment

* review: add FIXME for memory growth failure

* review: add comment about absence of bounds-checked 'memory.size'

* review: make 'current_length()' doc comment more precise

* review: more comments related to memory.size non-determinism

* review: make 'vmmemory' unreachable for shared memory

* review: move code around

* review: thread plan through to 'wrap()'

* review: disallow shared memory allocation with the pooling allocator
2022-06-08 12:13:40 -05:00
Alex Crichton
9f5f978baa Fix double-counting imports in VMOffsets calculations (#4209)
* Fix double-counting imports in `VMOffsets` calculations

This fixes an oversight in the initial creation of `VMOffsets` for a
module to avoid double-counting imported globals, tables, and memories
when calculating the size of the `VMContext`. Prior to this PR imported
items were accidentally also counted as defined items in the sizing
calculations, meaning that when a memory is imported but not defined,
for example, the `VMContext` had space for an inline
`VMMemoryDefinition` that it didn't need.

Auditing the places this is relevant, it appears that the only issue
from this mistake is that `VMContext` is a bit larger than it would otherwise
need to be. Extra slots are uninitialized memory but nothing in Wasmtime
ever actually accesses the memory either, so it should be harmless to
have extra space here. Nevertheless it seems better to shrink the size
as much as possible to avoid wasting space where we can.
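
The shape of the sizing fix is roughly the following (an illustrative
helper, not the actual `VMOffsets` code):

    /// Only *defined* memories need an inline `VMMemoryDefinition` in the
    /// `VMContext`; imported ones are reached through pointers elsewhere.
    /// Previously the total count (imports included) was used here.
    fn defined_memories_size(
        total_memories: usize,
        num_imported_memories: usize,
        size_of_memory_definition: usize,
    ) -> usize {
        (total_memories - num_imported_memories) * size_of_memory_definition
    }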

* Fix tests
2022-06-02 13:39:38 -05:00
Alex Crichton
2a4851ad2b Change some VMContext pointers to () pointers (#4190)
* Change some `VMContext` pointers to `()` pointers

This commit is motivated by my work on the component model
implementation for imported functions. Currently all context pointers in
wasm are `*mut VMContext`, but with the component model my plan is to
instead make some of these pointers something along the lines of
`*mut VMComponentContext`. In doing this, though, one worry I have is
breaking what has otherwise been a core invariant of Wasmtime for quite
some time and subtly introducing bugs by accident.

To help assuage my worry I've opted here to erase knowledge of
`*mut VMContext` where possible. Instead where applicable a context
pointer is simply known as `*mut ()` and the embedder doesn't actually
know anything about this context beyond the value of the pointer. This
will help prevent Wasmtime from accidentally ever trying to interpret
this context pointer as an actual `VMContext` when it might instead be a
`VMComponentContext`.

Overall this was a pretty smooth transition. The main change here is
that the `VMTrampoline` (now sporting more docs) has its first argument
changed to `*mut ()`. The second argument, the caller context, is still
configured as `*mut VMContext` though because all functions are always
called from wasm still. Eventually for component-to-component calls I
think we'll probably "fake" the second argument as the same as the first
argument, losing track of the original caller, as an intentional way of
isolating components from each other.

Along the way there are a few host locations which do actually assume
that the first argument is indeed a `VMContext`. These are valid
assumptions that are upheld by a correct implementation, but I opted
to add a "magic" field to `VMContext` to assert this in debug mode. This
new "magic" field is initialized during normal `VMContext` initialization
and it's checked whenever a `VMContext` is reinterpreted as an
`Instance` (but only in debug mode). My hope here is to catch any future
accidental mistakes, if ever.
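
A sketch of that debug-mode check (the constant value, field layout, and
helper name are assumptions for illustration, not the actual
implementation):

    /// Arbitrary marker written into every `VMContext` at initialization.
    const VMCONTEXT_MAGIC: u32 = 0x636f_7265;

    #[repr(C)]
    pub struct VMContext {
        magic: u32,
        // ... the rest of the per-instance state follows ...
    }

    /// Reinterpret an opaque context pointer as a `*mut VMContext`,
    /// asserting in debug builds that it really is one and not, say, a
    /// `VMComponentContext`.
    pub unsafe fn cast_to_vmctx(opaque: *mut ()) -> *mut VMContext {
        let vmctx = opaque as *mut VMContext;
        debug_assert_eq!((*vmctx).magic, VMCONTEXT_MAGIC);
        vmctx
    }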

* Use a VMOpaqueContext wrapper

* Fix typos
2022-06-01 11:00:43 -05:00
Alex Crichton
3f3afb455e Remove support for userfaultfd (#4040)
This commit removes support for the `userfaultfd` or "uffd" syscall on
Linux. This support was originally added for users migrating from Lucet
to Wasmtime, but the recent developments of kernel-supported
copy-on-write support for memory initialization wound up being more
appropriate for these use cases than usefaultfd. The main reason for
moving to copy-on-write initialization are:

* The `userfaultfd` feature was never necessarily intended for this
  style of use case with wasm and was susceptible to subtle and rare
  bugs that were extremely difficult to track down. We were never 100%
  certain that there were kernel bugs related to userfaultfd but the
  suspicion never went away.

* Handling faults with userfaultfd was always slow and single-threaded.
  Only one thread could handle faults and traveling to user-space to
  handle faults is inherently slower than handling them all in the
  kernel. The single-threaded aspect in particular presented a
  significant scaling bottleneck for embeddings that want to run many
  wasm instances in parallel.

* One of the major benefits of userfaultfd was lazy initialization of
  wasm linear memory which is also achieved with the copy-on-write
  initialization support we have right now.

* One of the suspected benefits of userfaultfd was less frobbing of the
  kernel vma structures when wasm modules are instantiated. Currently
  the copy-on-write support has a mitigation where we attempt to reuse
  the memory images where possible to avoid changing vma structures.
  When comparing this to userfaultfd's performance it was found that
  kernel modifications of vmas aren't a worrisome bottleneck so
  copy-on-write is suitable for this as well.

Overall there are no remaining benefits that userfaultfd gives that
copy-on-write doesn't, and copy-on-write solves a major downside of
userfaultfd: the scaling issue with a single faulting thread.
Additionally copy-on-write support seems much more robust in terms of
kernel implementation since it's only using standard memory-management
syscalls which are heavily exercised. Finally copy-on-write support
provides a new bonus where read-only memory in WebAssembly can be mapped
directly to the same kernel cache page, even amongst many wasm instances
of the same module, which was never possible with userfaultfd.

In light of all this it's expected that all users of userfaultfd should
migrate to the copy-on-write initialization of Wasmtime (which is
enabled by default).
2022-04-18 12:42:26 -05:00
Alex Crichton
15bb0c6903 Remove the ModuleLimits pooling configuration structure (#3837)
* Remove the `ModuleLimits` pooling configuration structure

This commit is an attempt to improve the usability of the pooling
allocator by removing the need to configure a `ModuleLimits` structure.
Internally this structure has limits on all forms of wasm constructs but
this largely bottoms out in the size of an allocation for an instance in
the instance pooling allocator. Maintaining this list of limits can be
cumbersome as modules may get tweaked over time, and there's otherwise
no real reason to limit, say, the number of globals in a module. The
main goal is to limit the memory consumption of a `VMContext`, which can
be done with a single allocation-size limit rather than fine-tuned
control over each maximum and minimum.

The new approach taken in this commit is to remove `ModuleLimits`. Some
fields, such as `tables`, `table_elements`, `memories`, and
`memory_pages` are moved to `InstanceLimits` since they're still
enforced at runtime. A new field `size` is added to `InstanceLimits`
which indicates, in bytes, the maximum size of the `VMContext`
allocation. If the size of a `VMContext` for a module exceeds this value
then instantiation will fail.

This involved adding a few more checks to `{Table, Memory}::new_static`
to ensure that the minimum size is able to fit in the allocation, since
previously modules were validated at module compile time to ensure that
everything fit, and that validation no longer happens there (it now
happens at runtime).

A consequence of this commit is that Wasmtime will have no built-in way
to reject modules at compile time if they'll fail to be instantiated
within a particular pooling allocator configuration. Instead a module
must attempt instantiation to see if a failure happens.
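
For illustration, configuring the pooling allocator after this change
might look roughly like the following (this assumes `InstanceLimits` is
exposed from the `wasmtime` crate with the fields named above and that
the strategy is still selected via `InstanceAllocationStrategy::Pooling`;
the values are arbitrary):

    use wasmtime::{
        Config, InstanceAllocationStrategy, InstanceLimits, PoolingAllocationStrategy,
    };

    fn pooling_config() -> Config {
        let mut config = Config::new();
        config.allocation_strategy(InstanceAllocationStrategy::Pooling {
            strategy: PoolingAllocationStrategy::NextAvailable,
            instance_limits: InstanceLimits {
                count: 1000,            // instances resident in the pool
                memories: 1,            // memories per instance
                memory_pages: 160,      // 64 KiB pages per memory (10 MiB)
                tables: 1,              // tables per instance
                table_elements: 10_000, // elements per table
                size: 1 << 20,          // max bytes for one `VMContext` allocation
                ..Default::default()
            },
        });
        config
    }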

* Fix benchmark compiles

* Fix some doc links

* Fix a panic by ensuring modules have limited tables/memories

* Review comments

* Add back validation at `Module`-creation time that instantiation is possible

This allows for getting an early signal at compile time that a module
will never be instantiable in an engine with matching settings.

* Provide a better error message when sizes are exceeded

Improve the error message when an instance size exceeds the maximum by
providing a breakdown of where the bytes are all going and why the large
size is being requested.

* Try to fix test in qemu

* Flag new test as 64-bit only

Sizes are all specific to 64-bit right now
2022-02-25 09:11:51 -06:00
Peter Huene
ef17a36852 Port fix for CVE-2022-23636 to main. (#3818)
* Port fix for `CVE-2022-23636` to `main`.

This commit ports the fix for `CVE-2022-23636` to `main`, but performs a
refactoring that makes it unnecessary for the instance itself to track if it
has been initialized; such a change was not targeted enough for a security
patch.

The pooling allocator will now only initialize an instance if all of its
associated resource creation succeeds. If the resource creation fails, no
instance is dropped as none was initialized.

Also updates `RELEASES.md` to include the related patch releases.

* Add `Instance::new_at` to fully initialize an instance.

Added `Instance::new_at` to fully initialize an instance at a given address.

This will hopefully prevent the possibility that an `Instance` structure
doesn't have an initialized `VMContext` when it is dropped.
2022-02-16 17:51:14 -06:00
Peter Huene
1b27508a42 Fix incorrect use of MemoryIndex in the pooling allocator. (#3782)
This commit corrects a few places where `MemoryIndex` was used and treated like
a `DefinedMemoryIndex` in the pooling instance allocator.

When the unstable `multi-memory` proposal is enabled, it is possible to cause a
newly allocated instance to use an incorrect base address for any defined
memories by having the module being instantiated also import a memory.

This requires enabling the unstable `multi-memory` proposal, configuring the
use of the pooling instance allocator (not the default), and then configuring
the module limits to allow imported memories (also not the default).

The fix is to replace all uses of `MemoryIndex` with `DefinedMemoryIndex` in
the pooling instance allocator.

Several `debug_assert!` have also been updated to `assert!` to sanity check the
state of the pooling allocator even in release builds.
2022-02-09 09:39:29 -06:00
Alex Crichton
da5c82b786 Fix a possible use-after-free introduced in #3231 (#3238)
In #3231 the wasm data sections were moved from the
`wasmtime_environ::Module` structure into the `CompilationArtifacts`.
Each `wasmtime_runtime::Instance` holds raw pointers into the data
section owned by the compilation artifacts under the assumption that the
runtime keeps the artifacts alive while the module is in use. Data is
needed beyond original initialization for `memory.init` instructions as
well as lazy-initialization with the `uffd` feature.

The intention of #3231 was that all `CompiledModule` structures, which
own `CompilationArtifacts`, were owned by a store's `ModuleRegistry`, so
this was already taken care of. It turns out, however, that empty
modules which contain no functions are not held within a
`ModuleRegistry` since there was previously no need to retain them. This
commit remedies this mistake by retaining the `CompiledModule`
structure, even if there aren't any functions compiled in.

This should unblock #3235 and fixes the spurious error found there. The
test here, at least on Linux, will deterministically reproduce the error
before this commit since `uffd` was initializing wasm memory with freed
host memory.
2021-08-25 12:14:13 -05:00
Anton Kirilov
cb93726250 Enable more tests on AArch64 (#2994)
Copyright (c) 2021, Arm Limited.
2021-06-21 12:26:44 -05:00
Alex Crichton
7ce46043dc Add guard pages to the front of linear memories (#2977)
* Add guard pages to the front of linear memories

This commit implements a safety feature for Wasmtime to place guard
pages before the allocation of all linear memories. Guard pages placed
after linear memories are typically present for performance (at least)
because they can help elide bounds checks. Guard pages before a linear
memory, however, are never strictly needed for performance or features.
The intention of a preceding guard page is to help insulate against bugs
in Cranelift or other code generators, such as CVE-2021-32629.

This commit adds a `Config::guard_before_linear_memory` configuration
option, defaulting to `true`, which indicates whether guard pages should
be present both before linear memories as well as afterwards. Guard
regions continue to be controlled by
`{static,dynamic}_memory_guard_size` methods.

The implementation here affects both on-demand allocated memories as
well as the pooling allocator for memories. For on-demand memories this
adjusts the size of the allocation as well as adjusts the calculations
for the base pointer of the wasm memory. For the pooling allocator this
will place a single extra guard region at the very start of the
allocation for memories. Since linear memories in the pooling allocator
are contiguous, every memory already had a preceding guard region: the
trailing guard region of the previous memory. Only the first memory
needed this extra guard.

I've attempted to write some tests to help test all this, but this is
all somewhat tricky to test because the settings are pretty far away
from the actual behavior. I think, though, that the tests added here
should help cover various use cases and help us have confidence in
tweaking the various `Config` settings beyond their defaults.

Note that this also contains a semantic change where
`InstanceLimits::memory_reservation_size` has been removed. Instead this
field is now inferred from the `static_memory_maximum_size` and guard
size settings. This should hopefully remove some duplication in these
settings, canonicalizing on the guard-size/static-size settings as the
way to control memory sizes and virtual reservations.
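
For reference, the knobs involved here look roughly like this on
`Config` (the method names are the ones mentioned above; the values are
arbitrary):

    use wasmtime::Config;

    fn guarded_config() -> Config {
        let mut config = Config::new();
        // Guard pages before each linear memory as well as after it
        // (the default).
        config.guard_before_linear_memory(true);
        // Memories at or below this size use the "static" layout.
        config.static_memory_maximum_size(1 << 32); // 4 GiB
        // Guard region sizes; the per-memory virtual reservation is now
        // inferred from these plus the static maximum size.
        config.static_memory_guard_size(1 << 31); // 2 GiB
        config.dynamic_memory_guard_size(64 << 10); // 64 KiB
        config
    }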

* Update config docs

* Fix a typo

* Fix benchmark

* Fix wasmtime-runtime tests

* Fix some more tests

* Try to fix uffd failing test

* Review items

* Tweak 32-bit defaults

Makes the pooling allocator a bit more reasonable by default on 32-bit
with these settings.
2021-06-18 09:57:08 -05:00
Alex Crichton
7a1b7cdf92 Implement RFC 11: Redesigning Wasmtime's APIs (#2897)
Implement Wasmtime's new API as designed by RFC 11. This is quite a large commit which has had lots of discussion externally, so for more information it's best to read the RFC thread and the PR thread.
2021-06-03 09:10:53 -05:00
Alex Crichton
2697a18d2f Redo the statically typed Func API (#2719)
* Redo the statically typed `Func` API

This commit reimplements the `Func` API with respect to statically typed
dispatch. Previously `Func` had a `getN` and `getN_async` family of
methods which were implemented for 0 to 16 parameters. The return value
of these functions was an `impl Fn(..)` closure with the appropriate
parameters and return values.

There are a number of downsides with this approach that have become
apparent over time:

* The addition of `*_async` doubled the API surface area (which is quite
  large here due to one-method-per-number-of-parameters).
* The [documentation of `Func`][old-docs] is quite verbose and feels
  "polluted" with all these getters, making it harder to understand the
  other methods that can be used to interact with a `Func`.
* These methods unconditionally pay the cost of returning an owned `impl
  Fn` with a `'static` lifetime. While cheap, this still effectively
  pays the cost of cloning the `Store` and moving data into the
  closed-over environment.
* Storage of the return value into a struct, for example, always
  requires `Box`-ing the returned closure since it otherwise cannot be
  named.
* Recently I had the desire to implement an "unchecked" path for
  invoking wasm where you unsafely assert the type signature of a wasm
  function. Doing this with today's scheme would require doubling
  (again) the API surface area for both async and synchronous calls,
  further polluting the documentation.

The main benefit of the previous scheme is that by returning an `impl Fn`
it was quite easy and ergonomic to actually invoke the function. In
practice, though, examples would often have something akin to
`.get0::<()>()?()?` which is a lot of things to interpret all at once.
Note that `get0` means "0 parameters" yet a type parameter is passed.
There's also a double function invocation which looks like a lot of
characters all lined up in a row.

Overall, I think that the previous design is starting to show too many
cracks and deserves a rewrite. This commit is that rewrite.

The new design in this commit is to delete the `getN{,_async}` family of
functions and instead have a new API:

    impl Func {
        fn typed<P, R>(&self) -> Result<&Typed<P, R>>;
    }

    impl Typed<P, R> {
        fn call(&self, params: P) -> Result<R, Trap>;
        async fn call_async(&self, params: P) -> Result<R, Trap>;
    }

This should entirely replace the current scheme, albeit with a slight
loss of ergonomics in some use cases. The idea behind the API is that the
existence of `Typed<P, R>` is a "proof" that the underlying function
takes `P` and returns `R`. The `Func::typed` method performs a runtime
type-check to ensure that types all match up, and if successful you get
a `Typed` value. Otherwise an error is returned.

Once you have a `Typed` then, like `Func`, you can either `call` or
`call_async`. The difference with a `Typed`, however, is that the
params/results are statically known and hence these calls can be much
more efficient.

This is a much smaller API surface area from before and should greatly
simplify the `Func` documentation. There's still a problem where
`Func::wrapN_async` produces a lot of functions to document, but that's
now the sole offender. It's a nice benefit that the
statically-typed async versions are now expressed with an `async`
function rather than a function-returning-a-future which makes it both
more efficient and easier to understand.

The type `P` and `R` are intended to either be bare types (e.g. `i32`)
or tuples of any length (including 0). At this time `R` is only allowed
to be `()` or a bare `i32`-style type because multi-value is not
supported with a native ABI (yet). The `P`, however, can be any size of
tuples of parameters. This is also where some ergonomics are lost
because instead of `f(1, 2)` you now have to write `f.call((1, 2))`
(note the double-parens). Similarly `f()` becomes `f.call(())`.

Overall I feel that this is a better tradeoff than before. While not
universally better due to the loss in ergonomics I feel that this design
is much more flexible in terms of what you can do with the return value
and also understanding the API surface area (just less to take in).

[old-docs]: https://docs.rs/wasmtime/0.24.0/wasmtime/struct.Func.html#method.get0
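
As a usage sketch of the API above (the `add` export and its signature
are hypothetical, and `Typed` is renamed to `TypedFunc` in a follow-up
revision below):

    use wasmtime::Instance;

    fn call_add(instance: &Instance) -> anyhow::Result<i32> {
        let add = instance
            .get_func("add")
            .ok_or_else(|| anyhow::anyhow!("module should export `add`"))?;
        // Check the signature once up front...
        let add = add.typed::<(i32, i32), i32>()?;
        // ...then call with statically known parameter and result types.
        let sum = add.call((1, 2))?; // note the tuple: `call((1, 2))`
        Ok(sum)
    }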

* Rename Typed to TypedFunc

* Implement multi-value returns through `Func::typed`

* Fix examples in docs

* Fix some more errors

* More test fixes

* Rebasing and adding `get_typed_func`

* Updating tests

* Fix typo

* More doc tweaks

* Tweak visibility on `Func::invoke`

* Fix tests again
2021-03-11 14:43:34 -06:00
Peter Huene
54c07d8f16 Implement shared host functions. (#2625)
* Implement defining host functions at the Config level.

This commit introduces defining host functions at the `Config` level rather than
with a `Func` tied to a `Store`.

The intention here is to enable a host to define all of the functions once
with a `Config` and then use a `Linker` (or `Store::get_host_func`
directly) to use the functions when instantiating a module.

This should help improve the performance of use cases where a `Store` is
short-lived and redefining the functions at every module instantiation is a
noticeable performance hit.

This commit adds `add_to_config` to the code generation for Wasmtime's `Wasi`
type.

The new method adds the WASI functions to the given config as host functions.

This commit adds context functions to `Store`: `get` to get a context of a
particular type and `set` to set the context on the store.

For safety, `set` cannot replace an existing context value of the same type.

`Wasi::set_context` was added to set the WASI context for a `Store` when using
`Wasi::add_to_config`.

* Add `Config::define_host_func_async`.

* Make config "async" rather than store.

This commit moves the concept of "async-ness" to `Config` rather than `Store`.

Note: this is a breaking API change for anyone that's already adopted the new
async support in Wasmtime.

Now `Config::new_async` is used to create an "async" config and any `Store`
associated with that config is inherently "async".

This is needed for async shared host functions to have some sanity check during their
execution (async host functions, like "async" `Func`, need to be called with
the "async" variants).

* Update async function tests to smoke async shared host functions.

This commit updates the async function tests to also smoke the shared host
functions, plus `Func::wrap0_async`.

This also changes the "wrap async" method names on `Config` to
`wrap$N_host_func_async` to slightly better match what is on `Func`.

* Move the instance allocator into `Engine`.

This commit moves the instantiated instance allocator from `Config` into
`Engine`.

This makes certain settings in `Config` no longer order-dependent, which is how
`Config` should ideally be.

This also removes the confusing concept of the "default" instance allocator,
instead opting to construct the on-demand instance allocator when needed.

This does alter the semantics of the instance allocator as now each `Engine`
gets its own instance allocator rather than sharing a single one between all
engines created from a configuration.

* Make `Engine::new` return `Result`.

This is a breaking API change for anyone using `Engine::new`.

As creating the pooling instance allocator may fail (the likely cause being not
enough memory for the provided limits), `Engine::new` now returns a `Result`
instead of panicking when creating an `Engine`.
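
A small sketch of what this looks like for an embedder (the helper is
illustrative):

    use wasmtime::{Config, Engine};

    fn make_engine(config: &Config) -> anyhow::Result<Engine> {
        // `Engine::new` can now fail, e.g. if the pooling allocator cannot
        // reserve memory for the configured limits, so propagate the error
        // instead of panicking.
        Ok(Engine::new(config)?)
    }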

* Remove `Config::new_async`.

This commit removes `Config::new_async` in favor of treating "async support" as
any other setting on `Config`.

The setting is `Config::async_support`.

* Remove order dependency when defining async host functions in `Config`.

This commit removes the order dependency where async support must be enabled on
the `Config` prior to defining async host functions.

The check is now delayed to when an `Engine` is created from the config.

* Update WASI example to use shared `Wasi::add_to_config`.

This commit updates the WASI example to use `Wasi::add_to_config`.

As only a single store and instance are used in the example, it has no semantic
difference from the previous example, but the intention is to steer users
towards defining WASI on the config and only using `Wasi::add_to_linker` when
more explicit scoping of the WASI context is required.
2021-03-11 10:14:03 -06:00
Peter Huene
a464465e2f Code review feedback changes.
* Add `anyhow` dependency to `wasmtime-runtime`.
* Revert `get_data` back to `fn`.
* Remove `DataInitializer` and box the data in `Module` translation instead.
* Improve comments on `MemoryInitialization`.
* Remove `MemoryInitialization::OutOfBounds` in favor of proper bulk memory
  semantics.
* Use segmented memory initialization except when the uffd feature is
  enabled on Linux.
* Validate modules with the allocator after translation.
* Updated various functions in the runtime to return `anyhow::Result`.
* Use a slice when copying pages instead of `ptr::copy_nonoverlapping`.
* Remove unnecessary casts in `OnDemandAllocator::deallocate`.
* Better document the `uffd` feature.
* Use WebAssembly page-sized pages in the paged initialization.
* Remove the stack pool from the uffd handler and simply protect the guard
  pages.
2021-03-04 18:19:46 -08:00
Peter Huene
505437e353 Code cleanup.
Last-minute code cleanup to fix some comments and rename `address_space_size`
to `memory_reservation_size` to better describe what the option is doing.
2021-03-04 18:19:46 -08:00
Peter Huene
e71ccbf9bc Implement the pooling instance allocator.
This commit implements the pooling instance allocator.

The allocation strategy can be set with `Config::with_allocation_strategy`.

The pooling strategy uses the pooling instance allocator to preallocate a
contiguous region of memory for instantiating modules that adhere to various
limits.

The intention of the pooling instance allocator is to reserve, ahead of time, as
much of the host address space needed for instantiating modules as possible and
to reuse committed memory pages wherever possible.
2021-03-04 18:18:51 -08:00