* Always preserve frame pointers in Wasmtime
This allows us to efficiently and simply capture Wasm stacks without maintaining
and synchronizing any safety-critical side tables between the compiler and the
runtime.
* wasmtime: Implement fast Wasm stack walking
Why do we want Wasm stack walking to be fast? Because we capture stacks whenever
there is a trap and traps actually happen fairly frequently with short-lived
programs and WASI's `exit`.
Previously, we would rely on generating the system unwind info (e.g.
`.eh_frame`) and using the system unwinder (via the `backtrace` crate) to walk
the full stack and filter out any non-Wasm stack frames. This can,
unfortunately, be slow for two primary reasons:
1. The system unwinder is doing `O(all-kinds-of-frames)` work rather than
`O(wasm-frames)` work.
2. System unwind info and the system unwinder need to be much more general than
a purpose-built stack walker for Wasm needs to be. They have to handle any kind of
stack frame that any compiler might emit, whereas our Wasm frames are emitted by
Cranelift and always have frame pointers. This translates into implementation
complexity and general overhead. There can also be unnecessary-for-our-use-cases
global synchronization and locks involved, further slowing down stack walking in
the presence of multiple threads trying to capture stacks in parallel.
This commit introduces a purpose-built stack walker for traversing just our Wasm
frames. To find all the sequences of Wasm-to-Wasm stack frames, and ignore
non-Wasm stack frames, we keep a linked list of `(entry stack pointer, exit
frame pointer)` pairs. This linked list is maintained via Wasm-to-host and
host-to-Wasm trampolines. Within a sequence of Wasm-to-Wasm calls, we can use
frame pointers (which Cranelift preserves) to find the next older Wasm frame on
the stack, and we keep doing this until we reach the entry stack pointer,
meaning that the next older frame will be a host frame.
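
The walk described above can be sketched over a simulated stack. This is only an illustration, not Wasmtime's actual code: the `WasmSequence` type, the `walk_wasm_frames` function, and the memory layout (saved frame pointer at `stack[fp]`, return address at `stack[fp + 1]`) are assumptions made for this example.

```rust
/// One contiguous run of Wasm-to-Wasm frames, delimited by the two values the
/// trampolines record. Hypothetical type, for illustration only.
#[derive(Clone, Copy)]
struct WasmSequence {
    /// Stack pointer at the moment the host entered Wasm.
    entry_sp: usize,
    /// Frame pointer of the youngest Wasm frame when Wasm exited to the host.
    exit_fp: usize,
}

/// Collects the return address of every Wasm frame, youngest first.
///
/// `stack` simulates memory: `stack[fp]` holds the next older frame pointer
/// and `stack[fp + 1]` holds that frame's return address. `sequences` stands
/// in for the linked list, ordered from youngest to oldest.
fn walk_wasm_frames(stack: &[usize], sequences: &[WasmSequence]) -> Vec<usize> {
    let mut return_addrs = Vec::new();
    for seq in sequences {
        let mut fp = seq.exit_fp;
        // The stack grows down, so older frames live at higher addresses.
        // Once we reach the entry stack pointer, the next frame is a host frame.
        while fp < seq.entry_sp {
            return_addrs.push(stack[fp + 1]);
            let older_fp = stack[fp];
            assert!(older_fp > fp, "frame pointers must move toward older frames");
            fp = older_fp;
        }
    }
    return_addrs
}
```

The work done is proportional to the number of Wasm frames only; host frames between two sequences are never visited.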
The trampolines need to avoid a couple of stumbling blocks. First, they need to be
compiled ahead of time, since we may not have access to a compiler at
runtime (e.g. if the `cranelift` feature is disabled) but still want to be able
to call functions that have already been compiled and get stack traces for those
functions. Usually this means we would compile the appropriate trampolines
inside `Module::new` and the compiled module object would hold the
trampolines. However, we *also* need to support calling host functions that are
wrapped into `wasmtime::Func`s and there doesn't exist *any* ahead-of-time
compiled module object to hold the appropriate trampolines:
```rust
// Define a host function.
let func_type = wasmtime::FuncType::new(
    vec![wasmtime::ValType::I32],
    vec![wasmtime::ValType::I32],
);
let func = wasmtime::Func::new(&mut store, func_type, |_, params, results| {
    // ...
    Ok(())
});
// Call that host function. Note that `call` needs access to the store.
let mut results = vec![wasmtime::Val::I32(0)];
func.call(&mut store, &[wasmtime::Val::I32(0)], &mut results)?;
```
Therefore, we define one host-to-Wasm trampoline and one Wasm-to-host trampoline
in assembly that work for all Wasm and host function signatures. These
trampolines are careful to only use volatile registers, avoid touching any
register that is an argument in the calling convention ABI, and tail call to the
target callee function. This allows forwarding any set of arguments and any
returns to and from the callee, while also allowing us to maintain our linked
list of Wasm stack and frame pointers before transferring control to the
callee. These trampolines are not used in Wasm-to-Wasm calls, only when crossing
the host-Wasm boundary, so they do not impose overhead on regular calls. (And if
using one trampoline for all host-Wasm boundary crossing ever breaks branch
prediction enough in the CPU to become any kind of bottleneck, we can do fun
things like have multiple copies of the same trampoline and choose a random copy
for each function, sharding the functions across branch predictor entries.)
Finally, this commit also ends the use of a synthetic `Module` and allocating a
stubbed out `VMContext` for host functions. Instead, we define a
`VMHostFuncContext` with its own magic value, similar to `VMComponentContext`,
specifically for host functions.
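
As a rough sketch of the magic-value idea, assume each context type starts with a `u32` tag that lets code holding an untyped pointer tell the kinds apart. The constants, struct layouts, and the `kind_of` helper below are hypothetical and are not Wasmtime's real definitions.

```rust
// Hypothetical magic values; the real constants in Wasmtime may differ.
const CORE_MAGIC: u32 = u32::from_le_bytes(*b"core");
const HOST_FUNC_MAGIC: u32 = u32::from_le_bytes(*b"host");

/// Stand-in for a core Wasm context. `#[repr(C)]` pins `magic` at offset 0.
#[repr(C)]
struct CoreContext {
    magic: u32,
    instance_id: u32,
}

/// Stand-in for a host function context with its own magic value.
#[repr(C)]
struct HostFuncContext {
    magic: u32,
    host_state: u64,
}

#[derive(Debug, PartialEq)]
enum Kind {
    Core,
    HostFunc,
    Unknown,
}

/// Classifies an untyped context pointer by reading its leading magic value.
///
/// Safety: `ptr` must point to a live `#[repr(C)]` struct whose first field
/// is a `u32`, and must be aligned for `u32`.
unsafe fn kind_of(ptr: *const u8) -> Kind {
    match unsafe { ptr.cast::<u32>().read() } {
        CORE_MAGIC => Kind::Core,
        HOST_FUNC_MAGIC => Kind::HostFunc,
        _ => Kind::Unknown,
    }
}
```

With a tag like this, a host function no longer needs a stubbed-out core context just to look like one; a checked downcast can assert on the magic value instead.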
<h2>Benchmarks</h2>
<h3>Traps and Stack Traces</h3>
Large improvements to taking stack traces on traps, ranging from shaving off 64%
to 99.95% of the time it used to take.
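
When reading the results, it may help to see how the `time` and `thrpt` changes relate: throughput is the inverse of time per iteration, so a modest-looking time reduction implies a much larger throughput gain. The helper below is a hypothetical illustration, not part of the benchmark harness.

```rust
/// Converts a relative change in time per iteration (in percent) into the
/// implied relative change in throughput (in percent).
fn throughput_change_pct(time_change_pct: f64) -> f64 {
    // New time as a fraction of old time, e.g. -85% becomes 0.15.
    let time_ratio = 1.0 + time_change_pct / 100.0;
    // Throughput scales with the inverse of that ratio.
    (1.0 / time_ratio - 1.0) * 100.0
}
```

For example, the reported time change of -85.153% for `multi-threaded-traps/0` corresponds to a throughput change close to the reported +573.56%.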
<details>
```
multi-threaded-traps/0 time: [2.5686 us 2.5808 us 2.5934 us]
thrpt: [0.0000 elem/s 0.0000 elem/s 0.0000 elem/s]
change:
time: [-85.419% -85.153% -84.869%] (p = 0.00 < 0.05)
thrpt: [+560.90% +573.56% +585.84%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
multi-threaded-traps/1 time: [2.9021 us 2.9167 us 2.9322 us]
thrpt: [341.04 Kelem/s 342.86 Kelem/s 344.58 Kelem/s]
change:
time: [-91.455% -91.294% -91.096%] (p = 0.00 < 0.05)
thrpt: [+1023.1% +1048.6% +1070.3%]
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
multi-threaded-traps/2 time: [2.9996 us 3.0145 us 3.0295 us]
thrpt: [660.18 Kelem/s 663.47 Kelem/s 666.76 Kelem/s]
change:
time: [-94.040% -93.910% -93.762%] (p = 0.00 < 0.05)
thrpt: [+1503.1% +1542.0% +1578.0%]
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high severe
multi-threaded-traps/4 time: [5.5768 us 5.6052 us 5.6364 us]
thrpt: [709.68 Kelem/s 713.63 Kelem/s 717.25 Kelem/s]
change:
time: [-93.193% -93.121% -93.052%] (p = 0.00 < 0.05)
thrpt: [+1339.2% +1353.6% +1369.1%]
Performance has improved.
multi-threaded-traps/8 time: [8.6408 us 9.1212 us 9.5438 us]
thrpt: [838.24 Kelem/s 877.08 Kelem/s 925.84 Kelem/s]
change:
time: [-94.754% -94.473% -94.202%] (p = 0.00 < 0.05)
thrpt: [+1624.7% +1709.2% +1806.1%]
Performance has improved.
multi-threaded-traps/16 time: [10.152 us 10.840 us 11.545 us]
thrpt: [1.3858 Melem/s 1.4760 Melem/s 1.5761 Melem/s]
change:
time: [-97.042% -96.823% -96.577%] (p = 0.00 < 0.05)
thrpt: [+2821.5% +3048.1% +3281.1%]
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
many-modules-registered-traps/1
time: [2.6278 us 2.6361 us 2.6447 us]
thrpt: [378.11 Kelem/s 379.35 Kelem/s 380.55 Kelem/s]
change:
time: [-85.311% -85.108% -84.909%] (p = 0.00 < 0.05)
thrpt: [+562.65% +571.51% +580.76%]
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) high mild
6 (6.00%) high severe
many-modules-registered-traps/8
time: [2.6294 us 2.6460 us 2.6623 us]
thrpt: [3.0049 Melem/s 3.0235 Melem/s 3.0425 Melem/s]
change:
time: [-85.895% -85.485% -85.022%] (p = 0.00 < 0.05)
thrpt: [+567.63% +588.95% +608.95%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
many-modules-registered-traps/64
time: [2.6218 us 2.6329 us 2.6452 us]
thrpt: [24.195 Melem/s 24.308 Melem/s 24.411 Melem/s]
change:
time: [-93.629% -93.551% -93.470%] (p = 0.00 < 0.05)
thrpt: [+1431.4% +1450.6% +1469.5%]
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
many-modules-registered-traps/512
time: [2.6569 us 2.6737 us 2.6923 us]
thrpt: [190.17 Melem/s 191.50 Melem/s 192.71 Melem/s]
change:
time: [-99.277% -99.268% -99.260%] (p = 0.00 < 0.05)
thrpt: [+13417% +13566% +13731%]
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
many-modules-registered-traps/4096
time: [2.7258 us 2.7390 us 2.7535 us]
thrpt: [1.4876 Gelem/s 1.4955 Gelem/s 1.5027 Gelem/s]
change:
time: [-99.956% -99.955% -99.955%] (p = 0.00 < 0.05)
thrpt: [+221417% +223380% +224881%]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
many-stack-frames-traps/1
time: [1.4658 us 1.4719 us 1.4784 us]
thrpt: [676.39 Kelem/s 679.38 Kelem/s 682.21 Kelem/s]
change:
time: [-90.368% -89.947% -89.586%] (p = 0.00 < 0.05)
thrpt: [+860.23% +894.72% +938.21%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
many-stack-frames-traps/8
time: [2.4772 us 2.4870 us 2.4973 us]
thrpt: [3.2034 Melem/s 3.2167 Melem/s 3.2294 Melem/s]
change:
time: [-85.550% -85.370% -85.199%] (p = 0.00 < 0.05)
thrpt: [+575.65% +583.51% +592.03%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
many-stack-frames-traps/64
time: [10.109 us 10.171 us 10.236 us]
thrpt: [6.2525 Melem/s 6.2925 Melem/s 6.3309 Melem/s]
change:
time: [-78.144% -77.797% -77.336%] (p = 0.00 < 0.05)
thrpt: [+341.22% +350.38% +357.55%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe
many-stack-frames-traps/512
time: [126.16 us 126.54 us 126.96 us]
thrpt: [4.0329 Melem/s 4.0461 Melem/s 4.0583 Melem/s]
change:
time: [-65.364% -64.933% -64.453%] (p = 0.00 < 0.05)
thrpt: [+181.32% +185.17% +188.71%]
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high severe
```
</details>
<h3>Calls</h3>
There is, however, a small regression in raw Wasm-to-host and host-to-Wasm call
performance due to the new trampolines. It seems to be on the order of about 2-10
nanoseconds per call, depending on the benchmark.
I believe this regression is ultimately acceptable because:
1. this overhead will be vastly dominated by whatever work a non-nop callee
actually does,
2. we will need these trampolines, or something like them, when implementing the
Wasm exceptions proposal to do things like translate Wasm's exceptions into
Rust's `Result`s,
3. and because the performance improvements to trapping and capturing stack
traces are of a much larger magnitude than these call regressions.
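
To relate the percentage regressions to the absolute per-call figure quoted above, the added cost can be recovered from a benchmark's new time and its reported relative change. The function below is a hypothetical helper for that arithmetic only.

```rust
/// Estimates the absolute overhead (in nanoseconds) added to one call, given
/// the newly measured time and the reported relative change in percent.
fn added_overhead_ns(new_time_ns: f64, change_pct: f64) -> f64 {
    // Undo the relative change to recover the previous measurement.
    let old_time_ns = new_time_ns / (1.0 + change_pct / 100.0);
    new_time_ns - old_time_ns
}
```

For `sync/no-hook/host-to-wasm - typed - nop` (28.757 ns, +17.183%), this works out to about 4.2 ns per call.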
<details>
```
sync/no-hook/host-to-wasm - typed - nop
time: [28.683 ns 28.757 ns 28.844 ns]
change: [+16.472% +17.183% +17.904%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
sync/no-hook/host-to-wasm - untyped - nop
time: [42.515 ns 42.652 ns 42.841 ns]
change: [+12.371% +14.614% +17.462%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) high mild
10 (10.00%) high severe
sync/no-hook/host-to-wasm - unchecked - nop
time: [33.936 ns 34.052 ns 34.179 ns]
change: [+25.478% +26.938% +28.369%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) high mild
2 (2.00%) high severe
sync/no-hook/host-to-wasm - typed - nop-params-and-results
time: [34.290 ns 34.388 ns 34.502 ns]
change: [+40.802% +42.706% +44.526%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
5 (5.00%) high mild
8 (8.00%) high severe
sync/no-hook/host-to-wasm - untyped - nop-params-and-results
time: [62.546 ns 62.721 ns 62.919 ns]
change: [+2.5014% +3.6319% +4.8078%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) high mild
10 (10.00%) high severe
sync/no-hook/host-to-wasm - unchecked - nop-params-and-results
time: [42.609 ns 42.710 ns 42.831 ns]
change: [+20.966% +22.282% +23.475%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) high mild
7 (7.00%) high severe
sync/hook-sync/host-to-wasm - typed - nop
time: [29.546 ns 29.675 ns 29.818 ns]
change: [+20.693% +21.794% +22.836%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
sync/hook-sync/host-to-wasm - untyped - nop
time: [45.448 ns 45.699 ns 45.961 ns]
change: [+17.204% +18.514% +19.590%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) high mild
10 (10.00%) high severe
sync/hook-sync/host-to-wasm - unchecked - nop
time: [34.334 ns 34.437 ns 34.558 ns]
change: [+23.225% +24.477% +25.886%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
5 (5.00%) high mild
7 (7.00%) high severe
sync/hook-sync/host-to-wasm - typed - nop-params-and-results
time: [36.594 ns 36.763 ns 36.974 ns]
change: [+41.967% +47.261% +52.086%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) high mild
9 (9.00%) high severe
sync/hook-sync/host-to-wasm - untyped - nop-params-and-results
time: [63.541 ns 63.831 ns 64.194 ns]
change: [-4.4337% -0.6855% +2.7134%] (p = 0.73 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
6 (6.00%) high mild
2 (2.00%) high severe
sync/hook-sync/host-to-wasm - unchecked - nop-params-and-results
time: [43.968 ns 44.169 ns 44.437 ns]
change: [+18.772% +21.802% +24.623%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) high mild
12 (12.00%) high severe
async/no-hook/host-to-wasm - typed - nop
time: [4.9612 us 4.9743 us 4.9889 us]
change: [+9.9493% +11.911% +13.502%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) high mild
4 (4.00%) high severe
async/no-hook/host-to-wasm - untyped - nop
time: [5.0030 us 5.0211 us 5.0439 us]
change: [+10.841% +11.873% +12.977%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) high mild
7 (7.00%) high severe
async/no-hook/host-to-wasm - typed - nop-params-and-results
time: [4.9273 us 4.9468 us 4.9700 us]
change: [+4.7381% +6.8445% +8.8238%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) high mild
9 (9.00%) high severe
async/no-hook/host-to-wasm - untyped - nop-params-and-results
time: [5.1151 us 5.1338 us 5.1555 us]
change: [+9.5335% +11.290% +13.044%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) high mild
13 (13.00%) high severe
async/hook-sync/host-to-wasm - typed - nop
time: [4.9330 us 4.9394 us 4.9467 us]
change: [+10.046% +11.038% +12.035%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
5 (5.00%) high mild
7 (7.00%) high severe
async/hook-sync/host-to-wasm - untyped - nop
time: [5.0073 us 5.0183 us 5.0310 us]
change: [+9.3828% +10.565% +11.752%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
async/hook-sync/host-to-wasm - typed - nop-params-and-results
time: [4.9610 us 4.9839 us 5.0097 us]
change: [+9.0857% +11.513% +14.359%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
7 (7.00%) high mild
6 (6.00%) high severe
async/hook-sync/host-to-wasm - untyped - nop-params-and-results
time: [5.0995 us 5.1272 us 5.1617 us]
change: [+9.3600% +11.506% +13.809%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) high mild
4 (4.00%) high severe
async-pool/no-hook/host-to-wasm - typed - nop
time: [2.4242 us 2.4316 us 2.4396 us]
change: [+7.8756% +8.8803% +9.8346%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
async-pool/no-hook/host-to-wasm - untyped - nop
time: [2.5102 us 2.5155 us 2.5210 us]
change: [+12.130% +13.194% +14.270%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
4 (4.00%) high mild
8 (8.00%) high severe
async-pool/no-hook/host-to-wasm - typed - nop-params-and-results
time: [2.4203 us 2.4310 us 2.4440 us]
change: [+4.0380% +6.3623% +8.7534%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) high mild
9 (9.00%) high severe
async-pool/no-hook/host-to-wasm - untyped - nop-params-and-results
time: [2.5501 us 2.5593 us 2.5700 us]
change: [+8.8802% +10.976% +12.937%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
5 (5.00%) high mild
11 (11.00%) high severe
async-pool/hook-sync/host-to-wasm - typed - nop
time: [2.4135 us 2.4190 us 2.4254 us]
change: [+8.3640% +9.3774% +10.435%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
6 (6.00%) high mild
5 (5.00%) high severe
async-pool/hook-sync/host-to-wasm - untyped - nop
time: [2.5172 us 2.5248 us 2.5357 us]
change: [+11.543% +12.750% +13.982%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) high mild
7 (7.00%) high severe
async-pool/hook-sync/host-to-wasm - typed - nop-params-and-results
time: [2.4214 us 2.4353 us 2.4532 us]
change: [+1.5158% +5.0872% +8.6765%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
2 (2.00%) high mild
13 (13.00%) high severe
async-pool/hook-sync/host-to-wasm - untyped - nop-params-and-results
time: [2.5499 us 2.5607 us 2.5748 us]
change: [+10.146% +12.459% +14.919%] (p = 0.00 < 0.05)
Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
3 (3.00%) high mild
15 (15.00%) high severe
sync/no-hook/wasm-to-host - nop - typed
time: [6.6135 ns 6.6288 ns 6.6452 ns]
change: [+37.927% +38.837% +39.869%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
sync/no-hook/wasm-to-host - nop-params-and-results - typed
time: [15.930 ns 15.993 ns 16.067 ns]
change: [+3.9583% +5.6286% +7.2430%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
11 (11.00%) high mild
1 (1.00%) high severe
sync/no-hook/wasm-to-host - nop - untyped
time: [20.596 ns 20.640 ns 20.690 ns]
change: [+4.3293% +5.2047% +6.0935%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
5 (5.00%) high mild
5 (5.00%) high severe
sync/no-hook/wasm-to-host - nop-params-and-results - untyped
time: [42.659 ns 42.882 ns 43.159 ns]
change: [-2.1466% -0.5079% +1.2554%] (p = 0.58 > 0.05)
No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) high mild
14 (14.00%) high severe
sync/no-hook/wasm-to-host - nop - unchecked
time: [10.671 ns 10.691 ns 10.713 ns]
change: [+83.911% +87.620% +92.062%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) high mild
7 (7.00%) high severe
sync/no-hook/wasm-to-host - nop-params-and-results - unchecked
time: [11.136 ns 11.190 ns 11.263 ns]
change: [-29.719% -28.446% -27.029%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) high mild
10 (10.00%) high severe
sync/hook-sync/wasm-to-host - nop - typed
time: [6.7964 ns 6.8087 ns 6.8226 ns]
change: [+21.531% +24.206% +27.331%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) high mild
10 (10.00%) high severe
sync/hook-sync/wasm-to-host - nop-params-and-results - typed
time: [15.865 ns 15.921 ns 15.985 ns]
change: [+4.8466% +6.3330% +7.8317%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) high mild
13 (13.00%) high severe
sync/hook-sync/wasm-to-host - nop - untyped
time: [21.505 ns 21.587 ns 21.677 ns]
change: [+8.0908% +9.1943% +10.254%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
sync/hook-sync/wasm-to-host - nop-params-and-results - untyped
time: [44.018 ns 44.128 ns 44.261 ns]
change: [-1.4671% -0.0458% +1.2443%] (p = 0.94 > 0.05)
No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) high mild
9 (9.00%) high severe
sync/hook-sync/wasm-to-host - nop - unchecked
time: [11.264 ns 11.326 ns 11.387 ns]
change: [+80.225% +81.659% +83.068%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
sync/hook-sync/wasm-to-host - nop-params-and-results - unchecked
time: [11.816 ns 11.865 ns 11.920 ns]
change: [-29.152% -28.040% -26.957%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
8 (8.00%) high mild
6 (6.00%) high severe
async/no-hook/wasm-to-host - nop - typed
time: [6.6221 ns 6.6385 ns 6.6569 ns]
change: [+43.618% +44.755% +45.965%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
6 (6.00%) high mild
7 (7.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - typed
time: [15.884 ns 15.929 ns 15.983 ns]
change: [+3.5987% +5.2053% +6.7846%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) high mild
13 (13.00%) high severe
async/no-hook/wasm-to-host - nop - untyped
time: [20.615 ns 20.702 ns 20.821 ns]
change: [+6.9799% +8.1212% +9.2819%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) high mild
8 (8.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - untyped
time: [41.956 ns 42.207 ns 42.521 ns]
change: [-4.3057% -2.7730% -1.2428%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
3 (3.00%) high mild
11 (11.00%) high severe
async/no-hook/wasm-to-host - nop - unchecked
time: [10.440 ns 10.474 ns 10.513 ns]
change: [+83.959% +85.826% +87.541%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) high mild
6 (6.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - unchecked
time: [11.476 ns 11.512 ns 11.554 ns]
change: [-29.857% -28.383% -26.978%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
6 (6.00%) high mild
5 (5.00%) high severe
async/no-hook/wasm-to-host - nop - async-typed
time: [26.427 ns 26.478 ns 26.532 ns]
change: [+6.5730% +7.4676% +8.3983%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) high mild
7 (7.00%) high severe
async/no-hook/wasm-to-host - nop-params-and-results - async-typed
time: [28.557 ns 28.693 ns 28.880 ns]
change: [+1.9099% +3.7332% +5.9731%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) high mild
14 (14.00%) high severe
async/hook-sync/wasm-to-host - nop - typed
time: [6.7488 ns 6.7630 ns 6.7784 ns]
change: [+19.935% +22.080% +23.683%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
4 (4.00%) high mild
5 (5.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - typed
time: [15.928 ns 16.031 ns 16.149 ns]
change: [+5.5188% +6.9567% +8.3839%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
9 (9.00%) high mild
2 (2.00%) high severe
async/hook-sync/wasm-to-host - nop - untyped
time: [21.930 ns 22.114 ns 22.296 ns]
change: [+4.6674% +7.7588% +10.375%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - untyped
time: [42.684 ns 42.858 ns 43.081 ns]
change: [-5.2957% -3.4693% -1.6217%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) high mild
12 (12.00%) high severe
async/hook-sync/wasm-to-host - nop - unchecked
time: [11.026 ns 11.053 ns 11.086 ns]
change: [+70.751% +72.378% +73.961%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
5 (5.00%) high mild
5 (5.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - unchecked
time: [11.840 ns 11.900 ns 11.982 ns]
change: [-27.977% -26.584% -24.887%] (p = 0.00 < 0.05)
Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
3 (3.00%) high mild
15 (15.00%) high severe
async/hook-sync/wasm-to-host - nop - async-typed
time: [27.601 ns 27.709 ns 27.882 ns]
change: [+8.1781% +9.1102% +10.030%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) low mild
3 (3.00%) high mild
6 (6.00%) high severe
async/hook-sync/wasm-to-host - nop-params-and-results - async-typed
time: [28.955 ns 29.174 ns 29.413 ns]
change: [+1.1226% +3.0366% +5.1126%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
7 (7.00%) high mild
6 (6.00%) high severe
async-pool/no-hook/wasm-to-host - nop - typed
time: [6.5626 ns 6.5733 ns 6.5851 ns]
change: [+40.561% +42.307% +44.514%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
5 (5.00%) high mild
4 (4.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - typed
time: [15.820 ns 15.886 ns 15.969 ns]
change: [+4.1044% +5.7928% +7.7122%] (p = 0.00 < 0.05)
Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
4 (4.00%) high mild
13 (13.00%) high severe
async-pool/no-hook/wasm-to-host - nop - untyped
time: [20.481 ns 20.521 ns 20.566 ns]
change: [+6.7962% +7.6950% +8.7612%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
6 (6.00%) high mild
5 (5.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - untyped
time: [41.834 ns 41.998 ns 42.189 ns]
change: [-3.8185% -2.2687% -0.7541%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) high mild
10 (10.00%) high severe
async-pool/no-hook/wasm-to-host - nop - unchecked
time: [10.353 ns 10.380 ns 10.414 ns]
change: [+82.042% +84.591% +87.205%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - unchecked
time: [11.123 ns 11.168 ns 11.228 ns]
change: [-30.813% -29.285% -27.874%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
11 (11.00%) high mild
1 (1.00%) high severe
async-pool/no-hook/wasm-to-host - nop - async-typed
time: [27.442 ns 27.528 ns 27.638 ns]
change: [+7.5215% +9.9795% +12.266%] (p = 0.00 < 0.05)
Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
3 (3.00%) high mild
15 (15.00%) high severe
async-pool/no-hook/wasm-to-host - nop-params-and-results - async-typed
time: [29.014 ns 29.148 ns 29.312 ns]
change: [+2.0227% +3.4722% +4.9047%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - typed
time: [6.7916 ns 6.8116 ns 6.8325 ns]
change: [+20.937% +22.050% +23.281%] (p = 0.00 < 0.05)
Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) high mild
6 (6.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - typed
time: [15.917 ns 15.975 ns 16.051 ns]
change: [+4.6404% +6.4217% +8.3075%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
5 (5.00%) high mild
11 (11.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - untyped
time: [21.558 ns 21.612 ns 21.679 ns]
change: [+8.1158% +9.1409% +10.217%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) high mild
7 (7.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - untyped
time: [42.475 ns 42.614 ns 42.775 ns]
change: [-6.3613% -4.4709% -2.7647%] (p = 0.00 < 0.05)
Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
3 (3.00%) high mild
15 (15.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - unchecked
time: [11.150 ns 11.195 ns 11.247 ns]
change: [+74.424% +77.056% +79.811%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
3 (3.00%) high mild
11 (11.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - unchecked
time: [11.639 ns 11.695 ns 11.760 ns]
change: [-30.212% -29.023% -27.954%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
7 (7.00%) high mild
8 (8.00%) high severe
async-pool/hook-sync/wasm-to-host - nop - async-typed
time: [27.480 ns 27.712 ns 27.984 ns]
change: [+2.9764% +6.5061% +9.8914%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
6 (6.00%) high mild
2 (2.00%) high severe
async-pool/hook-sync/wasm-to-host - nop-params-and-results - async-typed
time: [29.218 ns 29.380 ns 29.600 ns]
change: [+5.2283% +7.7247% +10.822%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) high mild
14 (14.00%) high severe
```
</details>
* Add s390x support for frame pointer-based stack walking
* wasmtime: Allow `Caller::get_export` to get all exports
* fuzzing: Add a fuzz target to check that our stack traces are correct
We generate Wasm modules that keep track of their own stack as they call and
return between functions, and then we periodically check that if the host
captures a backtrace, it matches what the Wasm module has recorded.
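
The shape of that check can be sketched as a tiny simulation. Everything here is hypothetical: `Op`, `run`, and the `capture` callback (which stands in for the host capturing a backtrace) are illustrative names, not the fuzz target's real code.

```rust
/// Operations a generated test program performs.
#[derive(Clone, Copy)]
enum Op {
    /// Call the function with this index.
    Call(u32),
    /// Return from the current function.
    Return,
    /// Ask the host to capture a backtrace and compare it.
    Check,
}

/// Runs `ops` while the program tracks its own stack, and compares that record
/// against what `capture` reports (youngest frame first) at every `Check`.
fn run(ops: &[Op], capture: impl Fn(&[u32]) -> Vec<u32>) -> Result<(), String> {
    // The program's own record of live frames, oldest first.
    let mut live: Vec<u32> = Vec::new();
    for (i, op) in ops.iter().enumerate() {
        match *op {
            Op::Call(func) => live.push(func),
            Op::Return => {
                live.pop();
            }
            Op::Check => {
                let expected: Vec<u32> = live.iter().rev().copied().collect();
                let actual = capture(&live);
                if actual != expected {
                    return Err(format!(
                        "op {i}: expected {expected:?}, captured {actual:?}"
                    ));
                }
            }
        }
    }
    Ok(())
}
```

A stack walker that skips or invents frames shows up as a mismatch at the first `Check` after the faulty frame is live.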
* Remove VM offsets for `VMHostFuncContext` since it isn't used by JIT code
* Add doc comment with stack walking implementation notes
* Document the extra state that can be passed to `wasmtime_runtime::Backtrace` methods
* Add extensive comments for stack walking function
* Factor architecture-specific bits of stack walking out into modules
* Initialize store-related fields in a vmctx to null when there is no store yet
Rather than leaving them as uninitialized data.
* Use `set_callee` instead of manually setting the vmctx field
* Use a more informative compile error message for unsupported architectures
* Document unsafety of `prepare_host_to_wasm_trampoline`
* Use `bti c` instead of `hint #34` in inline aarch64 assembly
* Remove outdated TODO comment
* Remove setting of `last_wasm_exit_fp` in `set_jit_trap`
This is no longer needed as the value is plumbed through to the backtrace code
directly now.
* Only set the stack limit once, in the face of re-entrancy into Wasm
* Add comments for s390x-specific stack walking bits
* Use the helper macro for all libcalls
If we forget to use it, and then trigger a GC from the libcall, that means we
could miss stack frames when walking the stack, fail to find live GC refs, and
then get use after free bugs. Much less risky to always use the helper macro
that takes care of all of that for us.
* Use the `asm_sym!` macro in Wasm-to-libcall trampolines
This macro handles the macOS-specific underscore prefix stuff for us.
* wasmtime: add size and align to `externref` assertion error message
* Extend the `stacks` fuzzer to have host frames in between Wasm frames
This way we get one or more contiguous sequences of Wasm frames on the stack,
instead of exactly one.
* Add documentation for aarch64-specific backtrace helpers
* Clarify that we only support little-endian aarch64 in trampoline comment
* Use `.machine z13` in s390x assembly file
Since apparently our CI machines have pretty old assemblers that don't have
`.machine z14`. This should be fine though since these trampolines don't make
use of anything that is introduced in z14.
* Fix aarch64 build
* Fix macOS build
* Document the `asm_sym!` macro
* Add windows support to the `wasmtime-asm-macros` crate
* Add windows support to host<--->Wasm trampolines
* Fix trap handler build on windows
* Run `rustfmt` on s390x trampoline source file
* Temporarily disable some assertions about a trap's backtrace in the component model tests
Follow up to re-enable this and fix the associated issue:
https://github.com/bytecodealliance/wasmtime/issues/4535
* Refactor libcall definitions with fewer macros
This refactors the `libcall!` macro to use the
`foreach_builtin_function!` macro to define all of the trampolines.
Additionally, the macro surrounding each libcall itself is no longer
necessary, which helps avoid having too many macros.
* Use `VMOpaqueContext::from_vm_host_func_context` in `VMHostFuncContext::new`
* Move `backtrace` module to be submodule of `traphandlers`
This avoids making some things `pub(crate)` in `traphandlers` that really
shouldn't be.
* Fix macOS aarch64 build
* Use "i64" instead of "word" in aarch64-specific file
* Save/restore entry SP and exit FP/return pointer in the face of panicking imported host functions
Also clean up assertions surrounding our saved entry/exit registers.
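The general technique of restoring saved values even when a callee panics can be sketched with a drop guard. This is only an illustration under assumptions: `SavedRegs`, `RestoreOnDrop`, and `call_host` are hypothetical names, and the sketch does not reproduce how Wasmtime actually stores or restores the entry SP and the exit FP/return pointer.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the saved registers; not a Wasmtime type.
#[derive(Default)]
pub struct SavedRegs {
    pub entry_sp: Cell<usize>,
    pub exit_fp: Cell<usize>,
    pub exit_pc: Cell<usize>,
}

/// Remembers the previous values on creation and writes them back on drop.
/// `Drop` also runs while a panic unwinds through the caller.
pub struct RestoreOnDrop<'a> {
    regs: &'a SavedRegs,
    old: (usize, usize, usize),
}

impl<'a> RestoreOnDrop<'a> {
    pub fn new(regs: &'a SavedRegs) -> Self {
        let old = (regs.entry_sp.get(), regs.exit_fp.get(), regs.exit_pc.get());
        RestoreOnDrop { regs, old }
    }
}

impl Drop for RestoreOnDrop<'_> {
    fn drop(&mut self) {
        self.regs.entry_sp.set(self.old.0);
        self.regs.exit_fp.set(self.old.1);
        self.regs.exit_pc.set(self.old.2);
    }
}

/// Run `host_func` (an illustrative closure standing in for an imported host
/// function) and restore the saved values afterwards, panic or not.
pub fn call_host<R>(regs: &SavedRegs, host_func: impl FnOnce() -> R) -> R {
    let _guard = RestoreOnDrop::new(regs);
    host_func()
}
```

A guard keeps the restore logic in one place, so a normal return and an unwinding panic take the same path.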
* Put "typed" vs "untyped" in the same position of call benchmark names
Regardless of whether we are doing wasm-to-host or host-to-wasm calls.
* Fix stacks test case generator build for new `wasm-encoder`
* Fix build for s390x
* Expand libcalls in s390x asm
* Disable more parts of component tests now that backtrace assertions are a bit tighter
* Remove assertion that can maybe fail on s390x
Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
1193 lines
45 KiB
Rust
//! An `Instance` contains all the runtime state used by execution of a
//! wasm module (except its callstack and register state). An
//! `InstanceHandle` is a reference-counting handle for an `Instance`.

use crate::export::Export;
use crate::externref::VMExternRefActivationsTable;
use crate::memory::{Memory, RuntimeMemoryCreator};
use crate::table::{Table, TableElement, TableElementType};
use crate::vmcontext::{
    VMBuiltinFunctionsArray, VMCallerCheckedAnyfunc, VMContext, VMFunctionImport,
    VMGlobalDefinition, VMGlobalImport, VMMemoryDefinition, VMMemoryImport, VMOpaqueContext,
    VMRuntimeLimits, VMTableDefinition, VMTableImport, VMCONTEXT_MAGIC,
};
use crate::{
    ExportFunction, ExportGlobal, ExportMemory, ExportTable, Imports, ModuleRuntimeInfo, Store,
    VMFunctionBody,
};
use anyhow::Error;
use memoffset::offset_of;
use std::alloc::Layout;
use std::any::Any;
use std::convert::TryFrom;
use std::hash::Hash;
use std::ops::Range;
use std::ptr::NonNull;
use std::sync::atomic::AtomicU64;
use std::sync::Arc;
use std::{mem, ptr};
use wasmtime_environ::{
    packed_option::ReservedValue, DataIndex, DefinedGlobalIndex, DefinedMemoryIndex,
    DefinedTableIndex, ElemIndex, EntityIndex, EntityRef, EntitySet, FuncIndex, GlobalIndex,
    GlobalInit, HostPtr, MemoryIndex, Module, PrimaryMap, SignatureIndex, TableIndex,
    TableInitialization, TrapCode, VMOffsets, WasmType,
};

mod allocator;

pub use allocator::*;

/// A type that roughly corresponds to a WebAssembly instance, but is also used
/// for host-defined objects.
///
/// This structure is never allocated directly but is instead managed through
/// an `InstanceHandle`. This structure ends with a `VMContext` which has a
/// dynamic size corresponding to the `module` configured within. Memory
/// management of this structure is always externalized.
///
/// Instances here can correspond to actual instantiated modules, but it's also
/// used ubiquitously for host-defined objects. For example creating a
/// host-defined memory will have a `module` that looks like it exports a single
/// memory (and similar for other constructs).
///
/// This `Instance` type is used as a ubiquitous representation for WebAssembly
/// values, whether or not they were created on the host or through a module.
#[repr(C)] // ensure that the vmctx field is last.
pub(crate) struct Instance {
    /// The runtime info (corresponding to the "compiled module"
    /// abstraction in higher layers) that is retained and needed for
    /// lazy initialization. This provides access to the underlying
    /// Wasm module entities, the compiled JIT code, metadata about
    /// functions, lazy initialization state, etc.
    runtime_info: Arc<dyn ModuleRuntimeInfo>,

    /// Offsets in the `vmctx` region, precomputed from the `module` above.
    offsets: VMOffsets<HostPtr>,

    /// WebAssembly linear memory data.
    ///
    /// This is where all runtime information about defined linear memories in
    /// this module lives.
    memories: PrimaryMap<DefinedMemoryIndex, Memory>,

    /// WebAssembly table data.
    ///
    /// Like memories, this is only for defined tables in the module and
    /// contains all of their runtime state.
    tables: PrimaryMap<DefinedTableIndex, Table>,

    /// Stores the dropped passive element segments in this instantiation by index.
    /// If the index is present in the set, the segment has been dropped.
    dropped_elements: EntitySet<ElemIndex>,

    /// Stores the dropped passive data segments in this instantiation by index.
    /// If the index is present in the set, the segment has been dropped.
    dropped_data: EntitySet<DataIndex>,

    /// Hosts can store arbitrary per-instance information here.
    ///
    /// Most of the time from Wasmtime this is `Box::new(())`, a noop
    /// allocation, but some host-defined objects will store their state here.
    host_state: Box<dyn Any + Send + Sync>,

    /// Additional context used by compiled wasm code. This field is last, and
    /// represents a dynamically-sized array that extends beyond the nominal
    /// end of the struct (similar to a flexible array member).
    vmctx: VMContext,
}

#[allow(clippy::cast_ptr_alignment)]
impl Instance {
    /// Create an instance at the given memory address.
    ///
    /// It is assumed the memory was properly aligned and the
    /// allocation was `alloc_size` in bytes.
    unsafe fn new_at(
        ptr: *mut Instance,
        alloc_size: usize,
        offsets: VMOffsets<HostPtr>,
        req: InstanceAllocationRequest,
        memories: PrimaryMap<DefinedMemoryIndex, Memory>,
        tables: PrimaryMap<DefinedTableIndex, Table>,
    ) {
        // The allocation must be *at least* the size required of `Instance`.
        assert!(alloc_size >= Self::alloc_layout(&offsets).size());

        let module = req.runtime_info.module();
        let dropped_elements = EntitySet::with_capacity(module.passive_elements.len());
        let dropped_data = EntitySet::with_capacity(module.passive_data_map.len());

        ptr::write(
            ptr,
            Instance {
                runtime_info: req.runtime_info.clone(),
                offsets,
                memories,
                tables,
                dropped_elements,
                dropped_data,
                host_state: req.host_state,
                vmctx: VMContext {
                    _marker: std::marker::PhantomPinned,
                },
            },
        );

        (*ptr).initialize_vmctx(module, req.store, req.imports);
    }

    /// Helper function to access various locations offset from our `*mut
    /// VMContext` object.
    unsafe fn vmctx_plus_offset<T>(&self, offset: u32) -> *mut T {
        (self.vmctx_ptr().cast::<u8>())
            .add(usize::try_from(offset).unwrap())
            .cast()
    }

    pub(crate) fn module(&self) -> &Arc<Module> {
        self.runtime_info.module()
    }

    /// Return the indexed `VMFunctionImport`.
    fn imported_function(&self, index: FuncIndex) -> &VMFunctionImport {
        unsafe { &*self.vmctx_plus_offset(self.offsets.vmctx_vmfunction_import(index)) }
    }

    /// Return the indexed `VMTableImport`.
    fn imported_table(&self, index: TableIndex) -> &VMTableImport {
        unsafe { &*self.vmctx_plus_offset(self.offsets.vmctx_vmtable_import(index)) }
    }

    /// Return the indexed `VMMemoryImport`.
    fn imported_memory(&self, index: MemoryIndex) -> &VMMemoryImport {
        unsafe { &*self.vmctx_plus_offset(self.offsets.vmctx_vmmemory_import(index)) }
    }

    /// Return the indexed `VMGlobalImport`.
    fn imported_global(&self, index: GlobalIndex) -> &VMGlobalImport {
        unsafe { &*self.vmctx_plus_offset(self.offsets.vmctx_vmglobal_import(index)) }
    }

    /// Return the indexed `VMTableDefinition`.
    #[allow(dead_code)]
    fn table(&self, index: DefinedTableIndex) -> VMTableDefinition {
        unsafe { *self.table_ptr(index) }
    }

    /// Updates the value for a defined table to `VMTableDefinition`.
    fn set_table(&self, index: DefinedTableIndex, table: VMTableDefinition) {
        unsafe {
            *self.table_ptr(index) = table;
        }
    }

    /// Return a pointer to the indexed `VMTableDefinition`.
    fn table_ptr(&self, index: DefinedTableIndex) -> *mut VMTableDefinition {
        unsafe { self.vmctx_plus_offset(self.offsets.vmctx_vmtable_definition(index)) }
    }

    /// Get a locally defined or imported memory.
    pub(crate) fn get_memory(&self, index: MemoryIndex) -> VMMemoryDefinition {
        if let Some(defined_index) = self.module().defined_memory_index(index) {
            self.memory(defined_index)
        } else {
            let import = self.imported_memory(index);
            unsafe { VMMemoryDefinition::load(import.from) }
        }
    }

    /// Return the indexed `VMMemoryDefinition`.
    fn memory(&self, index: DefinedMemoryIndex) -> VMMemoryDefinition {
        unsafe { VMMemoryDefinition::load(self.memory_ptr(index)) }
    }

    /// Set the indexed memory to `VMMemoryDefinition`.
    fn set_memory(&self, index: DefinedMemoryIndex, mem: VMMemoryDefinition) {
        unsafe {
            *self.memory_ptr(index) = mem;
        }
    }

    /// Return a pointer to the indexed `VMMemoryDefinition`.
    fn memory_ptr(&self, index: DefinedMemoryIndex) -> *mut VMMemoryDefinition {
        unsafe { *self.vmctx_plus_offset(self.offsets.vmctx_vmmemory_pointer(index)) }
    }

    /// Return the indexed `VMGlobalDefinition`.
    fn global(&self, index: DefinedGlobalIndex) -> &VMGlobalDefinition {
        unsafe { &*self.global_ptr(index) }
    }

    /// Return a pointer to the indexed `VMGlobalDefinition`.
    fn global_ptr(&self, index: DefinedGlobalIndex) -> *mut VMGlobalDefinition {
        unsafe { self.vmctx_plus_offset(self.offsets.vmctx_vmglobal_definition(index)) }
    }

    /// Get a raw pointer to the global at the given index regardless of whether
    /// it is defined locally or imported from another module.
    ///
    /// Panics if the index is out of bounds or is the reserved value.
    pub(crate) fn defined_or_imported_global_ptr(
        &self,
        index: GlobalIndex,
    ) -> *mut VMGlobalDefinition {
        if let Some(index) = self.module().defined_global_index(index) {
            self.global_ptr(index)
        } else {
            self.imported_global(index).from
        }
    }

    /// Return a pointer to the runtime limits structure.
    pub fn runtime_limits(&self) -> *mut *const VMRuntimeLimits {
        unsafe { self.vmctx_plus_offset(self.offsets.vmctx_runtime_limits()) }
    }

    /// Return a pointer to the global epoch counter used by this instance.
    pub fn epoch_ptr(&self) -> *mut *const AtomicU64 {
        unsafe { self.vmctx_plus_offset(self.offsets.vmctx_epoch_ptr()) }
    }

    /// Return a pointer to the `VMExternRefActivationsTable`.
    pub fn externref_activations_table(&self) -> *mut *mut VMExternRefActivationsTable {
        unsafe { self.vmctx_plus_offset(self.offsets.vmctx_externref_activations_table()) }
    }

    /// Gets a pointer to this instance's `Store` which was originally
    /// configured on creation.
    ///
    /// # Panics
    ///
    /// This will panic if the originally configured store was `None`. That can
    /// happen for host functions, so host functions can't be queried for what
    /// their original `Store` was since it's just retained as null (since host
    /// functions are shared amongst threads and don't all share the same
    /// store).
    #[inline]
    pub fn store(&self) -> *mut dyn Store {
        let ptr = unsafe { *self.vmctx_plus_offset::<*mut dyn Store>(self.offsets.vmctx_store()) };
        assert!(!ptr.is_null());
        ptr
    }

    pub unsafe fn set_store(&mut self, store: Option<*mut dyn Store>) {
        if let Some(store) = store {
            *self.vmctx_plus_offset(self.offsets.vmctx_store()) = store;
            *self.runtime_limits() = (*store).vmruntime_limits();
            *self.epoch_ptr() = (*store).epoch_ptr();
            *self.externref_activations_table() = (*store).externref_activations_table().0;
        } else {
            assert_eq!(
                mem::size_of::<*mut dyn Store>(),
                mem::size_of::<[*mut (); 2]>()
            );
            *self.vmctx_plus_offset::<[*mut (); 2]>(self.offsets.vmctx_store()) =
                [ptr::null_mut(), ptr::null_mut()];

            *self.runtime_limits() = ptr::null_mut();
            *self.epoch_ptr() = ptr::null_mut();
            *self.externref_activations_table() = ptr::null_mut();
        }
    }

    pub(crate) unsafe fn set_callee(&mut self, callee: Option<NonNull<VMFunctionBody>>) {
        *self.vmctx_plus_offset(self.offsets.vmctx_callee()) =
            callee.map_or(ptr::null_mut(), |c| c.as_ptr());
    }

    /// Return a reference to the vmctx used by compiled wasm code.
    #[inline]
    pub fn vmctx(&self) -> &VMContext {
        &self.vmctx
    }

    /// Return a raw pointer to the vmctx used by compiled wasm code.
    #[inline]
    pub fn vmctx_ptr(&self) -> *mut VMContext {
        self.vmctx() as *const VMContext as *mut VMContext
    }

    fn get_exported_func(&mut self, index: FuncIndex) -> ExportFunction {
        let anyfunc = self.get_caller_checked_anyfunc(index).unwrap();
        let anyfunc = NonNull::new(anyfunc as *const VMCallerCheckedAnyfunc as *mut _).unwrap();
        ExportFunction { anyfunc }
    }

    fn get_exported_table(&mut self, index: TableIndex) -> ExportTable {
        let (definition, vmctx) = if let Some(def_index) = self.module().defined_table_index(index)
        {
            (self.table_ptr(def_index), self.vmctx_ptr())
        } else {
            let import = self.imported_table(index);
            (import.from, import.vmctx)
        };
        ExportTable {
            definition,
            vmctx,
            table: self.module().table_plans[index].clone(),
        }
    }

    fn get_exported_memory(&mut self, index: MemoryIndex) -> ExportMemory {
        let (definition, vmctx, def_index) =
            if let Some(def_index) = self.module().defined_memory_index(index) {
                (self.memory_ptr(def_index), self.vmctx_ptr(), def_index)
            } else {
                let import = self.imported_memory(index);
                (import.from, import.vmctx, import.index)
            };
        ExportMemory {
            definition,
            vmctx,
            memory: self.module().memory_plans[index].clone(),
            index: def_index,
        }
    }

    fn get_exported_global(&mut self, index: GlobalIndex) -> ExportGlobal {
        ExportGlobal {
            definition: if let Some(def_index) = self.module().defined_global_index(index) {
                self.global_ptr(def_index)
            } else {
                self.imported_global(index).from
            },
            global: self.module().globals[index],
        }
    }

    /// Return an iterator over the exports of this instance.
    ///
    /// Specifically, it provides access to the key-value pairs, where the keys
    /// are export names, and the values are export declarations which can be
    /// resolved with `lookup_by_declaration`.
    pub fn exports(&self) -> indexmap::map::Iter<String, EntityIndex> {
        self.module().exports.iter()
    }

    /// Return a reference to the custom state attached to this instance.
    #[inline]
    pub fn host_state(&self) -> &dyn Any {
        &*self.host_state
    }

    /// Return the offset from the vmctx pointer to its containing Instance.
    #[inline]
    pub(crate) fn vmctx_offset() -> isize {
        offset_of!(Self, vmctx) as isize
    }

    /// Return the table index for the given `VMTableDefinition`.
    unsafe fn table_index(&self, table: &VMTableDefinition) -> DefinedTableIndex {
        let index = DefinedTableIndex::new(
            usize::try_from(
                (table as *const VMTableDefinition)
                    .offset_from(self.table_ptr(DefinedTableIndex::new(0))),
            )
            .unwrap(),
        );
        assert!(index.index() < self.tables.len());
        index
    }

    /// Grow memory by the specified amount of pages.
    ///
    /// Returns `None` if memory can't be grown by the specified amount
    /// of pages. Returns `Some` with the old size in bytes if growth was
    /// successful.
    pub(crate) fn memory_grow(
        &mut self,
        index: MemoryIndex,
        delta: u64,
    ) -> Result<Option<usize>, Error> {
        let (idx, instance) = if let Some(idx) = self.module().defined_memory_index(index) {
            (idx, self)
        } else {
            let import = self.imported_memory(index);
            unsafe {
                let foreign_instance = (*import.vmctx).instance_mut();
                (import.index, foreign_instance)
            }
        };
        let store = unsafe { &mut *instance.store() };
        let memory = &mut instance.memories[idx];

        let result = unsafe { memory.grow(delta, Some(store)) };

        // Update the state used by a non-shared Wasm memory in case the base
        // pointer and/or the length changed.
        if memory.as_shared_memory().is_none() {
            let vmmemory = memory.vmmemory();
            instance.set_memory(idx, vmmemory);
        }

        result
    }

    pub(crate) fn table_element_type(&mut self, table_index: TableIndex) -> TableElementType {
        unsafe { (*self.get_table(table_index)).element_type() }
    }

    /// Grow table by the specified amount of elements, filling them with
    /// `init_value`.
    ///
    /// Returns `None` if table can't be grown by the specified amount of
    /// elements, or if `init_value` is the wrong type of table element.
    pub(crate) fn table_grow(
        &mut self,
        table_index: TableIndex,
        delta: u32,
        init_value: TableElement,
    ) -> Result<Option<u32>, Error> {
        let (defined_table_index, instance) =
            self.get_defined_table_index_and_instance(table_index);
        instance.defined_table_grow(defined_table_index, delta, init_value)
    }

    fn defined_table_grow(
        &mut self,
        table_index: DefinedTableIndex,
        delta: u32,
        init_value: TableElement,
    ) -> Result<Option<u32>, Error> {
        let store = unsafe { &mut *self.store() };
        let table = self
            .tables
            .get_mut(table_index)
            .unwrap_or_else(|| panic!("no table for index {}", table_index.index()));

        let result = unsafe { table.grow(delta, init_value, store) };

        // Keep the `VMContext` pointers used by compiled Wasm code up to
        // date.
        let element = self.tables[table_index].vmtable();
        self.set_table(table_index, element);

        result
    }

    fn alloc_layout(offsets: &VMOffsets<HostPtr>) -> Layout {
        let size = mem::size_of::<Self>()
            .checked_add(usize::try_from(offsets.size_of_vmctx()).unwrap())
            .unwrap();
        let align = mem::align_of::<Self>();
        Layout::from_size_align(size, align).unwrap()
    }

    /// Construct a new VMCallerCheckedAnyfunc for the given function
    /// (imported or defined in this module) and store into the given
    /// location. Used during lazy initialization.
    ///
    /// Note that our current lazy-init scheme actually calls this every
    /// time the anyfunc pointer is fetched; this turns out to be better
    /// than tracking state related to whether it's been initialized
    /// before, because resetting that state on (re)instantiation is
    /// very expensive if there are many anyfuncs.
    fn construct_anyfunc(
        &mut self,
        index: FuncIndex,
        sig: SignatureIndex,
        into: *mut VMCallerCheckedAnyfunc,
    ) {
        let type_index = self.runtime_info.signature(sig);

        let (func_ptr, vmctx) = if let Some(def_index) = self.module().defined_func_index(index) {
            (
                (self.runtime_info.image_base()
                    + self.runtime_info.function_info(def_index).start as usize)
                    as *mut _,
                VMOpaqueContext::from_vmcontext(self.vmctx_ptr()),
            )
        } else {
            let import = self.imported_function(index);
            (import.body.as_ptr(), import.vmctx)
        };

        // Safety: we have a `&mut self`, so we have exclusive access
        // to this Instance.
        unsafe {
            *into = VMCallerCheckedAnyfunc {
                vmctx,
                type_index,
                func_ptr: NonNull::new(func_ptr).expect("Non-null function pointer"),
            };
        }
    }

    /// Get a `&VMCallerCheckedAnyfunc` for the given `FuncIndex`.
    ///
    /// Returns `None` if the index is the reserved index value.
    ///
    /// The returned reference is a stable reference that won't be moved and can
    /// be passed into JIT code.
    pub(crate) fn get_caller_checked_anyfunc(
        &mut self,
        index: FuncIndex,
    ) -> Option<*mut VMCallerCheckedAnyfunc> {
        if index == FuncIndex::reserved_value() {
            return None;
        }

        // Safety: we have a `&mut self`, so we have exclusive access
        // to this Instance.
        unsafe {
            // For now, we eagerly initialize an anyfunc struct in-place
            // whenever asked for a reference to it. This is mostly
            // fine, because in practice each anyfunc is unlikely to be
            // requested more than a few times: once-ish for funcref
            // tables used for call_indirect (the usual compilation
            // strategy places each function in the table at most once),
            // and once or a few times when fetching exports via API.
            // Note that for any case driven by table accesses, the lazy
            // table init behaves like a higher-level cache layer that
            // protects this initialization from happening multiple
            // times, via that particular table at least.
            //
            // When `ref.func` becomes more commonly used or if we
            // otherwise see a use-case where this becomes a hotpath,
            // we can reconsider by using some state to track
            // "uninitialized" explicitly, for example by zeroing the
            // anyfuncs (perhaps together with other
            // zeroed-at-instantiate-time state) or using a separate
            // is-initialized bitmap.
            //
            // We arrived at this design because zeroing memory is
            // expensive, so it's better for instantiation performance
            // if we don't have to track "is-initialized" state at
            // all!
            let func = &self.module().functions[index];
            let sig = func.signature;
            let anyfunc: *mut VMCallerCheckedAnyfunc = self
                .vmctx_plus_offset::<VMCallerCheckedAnyfunc>(
                    self.offsets.vmctx_anyfunc(func.anyfunc),
                );
            self.construct_anyfunc(index, sig, anyfunc);

            Some(anyfunc)
        }
    }

    /// The `table.init` operation: initializes a portion of a table with a
    /// passive element.
    ///
    /// # Errors
    ///
    /// Returns a `Trap` error when the range within the table is out of bounds
    /// or the range within the passive element is out of bounds.
    pub(crate) fn table_init(
        &mut self,
        table_index: TableIndex,
        elem_index: ElemIndex,
        dst: u32,
        src: u32,
        len: u32,
    ) -> Result<(), TrapCode> {
        // TODO: this `clone()` shouldn't be necessary but is used for now to
        // inform `rustc` that the lifetime of the elements here are
        // disconnected from the lifetime of `self`.
        let module = self.module().clone();

        let elements = match module.passive_elements_map.get(&elem_index) {
            Some(index) if !self.dropped_elements.contains(elem_index) => {
                module.passive_elements[*index].as_ref()
            }
            _ => &[],
        };
        self.table_init_segment(table_index, elements, dst, src, len)
    }

    pub(crate) fn table_init_segment(
        &mut self,
        table_index: TableIndex,
        elements: &[FuncIndex],
        dst: u32,
        src: u32,
        len: u32,
    ) -> Result<(), TrapCode> {
        // https://webassembly.github.io/bulk-memory-operations/core/exec/instructions.html#exec-table-init

        let table = unsafe { &mut *self.get_table(table_index) };

        let elements = match elements
            .get(usize::try_from(src).unwrap()..)
            .and_then(|s| s.get(..usize::try_from(len).unwrap()))
        {
            Some(elements) => elements,
            None => return Err(TrapCode::TableOutOfBounds),
        };

        match table.element_type() {
            TableElementType::Func => {
                table.init_funcs(
                    dst,
                    elements.iter().map(|idx| {
                        self.get_caller_checked_anyfunc(*idx)
                            .unwrap_or(std::ptr::null_mut())
                    }),
                )?;
            }

            TableElementType::Extern => {
                debug_assert!(elements.iter().all(|e| *e == FuncIndex::reserved_value()));
                table.fill(dst, TableElement::ExternRef(None), len)?;
            }
        }
        Ok(())
    }

    /// Drop an element.
    pub(crate) fn elem_drop(&mut self, elem_index: ElemIndex) {
        // https://webassembly.github.io/reference-types/core/exec/instructions.html#exec-elem-drop

        self.dropped_elements.insert(elem_index);

        // Note that we don't check that we actually removed a segment because
        // dropping a non-passive segment is a no-op (not a trap).
    }

    /// Get a locally-defined memory.
    pub(crate) fn get_defined_memory(&mut self, index: DefinedMemoryIndex) -> *mut Memory {
        ptr::addr_of_mut!(self.memories[index])
    }

    /// Do a `memory.copy`
    ///
    /// # Errors
    ///
    /// Returns a `Trap` error when the source or destination ranges are out of
    /// bounds.
    pub(crate) fn memory_copy(
        &mut self,
        dst_index: MemoryIndex,
        dst: u64,
        src_index: MemoryIndex,
        src: u64,
        len: u64,
    ) -> Result<(), TrapCode> {
        // https://webassembly.github.io/reference-types/core/exec/instructions.html#exec-memory-copy

        let src_mem = self.get_memory(src_index);
        let dst_mem = self.get_memory(dst_index);

        let src = self.validate_inbounds(src_mem.current_length(), src, len)?;
        let dst = self.validate_inbounds(dst_mem.current_length(), dst, len)?;

        // Bounds and casts are checked above, by this point we know that
        // everything is safe.
        unsafe {
            let dst = dst_mem.base.add(dst);
            let src = src_mem.base.add(src);
            // FIXME audit whether this is safe in the presence of shared memory
            // (https://github.com/bytecodealliance/wasmtime/issues/4203).
            ptr::copy(src, dst, len as usize);
        }

        Ok(())
    }

    fn validate_inbounds(&self, max: usize, ptr: u64, len: u64) -> Result<usize, TrapCode> {
        let oob = || TrapCode::HeapOutOfBounds;
        let end = ptr
            .checked_add(len)
            .and_then(|i| usize::try_from(i).ok())
            .ok_or_else(oob)?;
        if end > max {
            Err(oob())
        } else {
            Ok(ptr as usize)
        }
    }

    /// Perform the `memory.fill` operation on a locally defined memory.
    ///
    /// # Errors
    ///
    /// Returns a `Trap` error if the memory range is out of bounds.
    pub(crate) fn memory_fill(
        &mut self,
        memory_index: MemoryIndex,
        dst: u64,
        val: u8,
        len: u64,
    ) -> Result<(), TrapCode> {
        let memory = self.get_memory(memory_index);
        let dst = self.validate_inbounds(memory.current_length(), dst, len)?;

        // Bounds and casts are checked above, by this point we know that
        // everything is safe.
        unsafe {
            let dst = memory.base.add(dst);
            // FIXME audit whether this is safe in the presence of shared memory
            // (https://github.com/bytecodealliance/wasmtime/issues/4203).
            ptr::write_bytes(dst, val, len as usize);
        }

        Ok(())
    }

    /// Performs the `memory.init` operation.
    ///
    /// # Errors
    ///
    /// Returns a `Trap` error if the destination range is out of this module's
    /// memory's bounds or if the source range is outside the data segment's
    /// bounds.
    pub(crate) fn memory_init(
        &mut self,
        memory_index: MemoryIndex,
        data_index: DataIndex,
        dst: u64,
        src: u32,
        len: u32,
    ) -> Result<(), TrapCode> {
        let range = match self.module().passive_data_map.get(&data_index).cloned() {
            Some(range) if !self.dropped_data.contains(data_index) => range,
            _ => 0..0,
        };
        self.memory_init_segment(memory_index, range, dst, src, len)
    }

    pub(crate) fn wasm_data(&self, range: Range<u32>) -> &[u8] {
        &self.runtime_info.wasm_data()[range.start as usize..range.end as usize]
    }

    pub(crate) fn memory_init_segment(
        &mut self,
        memory_index: MemoryIndex,
        range: Range<u32>,
        dst: u64,
        src: u32,
        len: u32,
    ) -> Result<(), TrapCode> {
        // https://webassembly.github.io/bulk-memory-operations/core/exec/instructions.html#exec-memory-init

        let memory = self.get_memory(memory_index);
        let data = self.wasm_data(range);
        let dst = self.validate_inbounds(memory.current_length(), dst, len.into())?;
        let src = self.validate_inbounds(data.len(), src.into(), len.into())?;
        let len = len as usize;

        unsafe {
            let src_start = data.as_ptr().add(src);
            let dst_start = memory.base.add(dst);
            // FIXME audit whether this is safe in the presence of shared memory
            // (https://github.com/bytecodealliance/wasmtime/issues/4203).
            ptr::copy_nonoverlapping(src_start, dst_start, len);
        }

        Ok(())
    }

    /// Drop the given data segment, truncating its length to zero.
    pub(crate) fn data_drop(&mut self, data_index: DataIndex) {
        self.dropped_data.insert(data_index);

        // Note that we don't check that we actually removed a segment because
        // dropping a non-passive segment is a no-op (not a trap).
    }

/// Get a table by index regardless of whether it is locally-defined
|
|
/// or an imported, foreign table. Ensure that the given range of
|
|
/// elements in the table is lazily initialized. We define this
|
|
/// operation all-in-one for safety, to ensure the lazy-init
|
|
/// happens.
|
|
///
|
|
/// Takes an `Iterator` for the index-range to lazy-initialize,
|
|
/// for flexibility. This can be a range, single item, or empty
|
|
/// sequence, for example. The iterator should return indices in
|
|
/// increasing order, so that the break-at-out-of-bounds behavior
|
|
/// works correctly.
|
|
pub(crate) fn get_table_with_lazy_init(
|
|
&mut self,
|
|
table_index: TableIndex,
|
|
range: impl Iterator<Item = u32>,
|
|
) -> *mut Table {
|
|
let (idx, instance) = self.get_defined_table_index_and_instance(table_index);
|
|
let elt_ty = instance.tables[idx].element_type();
|
|
|
|
if elt_ty == TableElementType::Func {
|
|
for i in range {
|
|
let value = match instance.tables[idx].get(i) {
|
|
Some(value) => value,
|
|
None => {
|
|
// Out-of-bounds; caller will handle by likely
|
|
// throwing a trap. No work to do to lazy-init
|
|
// beyond the end.
|
|
break;
|
|
}
|
|
};
|
|
if value.is_uninit() {
|
|
let table_init = match &instance.module().table_initialization {
|
|
// We unfortunately can't borrow `tables`
|
|
// outside the loop because we need to call
|
|
// `get_caller_checked_anyfunc` (a `&mut`
|
|
// method) below; so unwrap it dynamically
|
|
// here.
|
|
TableInitialization::FuncTable { tables, .. } => tables,
|
|
_ => break,
|
|
}
|
|
.get(table_index);
|
|
|
|
// The TableInitialization::FuncTable elements table may
|
|
// be smaller than the current size of the table: it
|
|
// always matches the initial table size, if present. We
|
|
// want to iterate up through the end of the accessed
|
|
// index range so that we set an "initialized null" even
|
|
// if there is no initializer. We do a checked `get()` on
|
|
// the initializer table below and unwrap to a null if
|
|
// we're past its end.
|
|
let func_index =
|
|
table_init.and_then(|indices| indices.get(i as usize).cloned());
|
|
let anyfunc = func_index
|
|
.and_then(|func_index| instance.get_caller_checked_anyfunc(func_index))
|
|
.unwrap_or(std::ptr::null_mut());
|
|
|
|
let value = TableElement::FuncRef(anyfunc);
|
|
|
|
instance.tables[idx]
|
|
.set(i, value)
|
|
.expect("Table type should match and index should be in-bounds");
|
|
}
|
|
}
|
|
}
|
|
|
|
ptr::addr_of_mut!(instance.tables[idx])
|
|
}
|
|
|
|
/// Get a table by index regardless of whether it is locally-defined or an
|
|
/// imported, foreign table.
|
|
pub(crate) fn get_table(&mut self, table_index: TableIndex) -> *mut Table {
|
|
let (idx, instance) = self.get_defined_table_index_and_instance(table_index);
|
|
ptr::addr_of_mut!(instance.tables[idx])
|
|
}
|
|
|
|
    /// Get a locally-defined table.
    pub(crate) fn get_defined_table(&mut self, index: DefinedTableIndex) -> *mut Table {
        ptr::addr_of_mut!(self.tables[index])
    }

    /// Resolve a table index to its defined index and the instance that owns
    /// the table, following the import if the table is not locally-defined.
    pub(crate) fn get_defined_table_index_and_instance(
        &mut self,
        index: TableIndex,
    ) -> (DefinedTableIndex, &mut Instance) {
        if let Some(defined_table_index) = self.module().defined_table_index(index) {
            (defined_table_index, self)
        } else {
            let import = self.imported_table(index);
            unsafe {
                let foreign_instance = (*import.vmctx).instance_mut();
                let foreign_table_def = &*import.from;
                let foreign_table_index = foreign_instance.table_index(foreign_table_def);
                (foreign_table_index, foreign_instance)
            }
        }
    }

    /// Initialize the VMContext data associated with this Instance.
    ///
    /// The `VMContext` memory is assumed to be uninitialized; any field
    /// that we need in a certain state will be explicitly written by this
    /// function.
    unsafe fn initialize_vmctx(&mut self, module: &Module, store: StorePtr, imports: Imports) {
        assert!(std::ptr::eq(module, self.module().as_ref()));

        *self.vmctx_plus_offset(self.offsets.vmctx_magic()) = VMCONTEXT_MAGIC;
        self.set_callee(None);
        self.set_store(store.as_raw());

        // Initialize shared signatures
        let signatures = self.runtime_info.signature_ids();
        *self.vmctx_plus_offset(self.offsets.vmctx_signature_ids_array()) = signatures.as_ptr();

        // Initialize the built-in functions
        *self.vmctx_plus_offset(self.offsets.vmctx_builtin_functions()) =
            &VMBuiltinFunctionsArray::INIT;

        // Initialize the imports
        debug_assert_eq!(imports.functions.len(), module.num_imported_funcs);
        ptr::copy_nonoverlapping(
            imports.functions.as_ptr(),
            self.vmctx_plus_offset(self.offsets.vmctx_imported_functions_begin()),
            imports.functions.len(),
        );
        debug_assert_eq!(imports.tables.len(), module.num_imported_tables);
        ptr::copy_nonoverlapping(
            imports.tables.as_ptr(),
            self.vmctx_plus_offset(self.offsets.vmctx_imported_tables_begin()),
            imports.tables.len(),
        );
        debug_assert_eq!(imports.memories.len(), module.num_imported_memories);
        ptr::copy_nonoverlapping(
            imports.memories.as_ptr(),
            self.vmctx_plus_offset(self.offsets.vmctx_imported_memories_begin()),
            imports.memories.len(),
        );
        debug_assert_eq!(imports.globals.len(), module.num_imported_globals);
        ptr::copy_nonoverlapping(
            imports.globals.as_ptr(),
            self.vmctx_plus_offset(self.offsets.vmctx_imported_globals_begin()),
            imports.globals.len(),
        );

        // N.B.: there is no need to initialize the anyfuncs array because
        // we eagerly construct each element in it whenever asked for a
        // reference to that element. In other words, there is no state
        // needed to track the lazy-init, so we don't need to initialize
        // any state now.

        // Initialize the defined tables
        let mut ptr = self.vmctx_plus_offset(self.offsets.vmctx_tables_begin());
        for i in 0..module.table_plans.len() - module.num_imported_tables {
            ptr::write(ptr, self.tables[DefinedTableIndex::new(i)].vmtable());
            ptr = ptr.add(1);
        }

        // Initialize the defined memories. This fills in both the
        // `defined_memories` table and the `owned_memories` table at the same
        // time. Entries in `defined_memories` hold a pointer to a definition
        // (all memories) whereas the `owned_memories` hold the actual
        // definitions of memories owned (not shared) in the module.
        let mut ptr = self.vmctx_plus_offset(self.offsets.vmctx_memories_begin());
        let mut owned_ptr = self.vmctx_plus_offset(self.offsets.vmctx_owned_memories_begin());
        for i in 0..module.memory_plans.len() - module.num_imported_memories {
            let defined_memory_index = DefinedMemoryIndex::new(i);
            let memory_index = module.memory_index(defined_memory_index);
            if module.memory_plans[memory_index].memory.shared {
                let def_ptr = self.memories[defined_memory_index]
                    .as_shared_memory()
                    .unwrap()
                    .vmmemory_ptr_mut();
                ptr::write(ptr, def_ptr);
            } else {
                ptr::write(owned_ptr, self.memories[defined_memory_index].vmmemory());
                ptr::write(ptr, owned_ptr);
                owned_ptr = owned_ptr.add(1);
            }
            ptr = ptr.add(1);
        }

        // Initialize the defined globals
        self.initialize_vmctx_globals(module);
    }

    /// Initialize the locally-defined globals in the `VMContext`.
    unsafe fn initialize_vmctx_globals(&mut self, module: &Module) {
        let num_imports = module.num_imported_globals;
        for (index, global) in module.globals.iter().skip(num_imports) {
            let def_index = module.defined_global_index(index).unwrap();
            let to = self.global_ptr(def_index);

            // Initialize the global before writing to it
            ptr::write(to, VMGlobalDefinition::new());

            match global.initializer {
                GlobalInit::I32Const(x) => *(*to).as_i32_mut() = x,
                GlobalInit::I64Const(x) => *(*to).as_i64_mut() = x,
                GlobalInit::F32Const(x) => *(*to).as_f32_bits_mut() = x,
                GlobalInit::F64Const(x) => *(*to).as_f64_bits_mut() = x,
                GlobalInit::V128Const(x) => *(*to).as_u128_mut() = x,
                GlobalInit::GetGlobal(x) => {
                    let from = if let Some(def_x) = module.defined_global_index(x) {
                        self.global(def_x)
                    } else {
                        &*self.imported_global(x).from
                    };
                    // Globals of type `externref` need to manage the reference
                    // count as values move between globals; everything else is
                    // just copy-able bits.
                    match global.wasm_ty {
                        WasmType::ExternRef => {
                            *(*to).as_externref_mut() = from.as_externref().clone()
                        }
                        _ => ptr::copy_nonoverlapping(from, to, 1),
                    }
                }
                GlobalInit::RefFunc(f) => {
                    *(*to).as_anyfunc_mut() = self.get_caller_checked_anyfunc(f).unwrap()
                        as *const VMCallerCheckedAnyfunc;
                }
                GlobalInit::RefNullConst => match global.wasm_ty {
                    // `VMGlobalDefinition::new()` already zeroed out the bits
                    WasmType::FuncRef => {}
                    WasmType::ExternRef => {}
                    ty => panic!("unsupported reference type for global: {:?}", ty),
                },
                GlobalInit::Import => panic!("locally-defined global initialized as import"),
            }
        }
    }
}

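// Illustration only, not part of Wasmtime: a minimal sketch of the lazy table
// initialization done by `get_table_with_lazy_init` above. The `Slot` enum,
// the `lazy_init` function, and the plain slices are hypothetical stand-ins
// for `TableElement`, the instance's tables, and the `FuncTable` initializer.
// It shows the three cases the comments describe: stop at an out-of-bounds
// index, skip slots that are already initialized, and write an "initialized
// null" when the initializer list is shorter than the table.

```rust
/// Hypothetical stand-in for a table slot; not Wasmtime's `TableElement`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Slot {
    /// Never touched; must be lazily initialized before use.
    Uninit,
    /// Initialized, but holding a null function reference.
    Null,
    /// Initialized with a function index.
    Func(u32),
}

/// Lazily initialize every in-bounds slot named by `range`.
///
/// `initializers` may be shorter than `table` (for example, after the table
/// has grown); slots past its end become `Slot::Null`.
fn lazy_init(table: &mut [Slot], initializers: &[u32], range: impl Iterator<Item = u32>) {
    for i in range {
        let i = i as usize;
        match table.get(i) {
            // Already initialized: nothing to do for this slot.
            Some(Slot::Null) | Some(Slot::Func(_)) => continue,
            Some(Slot::Uninit) => {}
            // Out of bounds: stop and leave error reporting to the caller.
            None => break,
        }
        // Checked lookup, falling back to an initialized null.
        table[i] = match initializers.get(i) {
            Some(&func_index) => Slot::Func(func_index),
            None => Slot::Null,
        };
    }
}
```

// Calling `lazy_init(&mut table, &[7, 8], 1..10)` on a four-slot table of
// `Slot::Uninit` touches only indices 1 through 3 and leaves index 0 alone.
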
impl Drop for Instance {
    fn drop(&mut self) {
        // Drop any defined globals
        for (idx, global) in self.module().globals.iter() {
            let idx = match self.module().defined_global_index(idx) {
                Some(idx) => idx,
                None => continue,
            };
            match global.wasm_ty {
                // For now only externref globals need to get destroyed
                WasmType::ExternRef => {}
                _ => continue,
            }
            unsafe {
                drop((*self.global_ptr(idx)).as_externref_mut().take());
            }
        }
    }
}

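// Illustration only, not part of Wasmtime: a small sketch of why `externref`
// globals are cloned on initialization and `take()`n on drop. It assumes
// `std::rc::Rc` as a stand-in for the reference-counted `externref` value;
// `GlobalSlot`, `init_from`, and `release` are hypothetical names.

```rust
use std::rc::Rc;

/// Hypothetical global slot holding an optional reference-counted value.
struct GlobalSlot {
    value: Option<Rc<String>>,
}

impl GlobalSlot {
    /// Initialize a slot from another one. Cloning bumps the reference
    /// count, which a raw bit copy would not do.
    fn init_from(other: &GlobalSlot) -> GlobalSlot {
        GlobalSlot {
            value: other.value.clone(),
        }
    }

    /// Release this slot's reference, leaving a null behind.
    fn release(&mut self) {
        drop(self.value.take());
    }
}
```

// With one `Rc` stored in slot `a` and a second slot built by
// `GlobalSlot::init_from(&a)`, the count rises by one; `release` lowers it
// again and leaves `None` in the slot.
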
/// A handle holding an `Instance` of a WebAssembly module.
#[derive(Hash, PartialEq, Eq)]
pub struct InstanceHandle {
    instance: *mut Instance,
}

// These are only valid if the `Instance` type is `Send`/`Sync`, hence the
// assertion below.
unsafe impl Send for InstanceHandle {}
unsafe impl Sync for InstanceHandle {}

fn _assert_send_sync() {
    fn _assert<T: Send + Sync>() {}
    _assert::<Instance>();
}

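// Illustration only, not part of Wasmtime: a sketch of the compile-time
// `Send + Sync` assertion used by `_assert_send_sync` above. `Shared`,
// `assert_send_sync`, `_check`, and `bump` are hypothetical names; the point
// is that the check costs nothing at run time and turns a missing auto trait
// into a build error.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical type standing in for `Instance`.
struct Shared {
    counter: Arc<Mutex<u64>>,
}

/// Compiles only if `T` is both `Send` and `Sync`; does nothing at run time.
fn assert_send_sync<T: Send + Sync>() {}

/// Never needs to be called: type-checking this body is the assertion.
fn _check() {
    assert_send_sync::<Shared>();
    // Uncommenting the next line fails to compile, because `Rc` is neither
    // `Send` nor `Sync`:
    // assert_send_sync::<std::rc::Rc<u64>>();
}

/// Increment the shared counter and return its new value.
fn bump(shared: &Shared) -> u64 {
    let mut guard = shared.counter.lock().unwrap();
    *guard += 1;
    *guard
}
```
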
impl InstanceHandle {
    /// Create a new `InstanceHandle` pointing at the instance
    /// pointed to by the given `VMContext` pointer.
    ///
    /// # Safety
    /// This is unsafe because it doesn't work on just any `VMContext`; it must
    /// be a `VMContext` allocated as part of an `Instance`.
    #[inline]
    pub unsafe fn from_vmctx(vmctx: *mut VMContext) -> Self {
        let instance = (&mut *vmctx).instance();
        Self {
            instance: instance as *const Instance as *mut Instance,
        }
    }

    /// Return a reference to the vmctx used by compiled wasm code.
    pub fn vmctx(&self) -> &VMContext {
        self.instance().vmctx()
    }

    /// Return a raw pointer to the vmctx used by compiled wasm code.
    #[inline]
    pub fn vmctx_ptr(&self) -> *mut VMContext {
        self.instance().vmctx_ptr()
    }

    /// Return a reference to a module.
    pub fn module(&self) -> &Arc<Module> {
        self.instance().module()
    }

    /// Lookup a function by index.
    pub fn get_exported_func(&mut self, export: FuncIndex) -> ExportFunction {
        self.instance_mut().get_exported_func(export)
    }

    /// Lookup a global by index.
    pub fn get_exported_global(&mut self, export: GlobalIndex) -> ExportGlobal {
        self.instance_mut().get_exported_global(export)
    }

    /// Lookup a memory by index.
    pub fn get_exported_memory(&mut self, export: MemoryIndex) -> ExportMemory {
        self.instance_mut().get_exported_memory(export)
    }

    /// Lookup a table by index.
    pub fn get_exported_table(&mut self, export: TableIndex) -> ExportTable {
        self.instance_mut().get_exported_table(export)
    }

    /// Lookup an item with the given index.
    pub fn get_export_by_index(&mut self, export: EntityIndex) -> Export {
        match export {
            EntityIndex::Function(i) => Export::Function(self.get_exported_func(i)),
            EntityIndex::Global(i) => Export::Global(self.get_exported_global(i)),
            EntityIndex::Table(i) => Export::Table(self.get_exported_table(i)),
            EntityIndex::Memory(i) => Export::Memory(self.get_exported_memory(i)),
        }
    }

    /// Return an iterator over the exports of this instance.
    ///
    /// Specifically, it provides access to the key-value pairs, where the keys
    /// are export names, and the values are export declarations which can be
    /// resolved with `lookup_by_declaration`.
    pub fn exports(&self) -> indexmap::map::Iter<String, EntityIndex> {
        self.instance().exports()
    }

    /// Return a reference to the custom state attached to this instance.
    pub fn host_state(&self) -> &dyn Any {
        self.instance().host_state()
    }

    /// Get a memory defined locally within this module.
    pub fn get_defined_memory(&mut self, index: DefinedMemoryIndex) -> *mut Memory {
        self.instance_mut().get_defined_memory(index)
    }

    /// Return the table index for the given `VMTableDefinition` in this instance.
    pub unsafe fn table_index(&self, table: &VMTableDefinition) -> DefinedTableIndex {
        self.instance().table_index(table)
    }

    /// Get a table defined locally within this module.
    pub fn get_defined_table(&mut self, index: DefinedTableIndex) -> *mut Table {
        self.instance_mut().get_defined_table(index)
    }

    /// Get a table defined locally within this module, lazily
    /// initializing the given range first.
    pub fn get_defined_table_with_lazy_init(
        &mut self,
        index: DefinedTableIndex,
        range: impl Iterator<Item = u32>,
    ) -> *mut Table {
        let index = self.instance().module().table_index(index);
        self.instance_mut().get_table_with_lazy_init(index, range)
    }

    /// Return a reference to the contained `Instance`.
    #[inline]
    pub(crate) fn instance(&self) -> &Instance {
        unsafe { &*(self.instance as *const Instance) }
    }

    /// Return a mutable reference to the contained `Instance`.
    pub(crate) fn instance_mut(&mut self) -> &mut Instance {
        unsafe { &mut *self.instance }
    }

    /// Returns the `Store` pointer that was stored on creation.
    #[inline]
    pub fn store(&self) -> *mut dyn Store {
        self.instance().store()
    }

    /// Configure the `*mut dyn Store` internal pointer after-the-fact.
    ///
    /// This is provided for the original `Store` itself to configure the first
    /// self-pointer after the original `Box` has been initialized.
    pub unsafe fn set_store(&mut self, store: *mut dyn Store) {
        self.instance_mut().set_store(Some(store));
    }

    /// Returns a clone of this instance.
    ///
    /// This is unsafe because the returned handle is just a cheap clone of
    /// the internals; there's no lifetime tracking around its validity.
    /// You'll need to ensure that the returned handles all go out of scope at
    /// the same time.
    #[inline]
    pub unsafe fn clone(&self) -> InstanceHandle {
        InstanceHandle {
            instance: self.instance,
        }
    }
}