This commit has a few minor updates and some improvements to the
instantiation benchmark harness:
* A `once_cell::unsync::Lazy` type is now used to guard creation of
modules/engines/etc. This enables running singular benchmarks to be
much faster since the benchmark no longer compiles all other
benchmarks that are filtered out. Unfortunately I couldn't find a way
in criterion to test whether a `BenchmarkId` is filtered out or not so
we rely on the runtime laziness to initialize on the first run for
benchmarks that do so.
* All files located in `benches/instantiation` are now loaded for
benchmarking instead of a hardcoded list. This makes it a bit easier
to throw files into the directory and have them benchmarked instead of
having to recompile when working with new files.
* Finally a module deserialization benchmark was added to measure the
time it takes to deserialize a precompiled module from disk (inspired
by discussion on #3787)
While I was at it I also upped some limits to be able to instantiate
cfallin's `spidermonkey.wasm`.
Currently the "sequential" and "parallel" benchmarks reports somewhat
different timings. For sequential it's time-to-instantiate, but for
parallel it's time-to-instantiate-10k instances. The parallelism in the
parallel benchmark can also theoretically be affected by rayon's
work-stealing. For example if rayon doesn't actually do any work
stealing at all then this ends up being a sequential test again.
Otherwise though it's possible for some threads to finish much earlier
as rayon isn't guaranteed to keep threads busy.
This commit applies a few updates to the benchmark:
* First an `InstancePre<T>` is now used instead of a `Linker<T>` to
front-load type-checking and avoid that on each instantiation (and
this is generally the fastest path to instantiate right now).
* Next the instantiation benchmark is changed to measure one
instantiation-per-iteration to measure per-instance instantiation to
better compare with sequential numbers.
* Finally rayon is removed in favor of manually creating background
threads that infinitely do work until we tell them to stop. These
background threads are guaranteed to be working for the entire time
the benchmark is executing and should theoretically exhibit what the
situation that there's N units of work all happening at once.
I also applied some minor updates here such as having the parallel
instantiation defined conditionally for multiple modules as well as
upping the limits of the pooling allocator to handle a large module
(rustpython.wasm) that I threw at it.
This eagerly evaluates the `format!` and produces a `String` with a heap
allocation, regardless whether `foo` is `Some`/`Ok` or `None`/`Err`. Using
`foo.unwrap_or_else(|| panic!(...))` makes it so that the error message
formatting is only evaluated if `foo` is `None`/`Err`.
Implement Wasmtime's new API as designed by RFC 11. This is quite a large commit which has had lots of discussion externally, so for more information it's best to read the RFC thread and the PR thread.
* wasmtime-wasi: re-exporting this WasiCtxBuilder was shadowing the right one
wasi-common's WasiCtxBuilder is really only useful wasi_cap_std_sync and
wasi_tokio to implement their own Builder on top of.
This re-export of wasi-common's is 1. not useful and 2. shadow's the
re-export of the right one in sync::*.
* wasi-common: eliminate WasiCtxBuilder, make the builder methods on WasiCtx instead
* delete wasi-common::WasiCtxBuilder altogether
just put those methods directly on &mut WasiCtx.
As a bonus, the sync and tokio WasiCtxBuilder::build functions
are no longer fallible!
* bench fixes
* more test fixes
This adds benchmarks around module instantiation using criterion.
Both the default (i.e. on-demand) and pooling allocators are tested
sequentially and in parallel using a thread pool.
Instantiation is tested with an empty module, a module with a single page
linear memory, a larger linear memory with a data initializer, and a "hello
world" Rust WASI program.