wasmtime/benches/thread_eager_init.rs
Alex Crichton 82a31680d6 Use a StoreOpaque during backtraces for metadata (#4325)
Previous to this commit Wasmtime would use the `GlobalModuleRegistry`
when learning information about a trap such as its trap code, the
symbols for each frame, etc. This has a downside though of holding a
global read-write lock for the duration of this operation which hinders
registration of new modules in parallel. In addition there was a fair
amount of internal duplication between this "global module registry" and
the store-local module registry. Finally relying on global state for
information like this gets a bit more brittle over time as it seems best
to scope global queries to precisely what's necessary rather than
holding extra information.

With the refactoring in wasm backtraces done in #4183 it's now possible
to always have a `StoreOpaque` reference when a backtrace is collected
for symbolication and other trap-identification purposes. This
commit adds a `StoreOpaque` parameter to the `Trap::from_runtime`
constructor and then plumbs that everywhere. Note that while doing this
I changed the internal `traphandlers::lazy_per_thread_init` function to
no longer return a `Result` and instead just `panic!` on Unix if memory
couldn't be allocated for a stack. This removed quite a lot of
error-handling code for a case that's expected to quite rarely happen.
If necessary in the future we can add a fallible initialization point
but this feels like a better default balance for the code here.
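
The trade-off described above, an infallible lazy initializer that panics instead of returning a `Result`, can be sketched with only the standard library. This is a minimal illustration and not Wasmtime's actual `traphandlers::lazy_per_thread_init`: the `STACK` thread-local, the `allocate_stack` helper, and `STACK_SIZE` are all hypothetical names introduced for the example.

```rust
use std::cell::RefCell;

/// Hypothetical size of the per-thread stack used in this sketch.
const STACK_SIZE: usize = 64 * 1024;

thread_local! {
    // Hypothetical per-thread slot; `None` until the thread is initialized.
    static STACK: RefCell<Option<Vec<u8>>> = RefCell::new(None);
}

/// Hypothetical fallible allocation: returns `None` if memory is unavailable.
fn allocate_stack(size: usize) -> Option<Vec<u8>> {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(size).ok()?;
    buf.resize(size, 0);
    Some(buf)
}

/// Infallible lazy initialization: panics if the stack cannot be allocated,
/// so callers have no error path to plumb through.
/// Returns `true` if this call performed the initialization.
fn lazy_per_thread_init() -> bool {
    STACK.with(|slot| {
        let mut slot = slot.borrow_mut();
        if slot.is_some() {
            return false; // already initialized on this thread
        }
        let stack = allocate_stack(STACK_SIZE)
            .expect("failed to allocate per-thread stack");
        *slot = Some(stack);
        true
    })
}
```

Because the function no longer returns a `Result`, every call site shrinks to a plain call. A fallible variant could be added later by exposing the `Option` from `allocate_stack` instead of calling `expect`.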

With a `StoreOpaque` in use when a trap is being symbolicated that means
we have a `ModuleRegistry` which can be used for queries and such. This
meant that the `GlobalModuleRegistry` state could largely be dismantled
and moved to per-`Store` state (within the `ModuleRegistry`, mostly just
moving methods around).

The final state is that the global rwlock is now exclusively scoped
around insertions/deletions/`is_wasm_trap_pc` which is just a lookup and
atomic add. Otherwise symbolication for a backtrace exclusively uses
store-local state now (as intended).
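
The split described above, a small global table consulted only for "is this pc wasm code?" plus store-local state for symbolication, might look roughly like the following. This is a simplified sketch under assumptions, not Wasmtime's implementation: `GLOBAL_CODE`, `StoreRegistry`, and all method names are hypothetical, addresses are plain `usize` values, `end` is the last address of a region (inclusive), and the atomic add mentioned above is omitted.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Hypothetical global table mapping a region's last address to its start.
/// The lock is held only for insertion, deletion, and the pc lookup.
static GLOBAL_CODE: RwLock<BTreeMap<usize, usize>> = RwLock::new(BTreeMap::new());

fn register_code(start: usize, end: usize) {
    GLOBAL_CODE.write().unwrap().insert(end, start);
}

fn unregister_code(end: usize) {
    GLOBAL_CODE.write().unwrap().remove(&end);
}

/// Short read-locked lookup: does `pc` fall inside any registered region?
fn is_wasm_pc(pc: usize) -> bool {
    let map = GLOBAL_CODE.read().unwrap();
    match map.range(pc..).next() {
        Some((_end, &start)) => start <= pc,
        None => false,
    }
}

/// Hypothetical store-local registry: symbol names live here, so
/// symbolicating a backtrace never touches the global lock.
#[derive(Default)]
struct StoreRegistry {
    functions: BTreeMap<usize, (usize, String)>, // last address -> (start, name)
}

impl StoreRegistry {
    fn register(&mut self, start: usize, end: usize, name: &str) {
        self.functions.insert(end, (start, name.to_string()));
        register_code(start, end);
    }

    fn symbolicate(&self, pc: usize) -> Option<&str> {
        let (_end, (start, name)) = self.functions.range(pc..).next()?;
        if *start <= pc { Some(name.as_str()) } else { None }
    }
}

impl Drop for StoreRegistry {
    fn drop(&mut self) {
        // Remove this store's regions from the global table.
        for end in self.functions.keys() {
            unregister_code(*end);
        }
    }
}
```

In this sketch the global map carries no symbol data at all, which keeps the time spent under the lock small and leaves registration of new modules on other threads largely unblocked.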

The original motivation for this commit was that frame information
lookup and pieces were looking to get somewhat complicated with the
addition of components which are a new vector of traps coming out of
Cranelift-generated code. My hope is that by having a `Store` around for
more operations it's easier to plumb all this through.
2022-06-27 15:24:59 -05:00


use criterion::{criterion_group, criterion_main, Criterion};
use std::thread;
use std::time::{Duration, Instant};
use wasmtime::*;

fn measure_execution_time(c: &mut Criterion) {
    // Baseline performance: a single measurement covers both initializing
    // thread-local resources and executing the first call.
    //
    // The other two bench functions should sum to this duration.
    c.bench_function("lazy initialization at call", move |b| {
        let (engine, module) = test_setup();
        b.iter_custom(move |iters| {
            (0..iters)
                .map(|_| lazy_thread_instantiate(engine.clone(), module.clone()))
                .sum()
        })
    });

    // Using Engine::tls_eager_initialize: measure how long eager
    // initialization takes on a new thread.
    c.bench_function("eager initialization", move |b| {
        let (engine, module) = test_setup();
        b.iter_custom(move |iters| {
            (0..iters)
                .map(|_| {
                    let (init, _call) = eager_thread_instantiate(engine.clone(), module.clone());
                    init
                })
                .sum()
        })
    });

    // Measure how long the first call takes on a thread after it has been
    // eagerly initialized.
    c.bench_function("call after eager initialization", move |b| {
        let (engine, module) = test_setup();
        b.iter_custom(move |iters| {
            (0..iters)
                .map(|_| {
                    let (_init, call) = eager_thread_instantiate(engine.clone(), module.clone());
                    call
                })
                .sum()
        })
    });
}

/// Creating a store and measuring the time to perform a call is the same behavior
/// in both setups.
fn duration_of_call(engine: &Engine, module: &Module) -> Duration {
    let mut store = Store::new(engine, ());
    let inst = Instance::new(&mut store, module, &[]).expect("instantiate");
    let f = inst.get_func(&mut store, "f").expect("get f");
    let f = f.typed::<(), (), _>(&store).expect("type f");

    let call = Instant::now();
    f.call(&mut store, ()).expect("call f");
    call.elapsed()
}

/// When wasmtime first runs a function on a thread, it needs to initialize
/// some thread-local resources and install signal handlers. This benchmark
/// spawns a new thread, and returns the duration it took to execute the first
/// function call made on that thread.
fn lazy_thread_instantiate(engine: Engine, module: Module) -> Duration {
    thread::spawn(move || duration_of_call(&engine, &module))
        .join()
        .expect("thread joins")
}

/// This benchmark spawns a new thread, and records how long it takes to
/// eagerly initialize the thread-local resources. It then creates a store and
/// instance, and records the duration it took to execute the first function
/// call.
fn eager_thread_instantiate(engine: Engine, module: Module) -> (Duration, Duration) {
    thread::spawn(move || {
        let init_start = Instant::now();
        Engine::tls_eager_initialize();
        let init_duration = init_start.elapsed();

        (init_duration, duration_of_call(&engine, &module))
    })
    .join()
    .expect("thread joins")
}

fn test_setup() -> (Engine, Module) {
    // We only expect to create one Instance at a time, with a single memory.
    let pool_count = 10;

    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling {
        strategy: PoolingAllocationStrategy::NextAvailable,
        instance_limits: InstanceLimits {
            count: pool_count,
            memory_pages: 1,
            ..Default::default()
        },
    });
    let engine = Engine::new(&config).unwrap();

    // The module has a memory (shouldn't matter) and a single function which is a no-op.
    let module = Module::new(&engine, r#"(module (memory 1) (func (export "f")))"#).unwrap();
    (engine, module)
}

criterion_group!(benches, measure_execution_time);
criterion_main!(benches);
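
The measurement pattern used by this benchmark (time work on a freshly spawned thread, return the `Duration` through `join`, and sum the per-iteration durations) can be tried without Wasmtime or Criterion. The sketch below uses only the standard library. `ensure_thread_ready` is a hypothetical stand-in for the per-thread setup, and its 2 ms sleep is an arbitrary assumption chosen to make the cost visible.

```rust
use std::cell::Cell;
use std::thread;
use std::time::{Duration, Instant};

thread_local! {
    // Hypothetical flag marking whether this thread has been set up.
    static READY: Cell<bool> = Cell::new(false);
}

/// Hypothetical per-thread setup; the sleep simulates a one-time cost.
fn ensure_thread_ready() {
    READY.with(|ready| {
        if !ready.get() {
            thread::sleep(Duration::from_millis(2));
            ready.set(true);
        }
    })
}

/// Times one "call", which lazily triggers setup if it has not happened yet.
fn first_call() -> Duration {
    let start = Instant::now();
    ensure_thread_ready();
    start.elapsed()
}

/// Lazy variant: the first call on a new thread pays for the setup.
fn lazy_on_new_thread() -> Duration {
    thread::spawn(first_call).join().expect("thread joins")
}

/// Eager variant: setup is timed separately from the first call.
fn eager_on_new_thread() -> (Duration, Duration) {
    thread::spawn(|| {
        let start = Instant::now();
        ensure_thread_ready();
        let init = start.elapsed();
        (init, first_call())
    })
    .join()
    .expect("thread joins")
}

/// Sums per-iteration durations, as a closure passed to `iter_custom` would.
fn total(iters: u64, measure: impl Fn() -> Duration) -> Duration {
    (0..iters).map(|_| measure()).sum()
}
```

Calling `total(10, lazy_on_new_thread)` mirrors the first bench function, while `total(10, || eager_on_new_thread().0)` and `total(10, || eager_on_new_thread().1)` mirror the other two.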