Add a pooling allocator mode based on copy-on-write mappings of memfds.

As first suggested by Jan on the Zulip here [1], a cheap and effective
way to obtain copy-on-write semantics of a "backing image" for a Wasm
memory is to mmap a file with `MAP_PRIVATE`. The `memfd` mechanism
provided by the Linux kernel allows us to create anonymous,
in-memory-only files that we can use for this mapping, so we can
construct the image contents on-the-fly then effectively create a CoW
overlay. Furthermore, and importantly, `madvise(MADV_DONTNEED, ...)`
will discard the CoW overlay, returning the mapping to its original
state.

By itself this is almost enough for a very fast
instantiation-termination loop of the same image over and over,
without changing the address space mapping at all (which is
expensive). The only missing bit is how to implement
heap *growth*. But here memfds can help us again: if we create another
anonymous file and map it where the extended parts of the heap would
go, we can take advantage of the fact that a `mmap()` mapping can
be *larger than the file itself*, with accesses beyond the end
generating a `SIGBUS`, and the fact that we can cheaply resize the
file with `ftruncate`, even after a mapping exists. So we can map the
"heap extension" file once with the maximum memory-slot size and grow
the memfd itself as `memory.grow` operations occur.

The above CoW technique and heap-growth technique together allow us a
fastpath of `madvise()` and `ftruncate()` only when we re-instantiate
the same module over and over, as long as we can reuse the same
slot. This fastpath avoids all whole-process address-space locks in
the Linux kernel, which should mean it is highly scalable. It also
avoids the cost of copying data on read, as the `uffd` heap backend
does when servicing pagefaults; the kernel's own optimized CoW
logic (same as used by all file mmaps) is used instead.

[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772
This commit is contained in:
Chris Fallin
2022-01-18 16:42:24 -08:00
parent 90e7cef56c
commit b73ac83c37
26 changed files with 1070 additions and 135 deletions

View File

@@ -19,7 +19,10 @@ use wasmtime_environ::{
StackMapInformation, Trampoline, Tunables, WasmFuncType, ELF_WASMTIME_ADDRMAP,
ELF_WASMTIME_TRAPS,
};
use wasmtime_runtime::{GdbJitImageRegistration, InstantiationError, VMFunctionBody, VMTrampoline};
use wasmtime_runtime::{
CompiledModuleId, CompiledModuleIdAllocator, GdbJitImageRegistration, InstantiationError,
VMFunctionBody, VMTrampoline,
};
/// This is the name of the section in the final ELF image which contains
/// concatenated data segments from the original wasm module.
@@ -248,6 +251,8 @@ pub struct CompiledModule {
code: Range<usize>,
code_memory: CodeMemory,
dbg_jit_registration: Option<GdbJitImageRegistration>,
/// A unique ID used to register this module with the engine.
unique_id: CompiledModuleId,
}
impl CompiledModule {
@@ -271,6 +276,7 @@ impl CompiledModule {
mmap: MmapVec,
info: Option<CompiledModuleInfo>,
profiler: &dyn ProfilingAgent,
id_allocator: &CompiledModuleIdAllocator,
) -> Result<Arc<Self>> {
// Transfer ownership of `obj` to a `CodeMemory` object which will
// manage permissions, such as the executable bit. Once it's located
@@ -312,6 +318,7 @@ impl CompiledModule {
dbg_jit_registration: None,
code_memory,
meta: info.meta,
unique_id: id_allocator.alloc(),
};
ret.register_debug_and_profiling(profiler)?;
@@ -333,6 +340,12 @@ impl CompiledModule {
Ok(())
}
/// Get this module's unique ID. It is unique with respect to a
/// single allocator (which is ordinarily held on a Wasm engine).
pub fn unique_id(&self) -> CompiledModuleId {
self.unique_id
}
/// Returns the underlying memory which contains the compiled module's
/// image.
pub fn mmap(&self) -> &MmapVec {