Add shared memories (#4187)

* Add shared memories

This change adds the ability to use shared memories in Wasmtime when the
[threads proposal] is enabled. Shared memories are annotated as `shared`
in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are protected
from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md

In order to implement this in Wasmtime, there are two main cases to cover:
  - a program may simply create a shared memory and possibly export it;
    this means that Wasmtime itself must be able to create shared memories
  - a user may create a shared memory externally and pass it in as an
    import during instantiation; this is the case when the program
    contains code like `(import "env" "memory" (memory 1 1 shared))`--this
    case is handled by a new Wasmtime API type--`SharedMemory`

Because of the first case, this change allows any of the current
memory-creation mechanisms to work as-is. Wasmtime can still create either
static or dynamic memories in either on-demand or pooling modes, and any of
these memories can be considered shared. When shared, the `Memory` runtime
container will lock appropriately during `memory.size` and `memory.grow`
operations; since all memories use this container, it is an ideal place for
implementing the locking once and once only.

The second case is covered by the new `SharedMemory` structure. It uses the
same `Mmap` allocation under the hood as non-shared memories, but allows
the user to perform the allocation externally to Wasmtime and share the
memory across threads (via an `Arc`). The pointer address to the actual
memory is carefully wired through and owned by the `SharedMemory` structure
itself.
This means that there are differing views of where to access the
pointer (i.e., `VMMemoryDefinition`): for owned memories (the default), the
`VMMemoryDefinition` is stored directly by the `VMContext`; in the
`SharedMemory` case, however, this `VMContext` must point to this separate
structure.

To ensure that the `VMContext` can always point to the correct
`VMMemoryDefinition`, this change alters the `VMContext` structure. Since a
`SharedMemory` owns its own `VMMemoryDefinition`, the `defined_memories`
table in the `VMContext` becomes a sequence of pointers--in the shared
memory case, they point to the `VMMemoryDefinition` owned by the
`SharedMemory` and in the owned memory case (i.e., not shared) they point
to `VMMemoryDefinition`s stored in a new table, `owned_memories`.

This change adds an additional indirection (through the
`*mut VMMemoryDefinition` pointer) that could add overhead. Using an
imported memory as a proxy, we measured a 1-3% overhead of this approach on
the `pulldown-cmark` benchmark. To avoid this, Cranelift-generated code
will special-case the owned memory access (i.e., load a pointer directly to
the `owned_memories` entry) for `memory.size` so that only shared memories
(and imported memories, as before) incur the indirection cost.
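
The table layout described above can be sketched as follows. This is only an
illustration under stated assumptions, not the code in this change:
`MemoryDefinition`, `SharedMemorySketch`, `MemoryKind`, and `Context` are
hypothetical stand-ins for `VMMemoryDefinition`, `SharedMemory`, and
`VMContext`, and the real `VMContext` is a raw, offset-addressed structure
rather than a Rust struct with `Vec` fields.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `VMMemoryDefinition`.
struct MemoryDefinition {
    base: usize,
    current_length: usize,
}

/// Hypothetical stand-in for `SharedMemory`: it owns its own definition.
struct SharedMemorySketch {
    definition: MemoryDefinition,
}

/// How a defined memory is handed to the context in this sketch.
enum MemoryKind {
    Owned(MemoryDefinition),
    Shared(Arc<SharedMemorySketch>),
}

/// Hypothetical stand-in for the memory-related part of `VMContext`.
struct Context {
    /// Definitions stored inline, for non-shared memories only.
    owned_memories: Vec<MemoryDefinition>,
    /// Keeps shared definitions alive for as long as the context exists.
    shared: Vec<Arc<SharedMemorySketch>>,
    /// One pointer per defined memory, owned or shared.
    defined_memories: Vec<*const MemoryDefinition>,
}

impl Context {
    fn new(memories: Vec<MemoryKind>) -> Context {
        // First pass: move every definition to its final home and remember
        // where it went as (is_owned, index).
        let mut owned_memories = Vec::new();
        let mut shared = Vec::new();
        let mut slots = Vec::new();
        for memory in memories {
            match memory {
                MemoryKind::Owned(def) => {
                    slots.push((true, owned_memories.len()));
                    owned_memories.push(def);
                }
                MemoryKind::Shared(mem) => {
                    slots.push((false, shared.len()));
                    shared.push(mem);
                }
            }
        }
        // Second pass: both tables are complete and will not reallocate, so
        // pointers into them remain valid.
        let defined_memories: Vec<*const MemoryDefinition> = slots
            .iter()
            .map(|&(is_owned, i)| {
                if is_owned {
                    &owned_memories[i] as *const MemoryDefinition
                } else {
                    &shared[i].definition as *const MemoryDefinition
                }
            })
            .collect();
        Context { owned_memories, shared, defined_memories }
    }

    /// Generic path: one extra pointer hop through `defined_memories`.
    fn length_via_pointer(&self, defined_index: usize) -> usize {
        // SAFETY: every pointer targets a heap allocation kept alive by
        // `self` (the `owned_memories` buffer or an `Arc`), and neither is
        // mutated or reallocated after construction.
        unsafe { (*self.defined_memories[defined_index]).current_length }
    }

    /// Fast path for owned memories: read the inline entry directly.
    fn owned_length_direct(&self, owned_index: usize) -> usize {
        self.owned_memories[owned_index].current_length
    }
}

fn main() {
    let shared = Arc::new(SharedMemorySketch {
        definition: MemoryDefinition { base: 0x2000, current_length: 131072 },
    });
    let ctx = Context::new(vec![
        MemoryKind::Owned(MemoryDefinition { base: 0x1000, current_length: 65536 }),
        MemoryKind::Shared(Arc::clone(&shared)),
    ]);
    assert_eq!(ctx.length_via_pointer(0), 65536);
    assert_eq!(ctx.length_via_pointer(1), 131072);
    assert_eq!(ctx.owned_length_direct(0), 65536);
    assert_eq!(ctx.shared.len(), 1);
    assert_eq!(shared.definition.base, 0x2000);
}
```

In the sketch, `owned_length_direct` plays the role of the special case the
message describes for owned memories, while `length_via_pointer` is the
indirection that shared memories still pay for.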

* review: remove thread feature check
* review: swap wasmtime-types dependency for existing wasmtime-environ use
* review: remove unused VMMemoryUnion
* review: reword cross-engine error message
* review: improve tests
* review: refactor to separate prevent Memory <-> SharedMemory conversion
* review: into_shared_memory -> as_shared_memory
* review: remove commented out code
* review: limit shared min/max to 32 bits
* review: skip imported memories
* review: imported memories are not owned
* review: remove TODO
* review: document unsafe send + sync
* review: add limiter assertion
* review: remove TODO
* review: improve tests
* review: fix doc test
* fix: fixes based on discussion with Alex

  This changes several key parts:
   - adds memory indexes to imports and exports
   - makes `VMMemoryDefinition::current_length` an atomic usize

* review: add `Extern::SharedMemory`
* review: remove TODO
* review: atomically load from VMMemoryDescription in JIT-generated code
* review: add test probing the last available memory slot across threads
* fix: move assertion to new location due to rebase
* fix: doc link
* fix: add TODOs to c-api
* fix: broken doc link
* fix: modify pooling allocator messages in tests
* review: make owned_memory_index panic instead of returning an option
* review: clarify calculation of num_owned_memories
* review: move 'use' to top of file
* review: change '*const [u8]' to '*mut [u8]'
* review: remove TODO
* review: avoid hard-coding memory index
* review: remove 'preallocation' parameter from 'Memory::_new'
* fix: component model memory length
* review: check that shared memory plans are static
* review: ignore growth limits for shared memory
* review: improve atomic store comment
* review: add FIXME for memory growth failure
* review: add comment about absence of bounds-checked 'memory.size'
* review: make 'current_length()' doc comment more precise
* review: more comments related to memory.size non-determinism
* review: make 'vmmemory' unreachable for shared memory
* review: move code around
* review: thread plan through to 'wrap()'
* review: disallow shared memory allocation with the pooling allocator
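
Two items in this message fit together: `current_length` becomes an atomic
`usize`, and `memory.size`/`memory.grow` take a lock on shared memories. A
minimal sketch of that combination follows. Everything in it is an
assumption made for illustration: the names `SharedDefinition` and
`LockedMemory` are hypothetical, the use of an `RwLock<()>` is one possible
locking scheme, and the `SeqCst` ordering is a conservative choice rather
than the ordering Wasmtime uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

const PAGE_SIZE: usize = 65536;

/// Hypothetical stand-in for a definition whose length is atomic, so code
/// that holds no lock can still read it without a data race.
struct SharedDefinition {
    current_length: AtomicUsize,
}

impl SharedDefinition {
    /// Accessor method instead of a plain field read.
    fn current_length(&self) -> usize {
        self.current_length.load(Ordering::SeqCst)
    }
}

/// Hypothetical container that serializes size and grow operations.
struct LockedMemory {
    definition: SharedDefinition,
    max_pages: usize,
    lock: RwLock<()>,
}

impl LockedMemory {
    fn new(initial_pages: usize, max_pages: usize) -> LockedMemory {
        LockedMemory {
            definition: SharedDefinition {
                current_length: AtomicUsize::new(initial_pages * PAGE_SIZE),
            },
            max_pages,
            lock: RwLock::new(()),
        }
    }

    /// `memory.size`: take the read lock, then report the size in pages.
    fn size(&self) -> usize {
        let _guard = self.lock.read().unwrap();
        self.definition.current_length() / PAGE_SIZE
    }

    /// `memory.grow`: take the write lock, check the maximum, publish the
    /// new length, and return the previous size in pages.
    fn grow(&self, delta_pages: usize) -> Option<usize> {
        let _guard = self.lock.write().unwrap();
        let old_pages = self.definition.current_length() / PAGE_SIZE;
        let new_pages = old_pages.checked_add(delta_pages)?;
        if new_pages > self.max_pages {
            return None;
        }
        self.definition
            .current_length
            .store(new_pages * PAGE_SIZE, Ordering::SeqCst);
        Some(old_pages)
    }
}

fn main() {
    let memory = Arc::new(LockedMemory::new(1, 10));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let memory = Arc::clone(&memory);
            thread::spawn(move || {
                // Each thread grows by one page, twice.
                memory.grow(1).unwrap();
                memory.grow(1).unwrap();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // 1 initial page plus 8 successful single-page grows.
    assert_eq!(memory.size(), 9);
    // Growing past the maximum fails and leaves the size unchanged.
    assert_eq!(memory.grow(2), None);
    assert_eq!(memory.size(), 9);
}
```

The sketch does not model the underlying allocation at all; it only shows
why a length that several threads can observe is read through an accessor
such as `current_length()` rather than as a plain field.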
2022-06-08 10:13:40 -07:00
parent ed9db962de
commit 2b52f47b83
27 changed files with 1211 additions and 226 deletions
--- a/crates/runtime/src/instance/allocator.rs
+++ b/crates/runtime/src/instance/allocator.rs
@@ -10,7 +10,6 @@ use std::alloc;
 use std::any::Any;
 use std::convert::TryFrom;
 use std::ptr;
-use std::slice;
 use std::sync::Arc;
 use thiserror::Error;
 use wasmtime_environ::{
@@ -315,7 +314,7 @@ fn check_memory_init_bounds(
            .and_then(|start| start.checked_add(init.data.len()));

        match end {
-            Some(end) if end <= memory.current_length => {
+            Some(end) if end <= memory.current_length() => {
                // Initializer is in bounds
            }
            _ => {
@@ -331,7 +330,7 @@ fn check_memory_init_bounds(

 fn initialize_memories(instance: &mut Instance, module: &Module) -> Result<(), InstantiationError> {
    let memory_size_in_pages =
-        &|memory| (instance.get_memory(memory).current_length as u64) / u64::from(WASM_PAGE_SIZE);
+        &|memory| (instance.get_memory(memory).current_length() as u64) / u64::from(WASM_PAGE_SIZE);

    // Loads the `global` value and returns it as a `u64`, but sign-extends
    // 32-bit globals which can be used as the base for 32-bit memories.
@@ -372,10 +371,15 @@ fn initialize_memories(instance: &mut Instance, module: &Module) -> Result<(), I
                }
            }
            let memory = instance.get_memory(memory_index);
-            let dst_slice =
-                unsafe { slice::from_raw_parts_mut(memory.base, memory.current_length) };
-            let dst = &mut dst_slice[usize::try_from(init.offset).unwrap()..][..init.data.len()];
-            dst.copy_from_slice(instance.wasm_data(init.data.clone()));
+
+            unsafe {
+                let src = instance.wasm_data(init.data.clone());
+                let dst = memory.base.add(usize::try_from(init.offset).unwrap());
+                // FIXME audit whether this is safe in the presence of shared
+                // memory
+                // (https://github.com/bytecodealliance/wasmtime/issues/4203).
+                ptr::copy_nonoverlapping(src.as_ptr(), dst, src.len())
+            }
            true
        },
    );
@@ -513,6 +517,36 @@ impl Default for OnDemandInstanceAllocator {
    }
 }

+/// Allocate an instance containing a single memory.
+///
+/// In order to import a [`Memory`] into a WebAssembly instance, Wasmtime
+/// requires that memory to exist in its own instance. Here we bring to life
+/// such a "Frankenstein" instance with the only purpose of exporting a
+/// [`Memory`].
+pub unsafe fn allocate_single_memory_instance(
+    req: InstanceAllocationRequest,
+    memory: Memory,
+) -> Result<InstanceHandle, InstantiationError> {
+    let mut memories = PrimaryMap::default();
+    memories.push(memory);
+    let tables = PrimaryMap::default();
+    let module = req.runtime_info.module();
+    let offsets = VMOffsets::new(HostPtr, module);
+    let layout = Instance::alloc_layout(&offsets);
+    let instance = alloc::alloc(layout) as *mut Instance;
+    Instance::new_at(instance, layout.size(), offsets, req, memories, tables);
+    Ok(InstanceHandle { instance })
+}
+
+/// Internal implementation of [`InstanceHandle`] deallocation.
+///
+/// See [`InstanceAllocator::deallocate()`] for more details.
+pub unsafe fn deallocate(handle: &InstanceHandle) {
+    let layout = Instance::alloc_layout(&handle.instance().offsets);
+    ptr::drop_in_place(handle.instance);
+    alloc::dealloc(handle.instance.cast(), layout);
+}
+
 unsafe impl InstanceAllocator for OnDemandInstanceAllocator {
    unsafe fn allocate(
        &self,
@@ -542,9 +576,7 @@ unsafe impl InstanceAllocator for OnDemandInstanceAllocator {
    }

    unsafe fn deallocate(&self, handle: &InstanceHandle) {
-        let layout = Instance::alloc_layout(&handle.instance().offsets);
-        ptr::drop_in_place(handle.instance);
-        alloc::dealloc(handle.instance.cast(), layout);
+        deallocate(handle)
    }

    #[cfg(feature = "async")]