Use mmap'd *.cwasm as a source for memory initialization images (#3787)
* Skip memfd creation with precompiled modules

This commit updates the memfd support internally to not actually use a memfd if a compiled module originally came from disk via the `wasmtime::Module::deserialize_file` API. In this situation we already have a file descriptor open and there's no need to copy a module's heap image to a new file descriptor.

To facilitate a new source of `mmap`, the currently memfd-specific logic of creating a heap image is generalized to a new form of `MemoryInitialization` which is attempted for all modules at module compile time. This means that the serialized artifact on disk will have the memory image in its entirety waiting for us. Furthermore, the memory image is ensured to be padded and aligned carefully to the target system's page size, notably meaning that the data section in the final object file is page-aligned and the size of the data section is also page-aligned.

As a result, when a precompiled module is mapped from disk we can reuse the underlying `File` to mmap all initial memory images. Note that the offset-within-the-memory-mapped-file can differ for memfd-vs-not, but that's just another piece of state to track in the memfd implementation.

In the limit this waters down the term "memfd" for this technique of quickly initializing memory because we no longer use memfd unconditionally (only when the backing file isn't available). This does however open up an avenue in the future to porting this support to other OSes because while `memfd_create` is Linux-specific, both macOS and Windows support mapping a file with copy-on-write. This porting isn't done in this PR and is left for a future refactoring.

Closes #3758

* Enable "memfd" support on all unix systems

Cordon off the Linux-specific bits and enable the memfd support to compile and run on platforms like macOS which have a Linux-like `mmap`. This only works if a module is mapped from a precompiled module file on disk, but that's better than not supporting it at all!

* Fix linux compile
* Use `Arc<File>` instead of `MmapVecFileBacking`
* Use a named struct instead of mysterious tuples
* Comment about unsafety in `Module::deserialize_file`
* Fix tests
* Fix uffd compile
* Always align data segments

No need to have conditional alignment since their sizes are all aligned anyway

* Update comment in build.rs
* Use rustix, not `region`
* Fix some confusing logic/names around memory indexes

These functions all work with memory indexes, not specifically defined memory indexes.
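The core trick this enables is worth seeing in miniature: once a *.cwasm file's data section is page-aligned, the already-open `File` can be mapped copy-on-write directly over a linear memory. The sketch below uses `rustix` (which this PR switches to); `map_memory_image` and its parameters are hypothetical illustrations, not code from this change.

use core::ffi::c_void;
use std::fs::File;

use rustix::mm::{mmap, MapFlags, ProtFlags};

/// Hypothetical helper: map `len` bytes of a compiled module file
/// copy-on-write over an already-reserved region of linear memory.
///
/// Safety: `wasm_base..wasm_base + len` must be a page-aligned reservation
/// owned by the caller, and `file_offset`/`len` must be page-aligned, which
/// is exactly what the data-section alignment in this PR guarantees.
unsafe fn map_memory_image(
    file: &File,
    file_offset: u64,
    len: usize,
    wasm_base: *mut u8,
) -> rustix::io::Result<()> {
    // MAP_PRIVATE gives copy-on-write semantics: an instance may freely
    // write to its memory without mutating the on-disk *.cwasm, and
    // untouched pages are shared by the kernel across instances. Both
    // Linux and macOS support this form of `mmap`, which is why memfd is
    // no longer required when a backing file exists.
    let ptr: *mut c_void = mmap(
        wasm_base.cast(),
        len,
        ProtFlags::READ | ProtFlags::WRITE,
        MapFlags::PRIVATE | MapFlags::FIXED,
        file,
        file_offset,
    )?;
    debug_assert_eq!(ptr.cast::<u8>(), wasm_base);
    Ok(())
}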
@@ -221,6 +221,12 @@ pub trait Compiler: Send + Sync {
     /// Returns the target triple that this compiler is compiling for.
     fn triple(&self) -> &target_lexicon::Triple;
 
+    /// Returns the alignment necessary to align values to the page size of the
+    /// compilation target. Note that this may be an upper-bound where the
+    /// alignment is larger than necessary for some platforms since it may
+    /// depend on the platform's runtime configuration.
+    fn page_size_align(&self) -> u64;
+
     /// Returns a list of configured settings for this compiler.
     fn flags(&self) -> BTreeMap<String, FlagValue>;
 
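One way a backend could implement this new trait method, as a sketch only: the match arms and constants below are assumptions chosen to illustrate the "upper bound" wording in the doc comment, not necessarily what Wasmtime's Cranelift backend returns.

fn page_size_align(&self) -> u64 {
    use target_lexicon::Architecture;
    match self.triple().architecture {
        // AArch64 kernels can be configured with 4K, 16K, or 64K pages
        // (macOS on Apple silicon uses 16K), so return the largest
        // possibility as a conservative upper bound.
        Architecture::Aarch64(_) => 0x10000,
        // Most other supported platforms use 4K pages.
        _ => 0x1000,
    }
}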
@@ -114,6 +114,18 @@ pub struct MemoryInitializer {
     pub data: Range<u32>,
 }
 
+/// Similar to the above `MemoryInitializer` but only used when memory
+/// initializers are statically known to be valid.
+#[derive(Clone, Debug, Serialize, Deserialize)]
+pub struct StaticMemoryInitializer {
+    /// The 64-bit offset, in bytes, of where this initializer starts.
+    pub offset: u64,
+
+    /// The range of data to write at `offset`, where these indices are indexes
+    /// into the compiled wasm module's data section.
+    pub data: Range<u32>,
+}
+
 /// The type of WebAssembly linear memory initialization to use for a module.
 #[derive(Clone, Debug, Serialize, Deserialize)]
 pub enum MemoryInitialization {
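To make the two fields concrete, here is a small invented example of a single initializer (the numbers are illustrative, not from the PR):

// A wasm segment `(data (i32.const 65536) "hello")` whose five bytes
// were appended at indexes 0..5 of the module's data section:
let init = StaticMemoryInitializer {
    offset: 65536, // byte offset within the linear memory
    data: 0..5,    // byte range within the compiled data section
};
assert_eq!(init.data.len(), 5);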
@@ -159,7 +171,42 @@ pub enum MemoryInitialization {
         /// indices, like those in `MemoryInitializer`, point within a data
         /// segment that will come as an auxiliary descriptor with other data
         /// such as the compiled code for the wasm module.
-        map: PrimaryMap<MemoryIndex, Vec<(u64, Range<u32>)>>,
+        map: PrimaryMap<MemoryIndex, Vec<StaticMemoryInitializer>>,
     },
 
+    /// Memory initialization is statically known and involves a single
+    /// `memcpy` or otherwise simply making the defined data visible.
+    ///
+    /// To be statically initialized the same requirements as `Paged` must be
+    /// met, namely that everything references a defined memory and all data
+    /// segments have a statically known in-bounds base (no globals).
+    ///
+    /// This form of memory initialization is a more optimized version of
+    /// `Segmented` where memory can be initialized with one of a few methods:
+    ///
+    /// * First it could be initialized with a single `memcpy` of data from the
+    ///   module to the linear memory.
+    /// * Otherwise techniques like `mmap` are also possible to make this data,
+    ///   which might reside in a compiled module on disk, available
+    ///   immediately in a linear memory's address space.
+    ///
+    /// To facilitate the latter of these techniques the `try_static_init`
+    /// function below, which creates this variant, takes a host page size
+    /// argument which can page-align everything to make mmap-ing possible.
+    Static {
+        /// The initialization contents for each linear memory.
+        ///
+        /// This array has, for each module's own linear memory, the contents
+        /// necessary to initialize it. If the memory has a `None` value then
+        /// no initialization is necessary (it's zero-filled). Otherwise with
+        /// `Some` the `offset` field is the offset in memory at which to
+        /// start the initialization and the `data` range is the range within
+        /// the final data section of the compiled module of bytes to copy
+        /// into the memory.
+        ///
+        /// The offset, range base, and range end are all guaranteed to be page
+        /// aligned to the page size passed in to `try_static_init`.
+        map: PrimaryMap<MemoryIndex, Option<StaticMemoryInitializer>>,
+    },
 }
 
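The page-alignment guarantee above is produced by the two closures that appear in `try_static_init` further down. A quick demonstration of their behavior, assuming a power-of-two page size (which the bit trick requires):

let page_size: u64 = 4096;
let page_align = |x: u64| x & !(page_size - 1);             // round down
let page_align_up = |x: u64| page_align(x + page_size - 1); // round up
assert_eq!(page_align(5000), 4096);
assert_eq!(page_align_up(5000), 8192);
assert_eq!(page_align_up(4096), 4096); // already aligned: unchanged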
@@ -192,9 +239,9 @@ impl ModuleTranslation<'_> {
         let mut data = self.data.iter();
         let ok = self.module.memory_initialization.init_memory(
             InitMemory::CompileTime(&self.module),
-            &mut |memory, offset, data_range| {
+            &mut |memory, init| {
                 let data = data.next().unwrap();
-                assert_eq!(data.len(), data_range.len());
+                assert_eq!(data.len(), init.data.len());
                 // If an initializer references an imported memory then
                 // everything will need to be processed in-order anyway to
                 // handle the dynamic limits of the memory specified.
@@ -203,8 +250,8 @@ impl ModuleTranslation<'_> {
                 };
                 let page_size = u64::from(WASM_PAGE_SIZE);
                 let contents = &mut page_contents[memory];
-                let mut page_index = offset / page_size;
-                let mut page_offset = (offset % page_size) as usize;
+                let mut page_index = init.offset / page_size;
+                let mut page_offset = (init.offset % page_size) as usize;
                 let mut data = &data[..];
 
                 while !data.is_empty() {
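For a concrete sense of this arithmetic (with `WASM_PAGE_SIZE` = 65536): an initializer whose `init.offset` is 70000 starts in wasm page 1, 4464 bytes into that page.

let page_size = 65536u64; // WASM_PAGE_SIZE
let init_offset = 70000u64;
assert_eq!(init_offset / page_size, 1);    // page_index
assert_eq!(init_offset % page_size, 4464); // page_offset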
@@ -249,9 +296,12 @@ impl ModuleTranslation<'_> {
         let mut offset = 0;
         for (memory, pages) in page_contents {
             let mut page_offsets = Vec::with_capacity(pages.len());
-            for (byte_offset, page) in pages {
+            for (page_index, page) in pages {
                 let end = offset + (page.len() as u32);
-                page_offsets.push((byte_offset, offset..end));
+                page_offsets.push(StaticMemoryInitializer {
+                    offset: page_index * u64::from(WASM_PAGE_SIZE),
+                    data: offset..end,
+                });
                 offset = end;
                 self.data.push(page.into());
             }
@@ -261,6 +311,155 @@ impl ModuleTranslation<'_> {
         self.module.memory_initialization = MemoryInitialization::Paged { map };
     }
 
+    /// Similar to the `try_paged_init` method, but attempts to use the
+    /// `MemoryInitialization::Static` variant.
+    ///
+    /// Note that the constraints for `Paged` are the same as those for
+    /// `Static`.
+    pub fn try_static_init(&mut self, page_size: u64) {
+        const MAX_IMAGE_SIZE: usize = 1024 * 1024 * 1024; // limit to 1GiB.
+
+        // This method only attempts to transform a `Segmented` memory init
+        // into a `Static` one, no other state.
+        if !self.module.memory_initialization.is_segmented() {
+            return;
+        }
+        let page_align = |x: u64| x & !(page_size - 1);
+        let page_align_up = |x: u64| page_align(x + page_size - 1);
+
+        // First build up an in-memory image for each memory. This in-memory
+        // representation is discarded if the memory initializers aren't "of
+        // the right shape" where the desired shape is:
+        //
+        // * Only initializers for defined memories.
+        // * Only initializers with static offsets (no globals).
+        // * Only in-bound initializers.
+        //
+        // The `init_memory` method of `MemoryInitialization` is used here to
+        // do most of the validation for us, and otherwise the data chunks are
+        // collected into the `images` array here.
+        let mut images: PrimaryMap<MemoryIndex, Vec<u8>> =
+            PrimaryMap::with_capacity(self.module.memory_plans.len());
+        for _ in 0..self.module.memory_plans.len() {
+            images.push(Vec::new());
+        }
+        let mut data = self.data.iter();
+        let ok = self.module.memory_initialization.init_memory(
+            InitMemory::CompileTime(&self.module),
+            &mut |memory, init| {
+                let data = data.next().unwrap();
+                assert_eq!(data.len(), init.data.len());
+
+                // Static initialization with only one memcpy is only possible
+                // for defined memories which have a known-starting-as-zero
+                // state to account for holes between data segments. This means
+                // that if this is an imported memory then static memory
+                // initialization isn't possible.
+                if self.module.defined_memory_index(memory).is_none() {
+                    return false;
+                }
+
+                // Splat the `data_range` into the `image` for this memory,
+                // updating it as necessary with 0s for holes and such.
+                let image = &mut images[memory];
+                let offset = usize::try_from(init.offset).unwrap();
+                let new_image_len = offset + data.len();
+                if image.len() < new_image_len {
+                    if new_image_len > MAX_IMAGE_SIZE {
+                        return false;
+                    }
+                    image.resize(new_image_len, 0);
+                }
+                image[offset..][..data.len()].copy_from_slice(data);
+                true
+            },
+        );
+
+        // If any initializer wasn't applicable then we skip static init
+        // entirely.
+        if !ok {
+            return;
+        }
+
+        // At this point all memories in this module are initialized with
+        // in-bounds initializers which are all statically known to be valid.
+        // This means that the memory `images` built up so far are valid for an
+        // instance of this linear memory. These images are trimmed of their
+        // leading and trailing zeros and then `self.data` is re-populated with
+        // new data.
+        self.data.clear();
+        assert!(self.data_align.is_none());
+        self.data_align = Some(page_size);
+        let mut map = PrimaryMap::with_capacity(images.len());
+        let mut offset = 0u32;
+        for (memory, mut image) in images {
+            // Find the first nonzero byte, and if all the bytes are zero then
+            // we can skip initialization of this memory entirely since
+            // memories otherwise start with all zero bytes.
+            let nonzero_start = match image.iter().position(|b| *b != 0) {
+                Some(i) => i as u64,
+                None => {
+                    map.push(None);
+                    continue;
+                }
+            };
+
+            // Find the last nonzero byte, which must exist at this point since
+            // we found one going forward. Add one to get the index just past
+            // the last nonzero byte, which may also be the length of the
+            // image.
+            let nonzero_end = image.iter().rposition(|b| *b != 0).unwrap() as u64 + 1;
+
+            // The offset and length of this image are now page-aligned. This
+            // isn't strictly required for a runtime-compiled module which is
+            // never persisted to disk. The purpose of doing this, however, is
+            // to make it possible to mmap the in-memory image, if it is
+            // persisted to disk, into an address space at a future date to
+            // enable efficient initialization of memory. Mapping a file into
+            // the address space requires a page-aligned offset in the file as
+            // well as a page-aligned length, so the offset/length are aligned
+            // here.
+            //
+            // Note that in the future if we can distinguish between modules
+            // only ever used at runtime and those persisted to disk and used
+            // later then it's possible to pass a page size parameter to this
+            // function as "1" which means we won't pad the data with extra
+            // zeros or anything like that.
+            let image_offset = page_align(nonzero_start);
+            let image_len = page_align_up(nonzero_end - image_offset);
+
+            assert_eq!(image_offset % page_size, 0);
+            assert_eq!(image_len % page_size, 0);
+
+            // Drop the leading zero bytes on the image, and then also set the
+            // length of the image to the specified length. Note that this may
+            // truncate trailing zeros from the image or it may extend the
+            // image to be page-aligned with zeros.
+            image.drain(..image_offset as usize);
+            image.resize(image_len as usize, 0);
+            self.data.push(image.into());
+
+            // Record how this image is initialized, where `image_offset` is
+            // the offset from the start of linear memory and the length is
+            // added to the `offset` variable outside of this loop which keeps
+            // track of the current offset in the data section. This is used to
+            // build the range `offset..end` which is the range, in the
+            // concatenation of all memory images of `self.data`, of what
+            // memory is located at `image_offset`.
+            let end = offset
+                .checked_add(u32::try_from(image_len).unwrap())
+                .unwrap();
+            let idx = map.push(Some(StaticMemoryInitializer {
+                offset: image_offset,
+                data: offset..end,
+            }));
+            assert_eq!(idx, memory);
+
+            offset = end;
+            assert_eq!(offset % (page_size as u32), 0);
+        }
+
+        self.module.memory_initialization = MemoryInitialization::Static { map };
+    }
+
     /// Attempts to convert the module's table initializers to
     /// FuncTable form where possible. This enables lazy table
     /// initialization later by providing a one-to-one map of initial
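Distilled from the loop above, the trim-and-align step can be read as a standalone function. This is a sketch for exposition; `trim_to_pages` is not a helper that exists in the PR.

/// Strip leading/trailing zero bytes from a memory image, then widen the
/// result back out to page boundaries so it can later be mmap'd. Returns
/// the page-aligned offset in linear memory where the image begins, or
/// `None` if the image is entirely zero (no initialization needed).
fn trim_to_pages(image: &mut Vec<u8>, page_size: u64) -> Option<u64> {
    let page_align = |x: u64| x & !(page_size - 1);
    let page_align_up = |x: u64| page_align(x + page_size - 1);

    let nonzero_start = image.iter().position(|b| *b != 0)? as u64;
    let nonzero_end = image.iter().rposition(|b| *b != 0).unwrap() as u64 + 1;

    let image_offset = page_align(nonzero_start);
    let image_len = page_align_up(nonzero_end - image_offset);

    image.drain(..image_offset as usize); // drop leading zero pages
    image.resize(image_len as usize, 0);  // pad/truncate to a page boundary
    Some(image_offset)
}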
@@ -399,7 +598,7 @@ impl MemoryInitialization {
     pub fn init_memory(
         &self,
         state: InitMemory<'_>,
-        write: &mut dyn FnMut(MemoryIndex, u64, &Range<u32>) -> bool,
+        write: &mut dyn FnMut(MemoryIndex, &StaticMemoryInitializer) -> bool,
     ) -> bool {
         let initializers = match self {
             // Fall through below to the segmented memory one-by-one
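With the new signature every caller receives a `StaticMemoryInitializer` regardless of which variant is in use. A hypothetical caller, just to show the shape of the callback:

// Hypothetical: sum how many bytes of initialization a module performs.
let mut total = 0u64;
let ok = module.memory_initialization.init_memory(
    InitMemory::CompileTime(&module),
    &mut |_memory, init: &StaticMemoryInitializer| {
        total += u64::from(init.data.end - init.data.start);
        true // returning false aborts and makes init_memory return false
    },
);
assert!(ok);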
@@ -413,9 +612,23 @@ impl MemoryInitialization {
             // indices are in-bounds.
             MemoryInitialization::Paged { map } => {
                 for (index, pages) in map {
-                    for (page_index, page) in pages {
-                        debug_assert_eq!(page.end - page.start, WASM_PAGE_SIZE);
-                        let result = write(index, *page_index * u64::from(WASM_PAGE_SIZE), page);
+                    for init in pages {
+                        debug_assert_eq!(init.data.end - init.data.start, WASM_PAGE_SIZE);
+                        let result = write(index, init);
                         if !result {
                             return result;
                         }
                     }
                 }
                 return true;
             }
 
+            // Like `Paged` above everything's already been validated so this
+            // can simply forward through the data.
+            MemoryInitialization::Static { map } => {
+                for (index, init) in map {
+                    if let Some(init) = init {
+                        let result = write(index, init);
+                        if !result {
+                            return result;
+                        }
@@ -484,7 +697,11 @@ impl MemoryInitialization {
                 // The limits of the data segment have been validated at this point
                 // so the `write` callback is called with the range of data being
                 // written. Any erroneous result is propagated upwards.
-                let result = write(memory_index, start, data);
+                let init = StaticMemoryInitializer {
+                    offset: start,
+                    data: data.clone(),
+                };
+                let result = write(memory_index, &init);
                 if !result {
                     return result;
                 }
 
@@ -92,6 +92,14 @@ pub struct ModuleTranslation<'data> {
     /// `MemoryInitializer` type.
     pub data: Vec<Cow<'data, [u8]>>,
 
+    /// The desired alignment of `data` in the final data section of the object
+    /// file that we'll emit.
+    ///
+    /// Note that this is 1 by default but `MemoryInitialization::Static` might
+    /// switch this to a higher alignment to facilitate mmap-ing data from
+    /// an object file into a linear memory.
+    pub data_align: Option<u64>,
+
     /// Total size of all data pushed onto `data` so far.
     total_data: u32,
 
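Downstream, when the object file is emitted, this alignment would be applied to the data section. A sketch using the `object` crate's `append_section_data`; the surrounding names (`obj`, `data_id`, `translation`) are assumptions for illustration:

// Aligning the section to the page size is what lets a file-backed mmap
// later land exactly on each memory image; since every image's length is
// already page-aligned, each chunk starts on a page boundary too.
let align = translation.data_align.unwrap_or(1);
for chunk in translation.data.iter() {
    obj.append_section_data(data_id, chunk, align);
}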