Commit Graph

11 Commits

Author SHA1 Message Date
Alex Crichton
8ed79c8f57 memfd: Reduce some syscalls in the on-demand case (#3757)
* memfd: Reduce some syscalls in the on-demand case

This tweaks the internal organization of the `MemFdSlot` to avoid some
syscalls in the default case as well as opportunistically in the pooling
case. The two cases added here are:

* A `MemFdSlot` is now created with a specified initial size. For
  pooling this is 0 but for the on-demand case this can be non-zero.

* When `instantiate` is called with no prior image and the sizes match
  (as will be the case for on-demand allocation) then `mprotect` is
  skipped entirely.

* In the `clear_and_remain-ready` case the `mprotect` is skipped if the
  heap wasn't grown at all.

This should avoid ever using `mprotect` unnecessarily and makes the
ranges we `mprotect` a bit smaller as well.

* Review comments

* Tweak allow to apply to whole crate
2022-02-02 16:09:47 -06:00
Chris Fallin
9880eba2a8 Skip memfd tests when on qemu, due to differing madvise semantics. 2022-02-02 12:25:20 -08:00
Chris Fallin
d7b04f5ced Review comments. 2022-02-02 11:41:31 -08:00
Chris Fallin
0ec45d3ae4 Add additional tests for MemFdSlot. 2022-02-02 11:33:05 -08:00
Chris Fallin
94410a8d4b Review comments. 2022-02-02 10:03:31 -08:00
Chris Fallin
84a8368e88 Fix to the optimization: mprotect(NONE) sometimes needed after skipping the initial mmap. 2022-02-01 16:34:06 -08:00
Chris Fallin
01e6bb81fb Review feedback. 2022-02-01 15:49:44 -08:00
Chris Fallin
ccfa245261 Optimization: only mprotect the *new* bit of heap, not all of it.
(This was not a correctness bug, but is an obvious performance bug...)
2022-01-31 21:25:40 -08:00
Chris Fallin
982df2f2e5 Review feedback. 2022-01-31 16:40:14 -08:00
Chris Fallin
570dee63f3 Use MemFdSlot in the on-demand allocator as well. 2022-01-31 13:59:51 -08:00
Chris Fallin
b73ac83c37 Add a pooling allocator mode based on copy-on-write mappings of memfds.
As first suggested by Jan on the Zulip here [1], a cheap and effective
way to obtain copy-on-write semantics of a "backing image" for a Wasm
memory is to mmap a file with `MAP_PRIVATE`. The `memfd` mechanism
provided by the Linux kernel allows us to create anonymous,
in-memory-only files that we can use for this mapping, so we can
construct the image contents on-the-fly then effectively create a CoW
overlay. Furthermore, and importantly, `madvise(MADV_DONTNEED, ...)`
will discard the CoW overlay, returning the mapping to its original
state.

By itself this is almost enough for a very fast
instantiation-termination loop of the same image over and over,
without changing the address space mapping at all (which is
expensive). The only missing bit is how to implement
heap *growth*. But here memfds can help us again: if we create another
anonymous file and map it where the extended parts of the heap would
go, we can take advantage of the fact that a `mmap()` mapping can
be *larger than the file itself*, with accesses beyond the end
generating a `SIGBUS`, and the fact that we can cheaply resize the
file with `ftruncate`, even after a mapping exists. So we can map the
"heap extension" file once with the maximum memory-slot size and grow
the memfd itself as `memory.grow` operations occur.

The above CoW technique and heap-growth technique together allow us a
fastpath of `madvise()` and `ftruncate()` only when we re-instantiate
the same module over and over, as long as we can reuse the same
slot. This fastpath avoids all whole-process address-space locks in
the Linux kernel, which should mean it is highly scalable. It also
avoids the cost of copying data on read, as the `uffd` heap backend
does when servicing pagefaults; the kernel's own optimized CoW
logic (same as used by all file mmaps) is used instead.

[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772
2022-01-31 12:53:18 -08:00