Commit Graph

348 Commits

Author SHA1 Message Date
Andrew Brown
2b52f47b83 Add shared memories (#4187)
* Add shared memories

This change adds the ability to use shared memories in Wasmtime when the
[threads proposal] is enabled. Shared memories are annotated as `shared`
in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are
protected from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md

In order to implement this in Wasmtime, there are two main cases to
cover:
    - a program may simply create a shared memory and possibly export it;
    this means that Wasmtime itself must be able to create shared
    memories
    - a user may create a shared memory externally and pass it in as an
    import during instantiation; this is the case when the program
    contains code like `(import "env" "memory" (memory 1 1
    shared))`--this case is handled by a new Wasmtime API
    type--`SharedMemory`

Because of the first case, this change allows any of the current
memory-creation mechanisms to work as-is. Wasmtime can still create
either static or dynamic memories in either on-demand or pooling modes,
and any of these memories can be considered shared. When shared, the
`Memory` runtime container will lock appropriately during `memory.size`
and `memory.grow` operations; since all memories use this container, it
is an ideal place for implementing the locking once and once only.

The second case is covered by the new `SharedMemory` structure. It uses
the same `Mmap` allocation under the hood as non-shared memories, but
allows the user to perform the allocation externally to Wasmtime and
share the memory across threads (via an `Arc`). The pointer address to
the actual memory is carefully wired through and owned by the
`SharedMemory` structure itself. This means that there are differing
views of where to access the pointer (i.e., `VMMemoryDefinition`): for
owned memories (the default), the `VMMemoryDefinition` is stored
directly by the `VMContext`; in the `SharedMemory` case, however, this
`VMContext` must point to this separate structure.

To ensure that the `VMContext` can always point to the correct
`VMMemoryDefinition`, this change alters the `VMContext` structure.
Since a `SharedMemory` owns its own `VMMemoryDefinition`, the
`defined_memories` table in the `VMContext` becomes a sequence of
pointers--in the shared memory case, they point to the
`VMMemoryDefinition` owned by the `SharedMemory` and in the owned memory
case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a
new table, `owned_memories`.

This change adds an additional indirection (through the `*mut
VMMemoryDefinition` pointer) that could add overhead. Using an imported
memory as a proxy, we measured a 1-3% overhead of this approach on the
`pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will
special-case the owned memory access (i.e., load a pointer directly to
the `owned_memories` entry) for `memory.size` so that only
shared memories (and imported memories, as before) incur the indirection
cost.

* review: remove thread feature check

* review: swap wasmtime-types dependency for existing wasmtime-environ use

* review: remove unused VMMemoryUnion

* review: reword cross-engine error message

* review: improve tests

* review: refactor to separate prevent Memory <-> SharedMemory conversion

* review: into_shared_memory -> as_shared_memory

* review: remove commented out code

* review: limit shared min/max to 32 bits

* review: skip imported memories

* review: imported memories are not owned

* review: remove TODO

* review: document unsafe send + sync

* review: add limiter assertion

* review: remove TODO

* review: improve tests

* review: fix doc test

* fix: fixes based on discussion with Alex

This changes several key parts:
 - adds memory indexes to imports and exports
 - makes `VMMemoryDefinition::current_length` an atomic usize

* review: add `Extern::SharedMemory`

* review: remove TODO

* review: atomically load from VMMemoryDescription in JIT-generated code

* review: add test probing the last available memory slot across threads

* fix: move assertion to new location due to rebase

* fix: doc link

* fix: add TODOs to c-api

* fix: broken doc link

* fix: modify pooling allocator messages in tests

* review: make owned_memory_index panic instead of returning an option

* review: clarify calculation of num_owned_memories

* review: move 'use' to top of file

* review: change '*const [u8]' to '*mut [u8]'

* review: remove TODO

* review: avoid hard-coding memory index

* review: remove 'preallocation' parameter from 'Memory::_new'

* fix: component model memory length

* review: check that shared memory plans are static

* review: ignore growth limits for shared memory

* review: improve atomic store comment

* review: add FIXME for memory growth failure

* review: add comment about absence of bounds-checked 'memory.size'

* review: make 'current_length()' doc comment more precise

* review: more comments related to memory.size non-determinism

* review: make 'vmmemory' unreachable for shared memory

* review: move code around

* review: thread plan through to 'wrap()'

* review: disallow shared memory allocation with the pooling allocator
2022-06-08 12:13:40 -05:00
wasmtime-publish
55946704cb Bump Wasmtime to 0.39.0 (#4225)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-06-06 09:12:47 -05:00
Alex Crichton
2af358dd9c Add a VMComponentContext type and create it on instantiation (#4215)
* Add a `VMComponentContext` type and create it on instantiation

This commit fills out the `wasmtime-runtime` crate's support for
`VMComponentContext` and creates it as part of the instantiation
process. This moves a few maps that were temporarily allocated in an
`InstanceData` into the `VMComponentContext` and additionally reads the
canonical options data from there instead.

This type still won't be used in its "full glory" until the lowering of
host functions is completely implemented, however, which will be coming
in a future commit.

* Remove `DerefMut` implementation

* Rebase conflicts
2022-06-03 13:34:50 -05:00
Alex Crichton
3ed6fae7b3 Add trampoline compilation support for lowered imports (#4206)
* Add trampoline compilation support for lowered imports

This commit adds support to the component model implementation for
compiling trampolines suitable for calling host imports. Currently this
is purely just the compilation side of things, modifying the
wasmtime-cranelift crate and additionally filling out a new
`VMComponentOffsets` type (similar to `VMOffsets`). The actual creation
of a `VMComponentContext` is still not performed and will be a
subsequent PR.

Internally though some tests are actually possible with this where we at
least assert that compilation of a component and creation of everything
in-memory doesn't panic or trip any assertions, so some tests are added
here for that as well.

* Fix some test errors
2022-06-03 10:01:42 -05:00
Alex Crichton
9f5f978baa Fix double-counting imports in VMOffsets calculations (#4209)
* Fix double-counting imports in `VMOffsets` calculations

This fixes an oversight in the initial creation of `VMOffsets` for a
module to avoid double-counting imported globals, tables, and memories
for calculating the size of the `VMContext`. Prior to this PR imported
items are accidentally also counted as defined items for sizing
calculations meaning that when a memory is imported but not defined, for
example, the `VMContext` will have a space for an inline
`VMMemoryDefinition` when it doesn't need to.

Auditing where all this relates to it appears that the only issue from
this mistake is that `VMContext` is a bit larger than it would otherwise
need to be. Extra slots are uninitialized memory but nothing in Wasmtime
ever actually accesses the memory either, so it should be harmless to
have extra space here. Nevertheless it seems better to shrink the size
as much as possible to avoid wasting space where we can.

* Fix tests
2022-06-02 13:39:38 -05:00
Alex Crichton
2a4851ad2b Change some VMContext pointers to () pointers (#4190)
* Change some `VMContext` pointers to `()` pointers

This commit is motivated by my work on the component model
implementation for imported functions. Currently all context pointers in
wasm are `*mut VMContext` but with the component model my plan is to
make some pointers instead along the lines of `*mut VMComponentContext`.
In doing this though one worry I have is breaking what has otherwise
been a core invariant of Wasmtime for quite some time, subtly
introducing bugs by accident.

To help assuage my worry I've opted here to erase knowledge of
`*mut VMContext` where possible. Instead where applicable a context
pointer is simply known as `*mut ()` and the embedder doesn't actually
know anything about this context beyond the value of the pointer. This
will help prevent Wasmtime from accidentally ever trying to interpret
this context pointer as an actual `VMContext` when it might instead be a
`VMComponentContext`.

Overall this was a pretty smooth transition. The main change here is
that the `VMTrampoline` (now sporting more docs) has its first argument
changed to `*mut ()`. The second argument, the caller context, is still
configured as `*mut VMContext` though because all functions are always
called from wasm still. Eventually for component-to-component calls I
think we'll probably "fake" the second argument as the same as the first
argument, losing track of the original caller, as an intentional way of
isolating components from each other.

Along the way there are a few host locations which do actually assume
that the first argument is indeed a `VMContext`. These are valid
assumptions that are upheld from a correct implementation, but I opted
to add a "magic" field to `VMContext` to assert this in debug mode. This
new "magic" field is inintialized during normal vmcontext initialization
and it's checked whenever a `VMContext` is reinterpreted as an
`Instance` (but only in debug mode). My hope here is to catch any future
accidental mistakes, if ever.

* Use a VMOpaqueContext wrapper

* Fix typos
2022-06-01 11:00:43 -05:00
Alex Crichton
7d3639522e Capture unresolved backtraces on traps (#4193)
I was running tests recently and was surprised that the `--test all`
test was taking more than a minute to run when I didn't recall it ever
taking more than a minute historically. A bisection pointed out #4183 as
the cause and after re-reviewing I realized I forgot that we capture
unresolved backtraces by default (and don't actually resolve them
anywhere yet but that's a problem for another day) rather than resolved
backtraces. This means that it's intended that we use
`Backtrace::new_unresolved` instead of `Backtrace::new` in the
traphandlers crate.

The reason that tests were running so slowly is that the tests which
deal with deep stacks (e.g. stack overflow) would take forever in
testing as the Rust-based decoding of DWARF information is egregiously
slow in unoptimized mode. I did discover independently that optimizing
these dependencies makes the tests ~6x faster, but that's irrelevant if
we're not symbolicating in the first place.
2022-05-31 09:56:56 -05:00
Pat Hickey
bffce37050 make backtrace collection a Config field rather than a cargo feature (#4183)
* sorta working in runtime

* wasmtime-runtime: get rid of wasm-backtrace feature

* wasmtime: factor to make backtraces recording optional. not configurable yet

* get rid of wasm-backtrace features

* trap tests: now a Trap optionally contains backtrace

* eliminate wasm-backtrace feature

* code review fixes

* ci: no more wasm-backtrace feature

* c_api: backtraces always enabled

* config: unwind required by backtraces and ref types

* plumbed

* test that disabling backtraces works

* code review comments

* fuzzing generator: wasm_backtrace is a runtime config now

* doc fix
2022-05-25 12:25:50 -07:00
Alex Crichton
a02a609528 Make ValRaw fields private (#4186)
* Make `ValRaw` fields private

Force accessing to go through constructors and accessors to localize the
knowledge about little-endian-ness. This is spawned since I made a
mistake in #4039 about endianness.

* Fix some tests

* Component model changes
2022-05-24 19:14:29 -05:00
wasmtime-publish
9a6854456d Bump Wasmtime to 0.38.0 (#4103)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-05-05 13:43:02 -05:00
Alex Crichton
7fdc616368 Remove the Paged memory initialization variant (#4046)
* Remove the `Paged` memory initialization variant

This commit simplifies the `MemoryInitialization` enum by removing the
`Paged` variant. The `Paged` variant was originally added for uffd, but
that support has now been removed in #4040. This is no longer necessary
but is still used as an intermediate step of becoming a `Static` variant
of initialized memory (which copy-on-write uses). As a result this
commit largely modifies the static initialization of memory steps and
folds the two methods together.

* Apply suggestions from code review

Co-authored-by: Peter Huene <peter@huene.dev>

Co-authored-by: Peter Huene <peter@huene.dev>
2022-05-05 09:44:48 -05:00
Andrew Brown
3dbdcfa220 runtime: refactor Memory to always use Box<dyn RuntimeLinearMemory> (#4086)
While working with the runtime `Memory` object, it became clear that
some refactoring was needed. In order to implement shared memory from
the threads proposal, we must be able to atomically change the memory
size. Previously, the split into variants, `Memory::Static` and
`Memory::Dynamic`, made any attempt to lock forced us to duplicate logic
in various places.

This change moves `enum Memory { Static..., Dynamic... }` to simply
`struct Memory(Box<dyn RuntimeLinearMemory>)`. A new type,
`ExternalMemory`, takes the place of `Memory::Static` and also
implements the `RuntimeLinearMemory` trait, allowing `Memory` to contain
the same two options as before: `MmapMemory` for `Memory::Dynamic` and
`ExternalMemory` for `Memory::Static`. To interface with the
`PoolingAllocator`, this change also required the ability to downcast to
the internal representation.
2022-04-29 08:12:38 -07:00
Dan Gohman
321124ad21 Update to rustix 0.33.7. (#4052)
This pulls in the fix for bytecodealliance/rustix#285, which fixes a
failure in the WASI `time` APIs on powerpc64.
2022-04-19 16:27:56 -07:00
Alex Crichton
90791a0e32 Reduce contention on the global module rwlock (#4041)
* Reduce contention on the global module rwlock

This commit intendes to close #4025 by reducing contention on the global
rwlock Wasmtime has for module information during instantiation and
dropping a store. Currently registration of a module into this global
map happens during instantiation, but this can be a hot path as
embeddings may want to, in parallel, instantiate modules.

Instead this switches to a strategy of inserting into the global module
map when a `Module` is created and then removing it from the map when
the `Module` is dropped. Registration in a `Store` now preserves the
entire `Module` within the store as opposed to trying to only save it
piecemeal. In reality the only piece that wasn't saved within a store
was the `TypeTables` which was pretty inconsequential for core wasm
modules anyway.

This means that instantiation should now clone a singluar `Arc` into a
`Store` per `Module` (previously it cloned two) with zero managemnt on
the global rwlock as that happened at `Module` creation time.
Additionally dropping a `Store` again involves zero rwlock management
and only a single `Arc` drop per-instantiated module (previously it was
two).

In the process of doing this I also went ahead and removed the
`Module::new_with_name` API. This has been difficult to support
historically with various variations on the internals of `ModuleInner`
because it involves mutating a `Module` after it's been created. My hope
is that this API is pretty rarely used and/or isn't super important, so
it's ok to remove.

Finally this change removes some internal `Arc` layerings that are no
longer necessary, attempting to use either `T` or `&T` where possible
without dealing with the overhead of an `Arc`.

Closes #4025

* Move back to a `BTreeMap` in `ModuleRegistry`
2022-04-19 15:13:47 -05:00
Alex Crichton
3f3afb455e Remove support for userfaultfd (#4040)
This commit removes support for the `userfaultfd` or "uffd" syscall on
Linux. This support was originally added for users migrating from Lucet
to Wasmtime, but the recent developments of kernel-supported
copy-on-write support for memory initialization wound up being more
appropriate for these use cases than usefaultfd. The main reason for
moving to copy-on-write initialization are:

* The `userfaultfd` feature was never necessarily intended for this
  style of use case with wasm and was susceptible to subtle and rare
  bugs that were extremely difficult to track down. We were never 100%
  certain that there were kernel bugs related to userfaultfd but the
  suspicion never went away.

* Handling faults with userfaultfd was always slow and single-threaded.
  Only one thread could handle faults and traveling to user-space to
  handle faults is inherently slower than handling them all in the
  kernel. The single-threaded aspect in particular presented a
  significant scaling bottleneck for embeddings that want to run many
  wasm instances in parallel.

* One of the major benefits of userfaultfd was lazy initialization of
  wasm linear memory which is also achieved with the copy-on-write
  initialization support we have right now.

* One of the suspected benefits of userfaultfd was less frobbing of the
  kernel vma structures when wasm modules are instantiated. Currently
  the copy-on-write support has a mitigation where we attempt to reuse
  the memory images where possible to avoid changing vma structures.
  When comparing this to userfaultfd's performance it was found that
  kernel modifications of vmas aren't a worrisome bottleneck so
  copy-on-write is suitable for this as well.

Overall there are no remaining benefits that userfaultfd gives that
copy-on-write doesn't, and copy-on-write solves a major downsides of
userfaultfd, the scaling issue with a single faulting thread.
Additionally copy-on-write support seems much more robust in terms of
kernel implementation since it's only using standard memory-management
syscalls which are heavily exercised. Finally copy-on-write support
provides a new bonus where read-only memory in WebAssembly can be mapped
directly to the same kernel cache page, even amongst many wasm instances
of the same module, which was never possible with userfaultfd.

In light of all this it's expected that all users of userfaultfd should
migrate to the copy-on-write initialization of Wasmtime (which is
enabled by default).
2022-04-18 12:42:26 -05:00
Alex Crichton
51d82aebfd Store the ValRaw type in little-endian format (#4035)
* Store the `ValRaw` type in little-endian format

This commit changes the internal representation of the `ValRaw` type to
an unconditionally little-endian format instead of its current
native-endian format. The documentation and various accessors here have
been updated as well as the associated trampolines that read `ValRaw`
to always work with little-endian values, converting to the host
endianness as necessary.

The motivation for this change originally comes from the implementation
of the component model that I'm working on. One aspect of the component
model's canonical ABI is how variants are passed to functions as
immediate arguments. For example for a component model function:

```
foo: function(x: expected<i32, f64>)
```

This translates to a core wasm function:

```wasm
(module
  (func (export "foo") (param i32 i64)
    ;; ...
  )
)
```

The first `i32` parameter to the core wasm function is the discriminant
of whether the result is an "ok" or an "err". The second `i64`, however,
is the "join" operation on the `i32` and `f64` payloads. Essentially
these two types are unioned into one type to get passed into the function.

Currently in the implementation of the component model my plan is to
construct a `*mut [ValRaw]` to pass through to WebAssembly, always
invoking component exports through host trampolines. This means that the
implementation for `Result<T, E>` needs to do the correct "join"
operation here when encoding a particular case into the corresponding
`ValRaw`.

I personally found this particularly tricky to do structurally. The
solution that I settled on with fitzgen was that if `ValRaw` was always
stored in a little endian format then we could employ a trick where when
encoding a variant we first set all the `ValRaw` slots to zero, then the
associated case we have is encoding. Afterwards the `ValRaw` values are
already encoded into the correct format as if they'd been "join"ed.

For example if we were to encode `Ok(1i32)` then this would produce
`ValRaw { i32: 1 }`, which memory-wise is equivalent to `ValRaw { i64: 1 }`
if the other bytes in the `ValRaw` are guaranteed to be zero. Similarly
storing `ValRaw { f64 }` is equivalent to the storage required for
`ValRaw { i64 }` here in the join operation.

Note, though, that this equivalence relies on everything being
little-endian. Otherwise the in-memory representations of `ValRaw { i32: 1 }`
and `ValRaw { i64: 1 }` are different.

That motivation is what leads to this change. It's expected that this is
a low-to-zero cost change in the sense that little-endian platforms will
see no change and big-endian platforms are already required to
efficiently byte-swap loads/stores as WebAssembly requires that.
Additionally the `ValRaw` type is an esoteric niche use case primarily
used for accelerating the C API right now, so it's expected that not
many users will have to update for this change.

* Track down some more endianness conversions
2022-04-14 13:09:32 -05:00
Yang Hau
bfae6384aa fix typo (#4030) 2022-04-14 09:35:53 -05:00
Dan Gohman
ade04c92c2 Update to rustix 0.33.6. (#4022)
Relevant to Wasmtime, this fixes undefined references to `utimensat` and
`futimens` on macOS 10.12 and earlier. See bytecodealliance/rustix#157
for details.

It also contains a fix for s390x which isn't currently needed by Wasmtime
itself, but which is needed to make rustix's own testsuite pass on s390x,
which helps people packaging rustix for use in Wasmtime. See
bytecodealliance/rustix#277 for details.
2022-04-13 11:51:57 -05:00
wasmtime-publish
78a595ac88 Bump Wasmtime to 0.37.0 (#3994)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-04-05 09:24:28 -05:00
Alex Crichton
7b5176baea Upgrade all crates to the Rust 2021 edition (#3991)
* Upgrade all crates to the Rust 2021 edition

I've personally started using the new format strings for things like
`panic!("some message {foo}")` or similar and have been upgrading crates
on a case-by-case basis, but I think it probably makes more sense to go
ahead and blanket upgrade everything so 2021 features are always
available.

* Fix compile of the C API

* Fix a warning

* Fix another warning
2022-04-04 12:27:12 -05:00
Alex Crichton
c89dc55108 Add a two-week delay to Wasmtime's release process (#3955)
* Bump to 0.36.0

* Add a two-week delay to Wasmtime's release process

This commit is a proposal to update Wasmtime's release process with a
two-week delay from branching a release until it's actually officially
released. We've had two issues lately that came up which led to this proposal:

* In #3915 it was realized that changes just before the 0.35.0 release
  weren't enough for an embedding use case, but the PR didn't meet the
  expectations for a full patch release.

* At Fastly we were about to start rolling out a new version of Wasmtime
  when over the weekend the fuzz bug #3951 was found. This led to the
  desire internally to have a "must have been fuzzed for this long"
  period of time for Wasmtime changes which we felt were better
  reflected in the release process itself rather than something about
  Fastly's own integration with Wasmtime.

This commit updates the automation for releases to unconditionally
create a `release-X.Y.Z` branch on the 5th of every month. The actual
release from this branch is then performed on the 20th of every month,
roughly two weeks later. This should provide a period of time to ensure
that all changes in a release are fuzzed for at least two weeks and
avoid any further surprises. This should also help with any last-minute
changes made just before a release if they need tweaking since
backporting to a not-yet-released branch is much easier.

Overall there are some new properties about Wasmtime with this proposal
as well:

* The `main` branch will always have a section in `RELEASES.md` which is
  listed as "Unreleased" for us to fill out.
* The `main` branch will always be a version ahead of the latest
  release. For example it will be bump pre-emptively as part of the
  release process on the 5th where if `release-2.0.0` was created then
  the `main` branch will have 3.0.0 Wasmtime.
* Dates for major versions are automatically updated in the
  `RELEASES.md` notes.

The associated documentation for our release process is updated and the
various scripts should all be updated now as well with this commit.

* Add notes on a security patch

* Clarify security fixes shouldn't be previewed early on CI
2022-04-01 13:11:10 -05:00
Alex Crichton
353f1b48ab Split wasmtime-runtime's single getter into typed getters (#3987)
This splits the existing `lookup_by_declaration` function into a
lookup-per-type-of-item. This refactor ends up cleaning up a fair bit of
code in the `wasmtime` crate by removing a number of `unreachable!()`
blocks which are now no longer necessary.
2022-03-31 16:24:42 -05:00
Alex Crichton
453feb6f82 Remove some dead code (#3970)
This commit removes methods that are never used between crates or trait
impls like `Clone` which may have been used one day but are no longer used.
2022-03-30 13:51:34 -05:00
Dan Gohman
819b61b661 Update to rustix 0.33.5, to fix a link error on Android (#3966)
* Update to rustix 0.33.5, to fix a link error on Android

This updates to rustix 0.33.5, which includes bytecodealliance/rustix#258,
which fixes bytecodealliance/rustix#256, a link error on Android.

Fixes #3965.

* Bump the rustix versions in the Cargo.toml files too.
2022-03-29 10:17:10 -07:00
Alex Crichton
76b82910c9 Remove the module linking implementation in Wasmtime (#3958)
* Remove the module linking implementation in Wasmtime

This commit removes the experimental implementation of the module
linking WebAssembly proposal from Wasmtime. The module linking is no
longer intended for core WebAssembly but is instead incorporated into
the component model now at this point. This means that very large parts
of Wasmtime's implementation of module linking are no longer applicable
and would change greatly with an implementation of the component model.

The main purpose of this is to remove Wasmtime's reliance on the support
for module-linking in `wasmparser` and tooling crates. With this
reliance removed we can move over to the `component-model` branch of
`wasmparser` and use the updated support for the component model.
Additionally given the trajectory of the component model proposal the
embedding API of Wasmtime will not look like what it looks like today
for WebAssembly. For example the core wasm `Instance` will not change
and instead a `Component` is likely to be added instead.

Some more rationale for this is in #3941, but the basic idea is that I
feel that it's not going to be viable to develop support for the
component model on a non-`main` branch of Wasmtime. Additionaly I don't
think it's viable, for the same reasons as `wasm-tools`, to support the
old module linking proposal and the new component model at the same
time.

This commit takes a moment to not only delete the existing module
linking implementation but some abstractions are also simplified. For
example module serialization is a bit simpler that there's only one
module. Additionally instantiation is much simpler since the only
initializer we have to deal with are imports and nothing else.

Closes #3941

* Fix doc link

* Update comments
2022-03-23 14:57:34 -05:00
Alex Crichton
3f9bff17c8 Support disabling backtraces at compile time (#3932)
* Support disabling backtraces at compile time

This commit adds support to Wasmtime to disable, at compile time, the
gathering of backtraces on traps. The `wasmtime` crate now sports a
`wasm-backtrace` feature which, when disabled, will mean that backtraces
are never collected at compile time nor are unwinding tables inserted
into compiled objects.

The motivation for this commit stems from the fact that generating a
backtrace is quite a slow operation. Currently backtrace generation is
done with libunwind and `_Unwind_Backtrace` typically found in glibc or
other system libraries. When thousands of modules are loaded into the
same process though this means that the initial backtrace can take
nearly half a second and all subsequent backtraces can take upwards of
hundreds of milliseconds. Relative to all other operations in Wasmtime
this is extremely expensive at this time. In the future we'd like to
implement a more performant backtrace scheme but such an implementation
would require coordination with Cranelift and is a big chunk of work
that may take some time, so in the meantime if embedders don't need a
backtrace they can still use this option to disable backtraces at
compile time and avoid the performance pitfalls of collecting
backtraces.

In general I tried to originally make this a runtime configuration
option but ended up opting for a compile-time option because `Trap::new`
otherwise has no arguments and always captures a backtrace. By making
this a compile-time option it was possible to configure, statically, the
behavior of `Trap::new`. Additionally I also tried to minimize the
amount of `#[cfg]` necessary by largely only having it at the producer
and consumer sites.

Also a noteworthy restriction of this implementation is that if
backtrace support is disabled at compile time then reference types
support will be unconditionally disabled at runtime. With backtrace
support disabled there's no way to trace the stack of wasm frames which
means that GC can't happen given our current implementation.

* Always enable backtraces for the C API
2022-03-16 09:18:16 -05:00
Alex Crichton
c22033bf93 Delete historical interruptable support in Wasmtime (#3925)
* Delete historical interruptable support in Wasmtime

This commit removes the `Config::interruptable` configuration along with
the `InterruptHandle` type from the `wasmtime` crate. The original
support for adding interruption to WebAssembly was added pretty early on
in the history of Wasmtime when there was no other method to prevent an
infinite loop from the host. Nowadays, however, there are alternative
methods for interruption such as fuel or epoch-based interruption.

One of the major downsides of `Config::interruptable` is that even when
it's not enabled it forces an atomic swap to happen when entering
WebAssembly code. This technically could be a non-atomic swap if the
configuration option isn't enabled but that produces even more branch-y
code on entry into WebAssembly which is already something we try to
optimize. Calling into WebAssembly is on the order of a dozens of
nanoseconds at this time and an atomic swap, even uncontended, can add
up to 5ns on some platforms.

The main goal of this PR is to remove this atomic swap on entry into
WebAssembly. This is done by removing the `Config::interruptable` field
entirely, moving all existing consumers to epochs instead which are
suitable for the same purposes. This means that the stack overflow check
is no longer entangled with the interruption check and perhaps one day
we could continue to optimize that further as well.

Some consequences of this change are:

* Epochs are now the only method of remote-thread interruption.
* There are no more Wasmtime traps that produces the `Interrupted` trap
  code, although we may wish to move future traps to this so I left it
  in place.
* The C API support for interrupt handles was also removed and bindings
  for epoch methods were added.
* Function-entry checks for interruption are a tiny bit less efficient
  since one check is performed for the stack limit and a second is
  performed for the epoch as opposed to the `Config::interruptable`
  style of bundling the stack limit and the interrupt check in one. It's
  expected though that this is likely to not really be measurable.
* The old `VMInterrupts` structure is renamed to `VMRuntimeLimits`.
2022-03-14 15:25:11 -05:00
Alex Crichton
62a6a7ab6c Use const-initialized thread locals (#3923)
This was a relatively recent feature added to the Rust standard library
which should help accelerate calls into WebAssembly slightly.
2022-03-14 12:29:58 -05:00
wasmtime-publish
9137b4a50e Bump Wasmtime to 0.35.0 (#3885)
[automatically-tag-and-release-this-commit]

Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-03-07 15:18:34 -06:00
Andrew Brown
a7567cb9ec typo: fix typos in documentation comments (#3882) 2022-03-04 10:16:37 -08:00
Alex Crichton
1fb71fa1ea Remove some asserts in MemoryImage::new (#3874)
This commit removes some `.unwrap()` annotations around casts between
integers to either be infallible or handle errors. This fixes a panic in
a fuzz test case that popped out for memory64-using modules. The actual
issue here is pretty benign, we were just too eager about assuming
things fit into 32-bit.
2022-03-02 14:04:59 -06:00
Alex Crichton
2a6969d2bd Shrink the size of the anyfunc table in VMContext (#3850)
* Shrink the size of the anyfunc table in `VMContext`

This commit shrinks the size of the `VMCallerCheckedAnyfunc` table
allocated into a `VMContext` to be the size of the number of "escaped"
functions in a module rather than the number of functions in a module.
Escaped functions include exports, table elements, etc, and are
typically an order of magnitude smaller than the number of functions in
general. This should greatly shrink the `VMContext` for some modules
which while we aren't necessarily having any problems with that today
shouldn't cause any problems in the future.

The original motivation for this was that this came up during the recent
lazy-table-initialization work and while it no longer has a direct
performance benefit since tables aren't initialized at all on
instantiation it should still improve long-running instances
theoretically with smaller `VMContext` allocations as well as better
locality between anyfuncs.

* Fix some tests

* Remove redundant hash set

* Use a helper for pushing function type information

* Use a more descriptive `is_escaping` method

* Clarify a comment

* Fix condition
2022-02-28 10:11:04 -06:00
Alex Crichton
15bb0c6903 Remove the ModuleLimits pooling configuration structure (#3837)
* Remove the `ModuleLimits` pooling configuration structure

This commit is an attempt to improve the usability of the pooling
allocator by removing the need to configure a `ModuleLimits` structure.
Internally this structure has limits on all forms of wasm constructs but
this largely bottoms out in the size of an allocation for an instance in
the instance pooling allocator. Maintaining this list of limits can be
cumbersome as modules may get tweaked over time and there's otherwise no
real reason to limit the number of globals in a module since the main
goal is to limit the memory consumption of a `VMContext` which can be
done with a memory allocation limit rather than fine-tuned control over
each maximum and minimum.

The new approach taken in this commit is to remove `ModuleLimits`. Some
fields, such as `tables`, `table_elements` , `memories`, and
`memory_pages` are moved to `InstanceLimits` since they're still
enforced at runtime. A new field `size` is added to `InstanceLimits`
which indicates, in bytes, the maximum size of the `VMContext`
allocation. If the size of a `VMContext` for a module exceeds this value
then instantiation will fail.

This involved adding a few more checks to `{Table, Memory}::new_static`
to ensure that the minimum size is able to fit in the allocation, since
previously modules were validated at compile time of the module that
everything fit and that validation no longer happens (it happens at
runtime).

A consequence of this commit is that Wasmtime will have no built-in way
to reject modules at compile time if they'll fail to be instantiated
within a particular pooling allocator configuration. Instead a module
must attempt instantiation see if a failure happens.

* Fix benchmark compiles

* Fix some doc links

* Fix a panic by ensuring modules have limited tables/memories

* Review comments

* Add back validation at `Module` time instantiation is possible

This allows for getting an early signal at compile time that a module
will never be instantiable in an engine with matching settings.

* Provide a better error message when sizes are exceeded

Improve the error message when an instance size exceeds the maximum by
providing a breakdown of where the bytes are all going and why the large
size is being requested.

* Try to fix test in qemu

* Flag new test as 64-bit only

Sizes are all specific to 64-bit right now
2022-02-25 09:11:51 -06:00
Alex Crichton
49c2b1e60a Fix image reuse with multi-memory images (#3846)
This commit fixes a potential issue where the fast-path instantiate in
`MemoryImageSlot` where when the previous image is compared against the
new image it only performed file descriptor equality, but nowadays with
loading images from `*.cwasm` files there might be multiple images in
the same file so the offsets also need to be considered. I think this
isn't really easy to hit today, it would require combining both module
linking and multi-memory which gets into the realm of being pretty
esoteric so I haven't added a test case here for this.
2022-02-23 16:41:38 -06:00
Alex Crichton
434e35c490 Panic on resetting image slots back to anonymous memory (#3841)
* Panic on resetting image slots back to anonymous memory

This commit updates `Drop for MemoryImageSlot` to panic instead of
ignoring errors when resetting memory back to a clean slate. On reading
some of this code again for a different change I realized that if an
error happens in `reset_with_anon_memory` it would be possible,
depending on where another error happened, to leak memory from one image
to another.

For example if `clear_and_remain_ready` failed its `madvise` (for
whatever reason) and didn't actually reset any memory, then if `Drop for
MemoryImageSlot` also hit an error trying to remap memory (for whatever
reason), then nothing about memory has changed and when the
`MemoryImageSlot` is recreated it'll think that it's 0-length when
actually it's a bit larger and may leak data.

I don't think this is a serious problem since we don't know any
situation under which the `madvise` would fail and/or the resetting with
anonymous memory, but given that these aren't expected to fail I figure
it's best to be a bit more defensive here and/or loud about failures.

* Update a comment
2022-02-23 14:00:06 -06:00
Alex Crichton
bbd4a4a500 Enable copy-on-write heap initialization by default (#3825)
* Enable copy-on-write heap initialization by default

This commit enables the `Config::memfd` feature by default now that it's
been fuzzed for a few weeks on oss-fuzz, and will continue to be fuzzed
leading up to the next release of Wasmtime in early March. The
documentation of the `Config` option has been updated as well as adding
a CLI flag to disable the feature.

* Remove ubiquitous "memfd" terminology

Switch instead to forms of "memory image" or "cow" or some combination
thereof.

* Update new option names
2022-02-22 17:12:18 -06:00
bjorn3
4ed353a7e1 Extract jit_int.rs and most of jitdump_linux.rs for use outside of wasmtime (#2744)
* Extract gdb jit_int into wasmtime-jit-debug

* Move a big chunk of the jitdump code to wasmtime-jit-debug

* Fix doc markdown in perf_jitdump.rs
2022-02-22 09:23:44 -08:00
Peter Huene
ef17a36852 Port fix for CVE-2022-23636 to main. (#3818)
* Port fix for `CVE-2022-23636` to `main`.

This commit ports the fix for `CVE-2022-23636` to `main`, but performs a
refactoring that makes it unnecessary for the instance itself to track if it
has been initialized; such a change was not targeted enough for a security
patch.

The pooling allocator will now only initialize an instance if all of its
associated resource creation succeeds. If the resource creation fails, no
instance is dropped as none was initialized.

Also updates `RELEASES.md` to include the related patch releases.

* Add `Instance::new_at` to fully initialize an instance.

Added `Instance::new_at` to fully initialize an instance at a given address.

This will hopefully prevent the possibility that an `Instance` structure
doesn't have an initialized `VMContext` when it is dropped.
2022-02-16 17:51:14 -06:00
Alex Crichton
b438617e12 Further minor optimizations to instantiation (#3791)
* Shrink the size of `FuncData`

Before this commit on a 64-bit system the `FuncData` type had a size of
88 bytes and after this commit it has a size of 32 bytes. A `FuncData`
is required for all host functions in a store, including those inserted
from a `Linker` into a store used during linking. This means that
instantiation ends up creating a nontrivial number of these types and
pushing them into the store. Looking at some profiles there were some
surprisingly expensive movements of `FuncData` from the stack to a
vector for moves-by-value generated by Rust. Shrinking this type enables
more efficient code to be generated and additionally means less storage
is needed in a store's function array.

For instantiating the spidermonkey and rustpython modules this improves
instantiation by 10% since they each import a fair number of host
functions and the speedup here is relative to the number of items
imported.

* Use `ptr::copy_nonoverlapping` during initialization

Prevoiusly `ptr::copy` was used for copying imports into place which
translates to `memmove`, but `ptr::copy_nonoverlapping` can be used here
since it's statically known these areas don't overlap. While this
doesn't end up having a performance difference it's something I kept
noticing while looking at the disassembly of `initialize_vmcontext` so I
figured I'd go ahead and implement.

* Indirect shared signature ids in the VMContext

This commit is a small improvement for the instantiation time of modules
by avoiding copying a list of `VMSharedSignatureIndex` entries into each
`VMContext`, instead building one inside of a module and sharing that
amongst all instances. This involves less lookups at instantiation time
and less movement of data during instantiation. The downside is that
type-checks on `call_indirect` now involve an additionally load, but I'm
assuming that these are somewhat pessimized enough as-is that the
runtime impact won't be much there.

For instantiation performance this is a 5-10% win with
rustpyhon/spidermonky instantiation. This should also reduce the size of
each `VMContext` for an instantiation since signatures are no longer
stored inline but shared amongst all instances with one module.

Note that one subtle change here is that the array of
`VMSharedSignatureIndex` was previously indexed by `TypeIndex`, and now
it's indexed by `SignaturedIndex` which is a deduplicated form of
`TypeIndex`. This is done because we already had a list of those lying
around in `Module`, so it was easier to reuse that than to build a
separate array and store it somewhere.

* Reserve space in `Store<T>` with `InstancePre`

This commit updates the instantiation process to reserve space in a
`Store<T>` for the functions that an `InstancePre<T>`, as part of
instantiation, will insert into it. Using an `InstancePre<T>` to
instantiate allows pre-computing the number of host functions that will
be inserted into a store, and by pre-reserving space we can avoid costly
reallocations during instantiation by ensuring the function vector has
enough space to fit everything during the instantiation process.

Overall this makes instantiation of rustpython/spidermonkey about 8%
faster locally.

* Fix tests

* Use checked arithmetic
2022-02-11 09:55:08 -06:00
Alex Crichton
c0c368d151 Use mmap'd *.cwasm as a source for memory initialization images (#3787)
* Skip memfd creation with precompiled modules

This commit updates the memfd support internally to not actually use a
memfd if a compiled module originally came from disk via the
`wasmtime::Module::deserialize_file` API. In this situation we already
have a file descriptor open and there's no need to copy a module's heap
image to a new file descriptor.

To facilitate a new source of `mmap` the currently-memfd-specific-logic
of creating a heap image is generalized to a new form of
`MemoryInitialization` which is attempted for all modules at
module-compile-time. This means that the serialized artifact to disk
will have the memory image in its entirety waiting for us. Furthermore
the memory image is ensured to be padded and aligned carefully to the
target system's page size, notably meaning that the data section in the
final object file is page-aligned and the size of the data section is
also page aligned.

This means that when a precompiled module is mapped from disk we can
reuse the underlying `File` to mmap all initial memory images. This
means that the offset-within-the-memory-mapped-file can differ for
memfd-vs-not, but that's just another piece of state to track in the
memfd implementation.

In the limit this waters down the term "memfd" for this technique of
quickly initializing memory because we no longer use memfd
unconditionally (only when the backing file isn't available).
This does however open up an avenue in the future to porting this
support to other OSes because while `memfd_create` is Linux-specific
both macOS and Windows support mapping a file with copy-on-write. This
porting isn't done in this PR and is left for a future refactoring.

Closes #3758

* Enable "memfd" support on all unix systems

Cordon off the Linux-specific bits and enable the memfd support to
compile and run on platforms like macOS which have a Linux-like `mmap`.
This only works if a module is mapped from a precompiled module file on
disk, but that's better than not supporting it at all!

* Fix linux compile

* Use `Arc<File>` instead of `MmapVecFileBacking`

* Use a named struct instead of mysterious tuples

* Comment about unsafety in `Module::deserialize_file`

* Fix tests

* Fix uffd compile

* Always align data segments

No need to have conditional alignment since their sizes are all aligned
anyway

* Update comment in build.rs

* Use rustix, not `region`

* Fix some confusing logic/names around memory indexes

These functions all work with memory indexes, not specifically defined
memory indexes.
2022-02-10 15:40:40 -06:00
Chris Fallin
39a52ceb4f Implement lazy funcref table and anyfunc initialization. (#3733)
During instance initialization, we build two sorts of arrays eagerly:

- We create an "anyfunc" (a `VMCallerCheckedAnyfunc`) for every function
  in an instance.

- We initialize every element of a funcref table with an initializer to
  a pointer to one of these anyfuncs.

Most instances will not touch (via call_indirect or table.get) all
funcref table elements. And most anyfuncs will never be referenced,
because most functions are never placed in tables or used with
`ref.func`. Thus, both of these initialization tasks are quite wasteful.
Profiling shows that a significant fraction of the remaining
instance-initialization time after our other recent optimizations is
going into these two tasks.

This PR implements two basic ideas:

- The anyfunc array can be lazily initialized as long as we retain the
  information needed to do so. For now, in this PR, we just recreate the
  anyfunc whenever a pointer is taken to it, because doing so is fast
  enough; in the future we could keep some state to know whether the
  anyfunc has been written yet and skip this work if redundant.

  This technique allows us to leave the anyfunc array as uninitialized
  memory, which can be a significant savings. Filling it with
  initialized anyfuncs is very expensive, but even zeroing it is
  expensive: e.g. in a large module, it can be >500KB.

- A funcref table can be lazily initialized as long as we retain a link
  to its corresponding instance and function index for each element. A
  zero in a table element means "uninitialized", and a slowpath does the
  initialization.

Funcref tables are a little tricky because funcrefs can be null. We need
to distinguish "element was initially non-null, but user stored explicit
null later" from "element never touched" (ie the lazy init should not
blow away an explicitly stored null). We solve this by stealing the LSB
from every funcref (anyfunc pointer): when the LSB is set, the funcref
is initialized and we don't hit the lazy-init slowpath. We insert the
bit on storing to the table and mask it off after loading.

We do have to set up a precomputed array of `FuncIndex`s for the table
in order for this to work. We do this as part of the module compilation.

This PR also refactors the way that the runtime crate gains access to
information computed during module compilation.

Performance effect measured with in-tree benches/instantiation.rs, using
SpiderMonkey built for WASI, and with memfd enabled:

```
BEFORE:

sequential/default/spidermonkey.wasm
                        time:   [68.569 us 68.696 us 68.856 us]
sequential/pooling/spidermonkey.wasm
                        time:   [69.406 us 69.435 us 69.465 us]

parallel/default/spidermonkey.wasm: with 1 background thread
                        time:   [69.444 us 69.470 us 69.497 us]
parallel/default/spidermonkey.wasm: with 16 background threads
                        time:   [183.72 us 184.31 us 184.89 us]
parallel/pooling/spidermonkey.wasm: with 1 background thread
                        time:   [69.018 us 69.070 us 69.136 us]
parallel/pooling/spidermonkey.wasm: with 16 background threads
                        time:   [326.81 us 337.32 us 347.01 us]

WITH THIS PR:

sequential/default/spidermonkey.wasm
                        time:   [6.7821 us 6.8096 us 6.8397 us]
                        change: [-90.245% -90.193% -90.142%] (p = 0.00 < 0.05)
                        Performance has improved.
sequential/pooling/spidermonkey.wasm
                        time:   [3.0410 us 3.0558 us 3.0724 us]
                        change: [-95.566% -95.552% -95.537%] (p = 0.00 < 0.05)
                        Performance has improved.

parallel/default/spidermonkey.wasm: with 1 background thread
                        time:   [7.2643 us 7.2689 us 7.2735 us]
                        change: [-89.541% -89.533% -89.525%] (p = 0.00 < 0.05)
                        Performance has improved.
parallel/default/spidermonkey.wasm: with 16 background threads
                        time:   [147.36 us 148.99 us 150.74 us]
                        change: [-18.997% -18.081% -17.285%] (p = 0.00 < 0.05)
                        Performance has improved.
parallel/pooling/spidermonkey.wasm: with 1 background thread
                        time:   [3.1009 us 3.1021 us 3.1033 us]
                        change: [-95.517% -95.511% -95.506%] (p = 0.00 < 0.05)
                        Performance has improved.
parallel/pooling/spidermonkey.wasm: with 16 background threads
                        time:   [49.449 us 50.475 us 51.540 us]
                        change: [-85.423% -84.964% -84.465%] (p = 0.00 < 0.05)
                        Performance has improved.
```

So an improvement of something like 80-95% for a very large module (7420
functions in its one funcref table, 31928 functions total).
2022-02-09 13:56:53 -08:00
Peter Huene
1b27508a42 Fix incorrect use of MemoryIndex in the pooling allocator. (#3782)
This commit corrects a few places where `MemoryIndex` was used and treated like
a `DefinedMemoryIndex` in the pooling instance allocator.

When the unstable `multi-memory` proposal is enabled, it is possible to cause a
newly allocated instance to use an incorrect base address for any defined
memories by having the module being instantiated also import a memory.

This requires enabling the unstable `multi-memory` proposal, configuring the
use of the pooling instance allocator (not the default), and then configuring
the module limits to allow imported memories (also not the default).

The fix is to replace all uses of `MemoryIndex` with `DefinedMemoryIndex` in
the pooling instance allocator.

Several `debug_assert!` have also been updated to `assert!` to sanity check the
state of the pooling allocator even in release builds.
2022-02-09 09:39:29 -06:00
Chris Fallin
4f01711d42 Pooling allocator: Default for allocation policy should use memfd feature, not memfd-allocator. (#3777)
Thanks to @peterheune for noticing this!
2022-02-08 10:29:45 -08:00
wasmtime-publish
39b88e4e9e Release Wasmtime 0.34.0 (#3768)
* Bump Wasmtime to 0.34.0

[automatically-tag-and-release-this-commit]

* Add release notes for 0.34.0

* Update release date to today

Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2022-02-07 19:16:26 -06:00
Chris Fallin
ddd39cdb84 Patch qemu in CI to fix madvise semantics. (#3770)
We currently skip some tests when running our qemu-based tests for
aarch64 and s390x. Qemu has broken madvise(MADV_DONTNEED) semantics --
specifically, it just ignores madvise() [1].

We could continue to whack-a-mole the tests whenever we create new
functionality that relies on madvise() semantics, but ideally we'd just
have emulation that properly emulates!

The earlier discussions on the qemu mailing list [2] had a proposed
patch for this, but (i) this patch doesn't seem to apply cleanly anymore
(it's 3.5 years old) and (ii) it's pretty complex due to the need to
handle qemu's ability to emulate differing page sizes on host and guest.

It turns out that we only really need this for CI when host and guest
have the same page size (4KiB), so we *could* just pass the madvise()s
through. I wouldn't expect such a patch to ever land upstream in qemu,
but it satisfies our needs I think. So this PR modifies our CI setup to
patch qemu before building it locally with a little one-off patch.

[1]
https://github.com/bytecodealliance/wasmtime/pull/2518#issuecomment-747280133

[2]
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg05416.html
2022-02-07 15:56:54 -08:00
Alex Crichton
04d2caea7b Consolidate methods of memory initialization (#3766)
* Consolidate methods of memory initialization

This commit consolidates the few locations that we have which are
performing memory initialization. Namely the uffd logic for creating
paged memory as well as the memfd logic for creating a memory image now
share an implementation to avoid duplicating bounds-checks or other
validation conditions. The main purpose of this commit is to fix a
fuzz-bug where a multiplication overflowed. The overflow itself was
benign but it seemed better to fix the overflow in only one place
instead of multiple.

The overflow in question is specifically when an initializer is checked
to be statically out-of-bounds and multiplies a memory's minimum size by
the wasm page size, returning the result as a `u64`. For
memory64-memories of size `1 << 48` this multiplication will overflow.
This was actually a preexisting bug with the `try_paged_init` function
which was copied for memfd, but cropped up here since memfd is used more
often than paged initialization. The fix here is to skip validation of
the `end` index if the size of memory is `1 << 64` since if the `end`
index can be represented as a `u64` then it's in-bounds. This is
somewhat of an esoteric case, though, since a memory of minimum size `1
<< 64` can't ever exist (we can't even ask the os for that much memory,
and even if we could it would fail).

* Fix memfd test

* Fix some tests

* Remove InitMemory enum

* Add an `is_segmented` helper method

* More clear variable name

* Make arguments to `init_memory` more descriptive
2022-02-04 13:17:25 -06:00
Alex Crichton
4ba3404ca3 Fix MemFd's allocated memory for dynamic memories (#3763)
This fixes a bug in the memfd-related management of a linear memory
where for dynamic memories memfd wasn't informed of the extra room that
the dynamic memory could grow into, only the actual size of linear
memory, which ended up tripping an assert once the memory was grown. The
fix here is pretty simple which is to factor in this extra space when
passing the allocation size to the creation of the `MemFdSlot`.
2022-02-03 11:56:16 -06:00
Alex Crichton
b647561c44 memfd: Some minor follow-ups (#3759)
* Tweak memfd-related features crates

This commit changes the `memfd` feature for the `wasmtime-cli` crate
from an always-on feature to a default-on feature which can be disabled
at compile time. Additionally the `pooling-allocator` feature is also
given similar treatment.

Additionally some documentation was added for the `memfd` feature on the
`wasmtime` crate.

* Don't store `Arc<T>` in `InstanceAllocationRequest`

Instead store `&Arc<T>` to avoid having the clone that lives in
`InstanceAllocationRequest` not actually going anywhere. Otherwise all
instance allocation requires an extra clone to create it for the request
and an extra decrement when the request goes away. Internally clones are
made as necessary when creating instances.

* Enable the pooling allocator by default for `wasmtime-cli`

While perhaps not the most useful option since the CLI doesn't have a
great way to take advantage of this it probably makes sense to at least
match the features of `wasmtime` itself.

* Fix some lints and issues

* More compile fixes
2022-02-03 09:17:04 -06:00
Alex Crichton
8ed79c8f57 memfd: Reduce some syscalls in the on-demand case (#3757)
* memfd: Reduce some syscalls in the on-demand case

This tweaks the internal organization of the `MemFdSlot` to avoid some
syscalls in the default case as well as opportunistically in the pooling
case. The two cases added here are:

* A `MemFdSlot` is now created with a specified initial size. For
  pooling this is 0 but for the on-demand case this can be non-zero.

* When `instantiate` is called with no prior image and the sizes match
  (as will be the case for on-demand allocation) then `mprotect` is
  skipped entirely.

* In the `clear_and_remain-ready` case the `mprotect` is skipped if the
  heap wasn't grown at all.

This should avoid ever using `mprotect` unnecessarily and makes the
ranges we `mprotect` a bit smaller as well.

* Review comments

* Tweak allow to apply to whole crate
2022-02-02 16:09:47 -06:00
Chris Fallin
5deb1f1fbf Merge pull request #3738 from cfallin/pooling-affinity
Pooling allocator: add a reuse-affinity policy.
2022-02-02 13:11:39 -08:00