10723 Commits

Author SHA1 Message Date
Andrew Brown
060f12571d wiggle: adapt Wiggle strings for shared use (#5264)
* wiggle: adapt Wiggle strings for shared use

This is an extension of #5229 for the `&str` and `&mut str` types. As
documented there, we are attempting to maintain Rust guarantees for
slices that Wiggle hands out in the presence of WebAssembly shared
memory, in which case multiple threads could be modifying the underlying
data of the slice.

This change modifies the API of `GuestPtr` to return an `Option` which is
`None` when attempting to view the WebAssembly data as a string and the
underlying WebAssembly memory is shared. This reuses the
`UnsafeGuestSlice` structure from #5229 to do so and appropriately marks
the region as borrowed in Wiggle's manual borrow checker. Each original
call site in this project's WASI implementations is fixed up to `expect`
that a non-shared memory is used.  (Note that I can find no uses of
`GuestStrMut` in the WASI implementations).

* wiggle: make `GuestStr*` containers wrappers of `GuestSlice*`

This change makes it possible to reuse the underlying logic in
`UnsafeGuestSlice` and the `GuestSlice*` implementations to continue to
expose the `GuestStr` and `GuestStrMut` types. These types now are
simple wrappers of their `GuestSlice*` variant. The UTF-8 validation
that distinguished `GuestStr*` now lives in the `TryFrom`
implementations for each type.
2022-11-14 22:33:24 +00:00
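The wrapper arrangement described in 060f12571d can be illustrated with a minimal sketch. The `ByteSlice` and `StrView` types below are hypothetical stand-ins, not Wiggle's actual `GuestSlice`/`GuestStr` definitions; the only idea carried over from the commit message is that the string type wraps the slice type and that UTF-8 validation lives in a `TryFrom` implementation.

```rust
use std::convert::TryFrom;
use std::str::Utf8Error;

/// Hypothetical stand-in for a borrowed guest byte slice.
pub struct ByteSlice<'a>(pub &'a [u8]);

/// Hypothetical string view: a thin wrapper around `ByteSlice` whose only
/// additional invariant is that the bytes are valid UTF-8.
pub struct StrView<'a>(ByteSlice<'a>);

impl<'a> TryFrom<ByteSlice<'a>> for StrView<'a> {
    type Error = Utf8Error;

    fn try_from(slice: ByteSlice<'a>) -> Result<Self, Self::Error> {
        // Validate once, at conversion time.
        std::str::from_utf8(slice.0)?;
        Ok(StrView(slice))
    }
}

impl<'a> StrView<'a> {
    pub fn as_str(&self) -> &str {
        // Already validated in `try_from`; re-checking keeps this sketch
        // free of `unsafe`.
        std::str::from_utf8((self.0).0).expect("validated in try_from")
    }
}
```

Converting `ByteSlice(b"hello")` succeeds, while a byte sequence such as `[0xff, 0xfe]` is rejected at the `try_from` call rather than at first use.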
Andrew Brown
7a6fbe0898 wiggle: fix typo (#5265) 2022-11-14 20:15:09 +00:00
Alex Crichton
9c73a448f5 c-api: Fix wasmtime_func_call_unchecked to communicate all errors (#5262)
Change the return value of this function to a `wasmtime_error_t*`
instead of the prior `wasm_trap_t*`. This is a leftover from #5149.

Closes #5257
2022-11-14 12:30:17 -06:00
Afonso Bordado
ff46bbaebf cranelift: Fix iadd_carry/iadd_cout in the interpreter (#5176) 2022-11-14 10:18:28 -08:00
Denys Zadorozhnyi
d3692c2f2b fix typo in caller_conv arg name in ABIMachineSpec::gen_call; (#5259) 2022-11-14 09:02:07 -08:00
Jamey Sharp
70c72ee2a4 cranelift-isle: New IR and revised overlap checks (#5195)
* cranelift-isle: New IR and revised overlap checks

* Improve error reporting

* Avoid "unused argument" warnings a nicer way

* Remove unused fields

* Minimize diff and "fix" error handling

I had tried to use Miette "right" and made things worse somehow. Among
other changes, revert all my changes to unrelated parts of `error.rs`
and `error_miette.rs`.

* Review comments: Rename "unmatchable" to "unreachable"

* Review comments: newtype wrappers, not type aliases

* Review comments: more comments on overlap checks

* Review comments: Clarify `normalize_equivalence_classes`

* Review comments: use union-find instead of linked list

This saves about 50 lines of code in the trie_again module. The
union-find implementation is about twice as long as that, counting
comments and doc-tests, but that's a worth-while tradeoff.

However, this makes `normalize_equivalence_classes` slower, because now
finding all elements of an equivalence class takes time linear in the
total size of all equivalence classes. If that ever turns out to be a
problem in practice we can find some way to optimize `remove_set_of`.

* Review comments: Hide constraints HashMap

We want to enforce that consumers of this representation can't observe
non-deterministic ordering in any of its public types.

* Review comments: Normalize equivalence classes incrementally

I'm not sure whether this is a good idea. It doesn't make the logic
particularly simpler, and I think it will do more work if three or more
binding sites with enum-variant constraints get set equal to each other.

* More comments and other clarifications

* Revert "Review comments: Normalize equivalence classes incrementally"

* Even more comments
2022-11-14 02:29:22 +00:00
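The union-find tradeoff mentioned in 70c72ee2a4 can be sketched generically. This is not the `trie_again` implementation; `UnionFind` and `members_of` are illustrative names. It shows why merging classes is cheap while listing one class requires a scan over every element.

```rust
/// Minimal union-find (disjoint-set) over the indices `0..n`.
pub struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    pub fn new(n: usize) -> Self {
        UnionFind { parent: (0..n).collect() }
    }

    /// Find the representative of `x`, halving the path along the way.
    pub fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent;
            x = grandparent;
        }
        x
    }

    /// Merge the classes containing `a` and `b`.
    pub fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb {
            self.parent[rb] = ra;
        }
    }

    /// Collect every member of `x`'s class. This scans all elements, so the
    /// cost is linear in the total number of elements, not the class size.
    pub fn members_of(&mut self, x: usize) -> Vec<usize> {
        let root = self.find(x);
        (0..self.parent.len())
            .filter(|&i| self.find(i) == root)
            .collect()
    }
}
```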
Jamey Sharp
95ca72a37a cranelift-isle: Misc sema cleanups (#5242)
This mostly amounts to factoring out duplicated code and turning various
uses of `unwrap_or_continue!` into iterator chains.
2022-11-11 01:53:05 +00:00
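The kind of cleanup described in 95ca72a37a can be pictured with a small, unrelated example. The functions below are hypothetical and do not come from ISLE's sema code; they only contrast a loop that skips failures with `continue` (the shape an `unwrap_or_continue!`-style macro expands to) against the equivalent iterator chain.

```rust
/// Loop form: skip inputs that fail to parse, then double the rest.
pub fn doubled_loop(inputs: &[&str]) -> Vec<i32> {
    let mut out = Vec::new();
    for s in inputs {
        let n = match s.parse::<i32>() {
            Ok(n) => n,
            Err(_) => continue,
        };
        out.push(n * 2);
    }
    out
}

/// Iterator-chain form of the same logic.
pub fn doubled_chain(inputs: &[&str]) -> Vec<i32> {
    inputs
        .iter()
        .filter_map(|s| s.parse::<i32>().ok())
        .map(|n| n * 2)
        .collect()
}
```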
Trevor Elliott
0367fbc2d4 cranelift: Rework pinned register lowering (#5249)
Rework pinned register lowering to avoid the use of pinned virtual registers, instead using the MovFromPReg and MovToPReg pseudo instructions.
2022-11-10 16:19:25 -08:00
Andrew Brown
7717d8fa55 wiggle: adapt Wiggle guest slices for unsafe shared use (#5229)
* wiggle: adapt Wiggle guest slices for `unsafe` shared use

When multiple threads can concurrently modify a WebAssembly shared
memory, the underlying data for a Wiggle `GuestSlice` and
`GuestSliceMut` could change due to access from other threads. This
breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out.
This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut`
return an `Option` which is `None` when the underlying WebAssembly
memory is shared.

But WASI implementations still need access to the underlying WebAssembly
memory, both to read from it and write to it. This change adds new APIs:
- `GuestPtr::to_vec` copies the bytes from WebAssembly memory (from
  which we can safely take a `&[T]`)
- `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which
  we can `unsafe`-ly return a mutable slice (users must accept the
  unsafety of concurrently modifying a `&mut [T]`)

This approach allows us to maintain Wiggle's borrow-checking
infrastructure, which enforces the guarantee that Wiggle will not modify
overlapping regions. This is important because the underlying
system calls may expect this. Though other threads may modify the same
underlying region, this is impossible to prevent; at least Wiggle will
not be able to do so.

Finally, the changes to Wiggle's API are propagated to all WASI
implementations in Wasmtime. For now, code locations that attempt to get
a guest slice will panic if the underlying memory is shared. Note that
Wiggle is not enabled for shared memory (that will come later in
something like #5054), but when it is, these panics will be clear
indicators of locations that must be re-implemented in a thread-safe
way.

* review: remove double cast

* review: refactor to include more logic in 'UnsafeGuestSlice'

* review: add reference to #4203

* review: link all thread-safe WASI fixups to #5235

* fix: consume 'UnsafeGuestSlice' during conversion to safe versions

* review: remove 'as_slice' and 'as_slice_mut'

* review: use 'as_unsafe_slice_mut' in 'to_vec'

* review: add `UnsafeBorrowResult`
2022-11-10 21:54:52 +00:00
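The API shape introduced by 7717d8fa55 (a borrowing accessor that returns `None` for shared memory, plus a copying accessor) can be sketched as follows. `GuestMemory` here is a hypothetical type backed by a plain `Vec<u8>`, not Wiggle's `GuestPtr`; the `shared` flag is just a boolean, and a real shared memory would need more care when copying than this sketch takes.

```rust
/// Hypothetical guest memory. `shared` marks memory that other threads
/// could be mutating concurrently.
pub struct GuestMemory {
    bytes: Vec<u8>,
    shared: bool,
}

impl GuestMemory {
    pub fn new(bytes: Vec<u8>, shared: bool) -> Self {
        GuestMemory { bytes, shared }
    }

    /// Borrow a region as `&[u8]`. Returns `None` when the memory is shared
    /// (a `&[u8]` would promise immutability we cannot uphold) or when the
    /// region is out of bounds.
    pub fn as_slice(&self, offset: usize, len: usize) -> Option<&[u8]> {
        if self.shared {
            return None;
        }
        let end = offset.checked_add(len)?;
        self.bytes.get(offset..end)
    }

    /// Copy a region out. The caller owns the copy, so it can be handed out
    /// whether or not the memory is shared.
    pub fn to_vec(&self, offset: usize, len: usize) -> Option<Vec<u8>> {
        let end = offset.checked_add(len)?;
        self.bytes.get(offset..end).map(|s| s.to_vec())
    }
}
```

Call sites that previously took a slice directly would either switch to the copying accessor or, as the commit message describes, panic on `None` until they are reworked.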
Alex Crichton
0548952319 Update wasm-tools crates (#5248)
No major updates, just keeping up-to-date.
2022-11-10 21:23:20 +00:00
Alex Crichton
7ec626b898 Use deterministic randomness fuzzing the pooling allocator (#5247)
This commit updates the index allocation performed in the pooling
allocator with a few refactorings:

* With `cfg(fuzzing)` a deterministic rng is now used to improve
  reproducibility of fuzz test cases.
* The `Mutex` was pushed inside of `IndexAllocator`, renamed from
  `PoolingAllocationState`.
* Randomness is now always done through a `SmallRng` stored in the
  `IndexAllocator` instead of using `thread_rng`.
* The `is_empty` method has been removed in favor of an `Option`-based
  return on `alloc`.

This refactoring is additionally intended to encapsulate more
implementation details of `IndexAllocator` to more easily allow for
alternate implementations in the future such as lock-free approaches
(possibly).
2022-11-10 20:53:04 +00:00
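The refactorings listed in 7ec626b898 can be pictured with a std-only sketch. This is not Wasmtime's `IndexAllocator`: the xorshift generator below is a stand-in for `SmallRng`, and the field names and seed parameter are assumptions. It shows the three ideas from the list: the `Mutex` lives inside the allocator, the random state is stored with it, and `alloc` returns an `Option` instead of requiring an `is_empty` check.

```rust
use std::sync::Mutex;

struct Inner {
    free: Vec<usize>,
    rng_state: u64,
}

/// Hypothetical index allocator with the lock kept inside the type.
pub struct IndexAllocator(Mutex<Inner>);

impl IndexAllocator {
    /// A fixed `seed` gives a reproducible allocation order.
    pub fn new(count: usize, seed: u64) -> Self {
        IndexAllocator(Mutex::new(Inner {
            free: (0..count).collect(),
            rng_state: seed | 1, // xorshift state must be non-zero
        }))
    }

    /// Pick a free index pseudo-randomly; `None` when nothing is free.
    pub fn alloc(&self) -> Option<usize> {
        let mut inner = self.0.lock().unwrap();
        if inner.free.is_empty() {
            return None;
        }
        // xorshift64 step (stand-in for a real small RNG).
        let mut x = inner.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        inner.rng_state = x;
        let i = (x % inner.free.len() as u64) as usize;
        Some(inner.free.swap_remove(i))
    }

    /// Return an index to the free list.
    pub fn free(&self, index: usize) {
        self.0.lock().unwrap().free.push(index);
    }
}
```

Two allocators built with the same seed hand out indices in the same order, which is the property that makes fuzz failures reproducible.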
Peter Huene
42e88c7b24 Fix OutOfFuel trap code not represented in the C API. (#5230)
This commit adds the missing "out of fuel" trap code to the C API.

Without this, calls to `wasmtime_trap_code` will trigger an unreachable panic
on traps from running out of fuel.
2022-11-10 20:42:26 +00:00
Alex Crichton
3b9668558f winch: Prepare for an update to the wasm-tools crates (#5238)
This commit prepares the `winch` crate for updating `wasm-tools`,
notably changing a bit about how the visitation of operators works. This
moves the function body and wasm validator out of the `CodeGen`
structure and into parameters threaded into the emission of the actual
function.

Additionally the `VisitOperator` implementation was updated to remove
the explicit calls to the validator, favoring instead a macro-generated
solution to guarantee that all validation happens before any translation
proceeds. This means that the `VisitOperator for CodeGen` impl is now
infallible and the various methods have been inlined into the trait
methods as well as removing the `Result<_>`.

Finally this commit updates translation to call `validator.finish(..)`
which is required to perform the final validation steps of the function
body.
2022-11-10 14:01:42 -06:00
Alex Crichton
1f09954fa4 Avoid unconditional getrandom syscall creating a WasiCtx (#5244)
This commit updates the default random context inserted into a
`WasiCtx` to be seeded from `thread_rng` rather than the system's
entropy. This avoids an unconditional syscall on the creation of all
`WasiCtx` structures and shouldn't reduce the quality of the random numbers
produced.
2022-11-10 13:58:11 -06:00
Alex Crichton
92f6fe36cc Fix CI after CVE fixes (#5245)
* Fix CI after CVE fixes

Alas we can't run CI ahead of time so this fixes various minor build
issues from the merging of the recent CVE fixes. Note that I plan to
publish the advisories once CI issues are sorted out.

* Fix mmap/free of zero bytes
2022-11-10 13:35:15 -06:00
Jamey Sharp
f0fccbd18a cranelift-isle: Helpers to get type/term by name (#5241)
This is a common pattern in sema, so factor it out.

Since this version uses `intern` instead of `intern_mut`, it might be a
tiny bit faster when errors occur due to not writing names into maps
then. When no error occurs, ISLE should do exactly the same work with or
without this commit.
2022-11-10 09:51:49 -08:00
Alex Crichton
2be457c295 Change the return type of SharedMemory::data (#5240)
This commit is an attempt at improving the safety of using the return
value of the `SharedMemory::data` method. Previously this returned
`*mut [u8]` which, while correct, is unwieldy and unsafe to work with.
The new return value of `&[UnsafeCell<u8>]` has a few advantages:

* The lifetime of the returned data is now connected to the
  `SharedMemory` itself, removing the possibility for a class of errors
  of accidentally using the prior `*mut [u8]` beyond its original lifetime.

* It's now possible to safely access `.len()` as opposed to requiring an
  `unsafe` dereference before.

* The data internally within the slice is now what retains the `unsafe`
  bits, namely indicating that accessing any memory inside of the
  contents returned is `unsafe` but addressing it is safe.

I was inspired by the `wiggle`-based discussion on #5229 and felt it
appropriate to apply a similar change here.
2022-11-10 09:51:10 -08:00
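The advantages listed in 2be457c295 follow from how `&[UnsafeCell<u8>]` behaves in Rust generally. The helper functions below are illustrative and are not part of the `SharedMemory` API; they show that the length is available safely while reading or writing a byte still requires `unsafe`.

```rust
use std::cell::UnsafeCell;

/// Safe: taking the length never touches the contents.
pub fn region_len(data: &[UnsafeCell<u8>]) -> usize {
    data.len()
}

/// Read one byte.
///
/// # Safety
/// The caller must ensure no other thread writes this byte concurrently.
pub unsafe fn read_byte(data: &[UnsafeCell<u8>], i: usize) -> u8 {
    // Indexing (addressing) is safe and bounds-checked; the dereference of
    // the raw pointer from `get()` is the unsafe part.
    unsafe { *data[i].get() }
}

/// Write one byte.
///
/// # Safety
/// The caller must ensure no other thread accesses this byte concurrently.
pub unsafe fn write_byte(data: &[UnsafeCell<u8>], i: usize, value: u8) {
    unsafe { *data[i].get() = value }
}
```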
Alex Crichton
5b6d5e78de Merge pull request from GHSA-h84q-m8rr-3v9q
The Rust definition was previously performing a 4-byte write when the C
API was declared as taking a 1-byte buffer.
2022-11-10 11:35:14 -06:00
Alex Crichton
000bd98ae5 Merge pull request from GHSA-44mr-8vmm-wjhg
Ensure that support is not regressed in keeping this working.
2022-11-10 11:34:59 -06:00
Alex Crichton
3535acbf3b Merge pull request from GHSA-wh6w-3828-g9qf
* Unconditionally use `MemoryImageSlot`

This commit removes the internal branching within the pooling instance
allocator to sometimes use a `MemoryImageSlot` and sometimes not.
Instead this is now unconditionally used in all situations on all
platforms. This fixes an issue where the state of a slot could get
corrupted if modules being instantiated switched from having images to
not having an image or vice versa.

The bulk of this commit is the removal of the `memory-init-cow`
compile-time feature in addition to adding Windows support to the
`cow.rs` file.

* Fix compile on Unix

* Add a stricter assertion for static memory bounds

Double-check that when a memory is allocated the configuration required
is satisfied by the pooling allocator.
2022-11-10 11:34:38 -06:00
Nick Fitzgerald
47fa1ad6a8 Rework bounds checking for atomic operations (#5239)
Before, we would do a `heap_addr` to translate the given Wasm memory address
into a native memory address and pass it into the libcall that implemented the
atomic operation, which would then treat the address as a Wasm memory address
and pass it to `validate_atomic_addr` to be bounds checked a second time. This
is a bit nonsensical, as we are validating a native memory address as if it were
a Wasm memory address.

Now, we no longer do a `heap_addr` to translate the Wasm memory address to a
native memory address. Instead, we pass the Wasm memory address to the libcall,
and the libcall is responsible for doing the bounds check (by calling
`validate_atomic_addr` with the correct type of memory address now).
2022-11-09 16:19:43 -08:00
Jamey Sharp
86679489ef cranelift-isle: if-let patterns aren't root terms (#5233)
The `is_root` flag to `translate_pattern` just determines whether the
`rule_term` argument is used, which begs a larger cleanup. But that
cleanup is less clear if `is_root` is set anywhere aside from the call
in `collect_rules`. So I wanted to get confirmation that this particular
use of that flag is incorrect first.

These two arguments (`is_root` and `rule_term`) are used to prevent
expansion of a term as an internal extractor ("macro") if:
- that term is also an internal constructor
- and it's the root term on the left-hand side of the current rule
- and the pattern we're currently translating has no parents.

I'm not sure what it should mean to use the term you're currently
defining as the root pattern on the left-hand side of an if-let in the
same rule, but I don't think it should have this particular special
treatment.
2022-11-09 15:32:33 -08:00
Jamey Sharp
54998715ea cranelift-isle: Save variable names for later use (#5221)
It's nice to be able to report these names after sema analysis completes
so rule authors can recognize which names they used.

This isn't used anywhere yet, but I'm planning to use it during codegen,
and the rule-verification folks wanted something like this for debugging
output.
2022-11-09 15:21:15 -08:00
Jamey Sharp
d38631a724 cranelift-isle: Don't panic on too-large rule priorities (#5236)
Found with ISLE's fuzzer.
2022-11-09 20:36:02 +00:00
Nick Fitzgerald
fc62d4ad65 Cranelift: Make heap_addr return calculated base + index + offset (#5231)
* Cranelift: Make `heap_addr` return calculated `base + index + offset`

Rather than return just the `base + index`.

(Note: I've chosen to use the nomenclature "index" for the dynamic operand and
"offset" for the static immediate.)

This moves the addition of the `offset` into `heap_addr`, instead of leaving it
for the subsequent memory operation, so that we can Spectre-guard the full
address, and not allow speculative execution to read the first 4GiB of memory.

Before this commit, we were effectively doing

    load(spectre_guard(base + index) + offset)

Now we are effectively doing

    load(spectre_guard(base + index + offset))

Finally, this also corrects `heap_addr`'s documented semantics to say that it
returns an address that will trap on access if `index + offset + access_size` is
out of bounds for the given heap, rather than saying that the `heap_addr` itself
will trap. This matches the implemented behavior for static memories, and after
https://github.com/bytecodealliance/wasmtime/pull/5190 lands (which is blocked
on this commit) will also match the implemented behavior for dynamic memories.

* Update heap_addr docs

* Factor out `offset + size` to a helper
2022-11-09 19:53:51 +00:00
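The before/after expressions in fc62d4ad65 can be modeled with plain integer arithmetic. This sketch is not Cranelift IR: `heap_addr` below is a hypothetical function, the guard is written as an ordinary `if` rather than a branchless select, and using address 0 as the "will trap on access" result is an assumption made for illustration.

```rust
/// Model of `heap_addr` after the change: the bounds check covers
/// `index + offset + access_size`, and the guarded result already includes
/// the static `offset`.
pub fn heap_addr(base: u64, bound: u64, index: u64, offset: u64, access_size: u64) -> u64 {
    let in_bounds = index
        .checked_add(offset)
        .and_then(|end| end.checked_add(access_size))
        .map_or(false, |end| end <= bound);
    if in_bounds {
        // spectre_guard(base + index + offset): the whole address is guarded.
        base.wrapping_add(index).wrapping_add(offset)
    } else {
        // Assumed stand-in for "an address that traps when accessed".
        0
    }
}
```

With `bound = 0x100`, an access at `index = 0xf8, offset = 4, access_size = 4` ends exactly at the bound and is accepted, while `index = 0xfc` with the same offset and size is rejected because the static offset is now part of the checked address.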
Jamey Sharp
33a192556e cranelift-isle: Do fewer term lookups (#5232)
While checking the call graph of extractors during semantic validation,
save `TermId` instead of `Sym`. The types are both just integer indexes,
but the `TermId` is more useful here. Saving it avoids needing to check
for failed map lookups twice, which simplifies the implementation.
2022-11-09 11:24:38 -08:00
Tshepang Mbambo
065ce74591 cli docs: some consistency improvements, and a fix (#5234) 2022-11-09 09:13:03 -06:00
Trevor Elliott
b077854b57 Generate SSA code from returns (#5172)
Modify return pseudo-instructions to have pairs of registers: virtual and real. This allows us to constrain the virtual registers to the real ones specified by the abi, instead of directly emitting moves to those real registers.
2022-11-08 16:00:49 -08:00
Chris Fallin
d59caf39b6 Wasmtime+Cranelift: strip out some dead x86-32 code. (#5226)
* Wasmtime+Cranelift: strip out some dead x86-32 code.

I was recently pointed to fastly/Viceroy#200 where it seems some folks
are trying to compile Wasmtime (via Viceroy) for Windows x86-32 and the
failures may not be loud enough. I've tried to reproduce this
cross-compiling to i686-pc-windows-gnu from Linux and hit build failures
(as expected) in several places.  Nevertheless, while trying to discern
what others may be attempting, I noticed some dead x86-32-specific code
in our repo, and figured it would be a good idea to clean this up.
Otherwise, it (i) sends some mixed messages -- "hey look, this codebase
does support x86-32" -- and (ii) keeps untested code around, which is
generally not great.

This PR removes x86-32-specific cases in traphandlers and unwind code,
and Cranelift's native feature detection. It adds helpful compile-error
messages in a few cases. If we ever support x86-32 (contributors
welcome! The big missing piece is Cranelift support; see #1980), these
compile errors and git history should be enough to recover any knowledge
we are now encoding in the source.

I left the x86-32 support in `wasmtime-fiber` alone because that seems
like a bit of a special case -- foundation library, separate from the
rest of Wasmtime, with specific care to provide a (presumably working)
full 32-bit version.

* Remove some extraneous compile_error!s, already covered by others.
2022-11-08 23:03:17 +00:00
Nick Fitzgerald
fd7b903f33 Cranelift: Use a custom enum instead of boolean for the ISLE target (#5228)
Easier to read and doesn't require `/* is_lower = */`-style comments at call
sites.
2022-11-08 21:44:02 +00:00
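The readability point in fd7b903f33 can be shown with a small sketch. The enum and its variant names below are hypothetical (the commit message only implies that one of the cases is "lower"), as are the file names returned.

```rust
/// Hypothetical two-way target selector replacing an `is_lower: bool`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IsleTarget {
    Lower,
    Opt,
}

/// With an enum, call sites read `generated_file(IsleTarget::Lower)` rather
/// than `generated_file(/* is_lower = */ true)`.
pub fn generated_file(target: IsleTarget) -> &'static str {
    match target {
        IsleTarget::Lower => "isle_lower.rs",
        IsleTarget::Opt => "isle_opt.rs",
    }
}
```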
Andrew Brown
f026d95a1a wiggle: add initial support for shared memory (#5225)
This change is the first in a series of changes to support shared memory
in Wiggle. Since Wiggle was written under the assumption of
single-threaded guest-side access, this change introduces a `shared`
field to guest memories in order to flag when this assumption will not
be the case. This change always sets `shared` to `false`; once a few
more pieces are in place, `shared` will be set dynamically when a shared
memory is detected, e.g., in a change like #5054.

Using the `shared` field, we can now decide to load Wiggle values
differently under the new assumptions. This change makes the guest
`T::read` and `T::write` calls into `Relaxed` atomic loads and stores in
order to maintain WebAssembly's expected memory consistency guarantees.
We choose Rust's `Relaxed` here to match the `Unordered` memory
consistency described in the [memory model] section of the ECMA spec.
These relaxed accesses are done unconditionally, since we theorize that
branching on `shared` to avoid a relaxed load would gain little
performance.

[memory model]: https://tc39.es/ecma262/multipage/memory-model.html#sec-memory-model

Since 128-bit scalar types do not have `Atomic*` equivalents, we remove
their `T::read` and `T::write` implementations here. They are unused by
any WASI implementations in the project.
2022-11-08 13:25:24 -08:00
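The `Relaxed` accesses described in f026d95a1a correspond to ordinary standard-library atomics. The helpers below are a minimal sketch with assumed names, not Wiggle's `T::read`/`T::write` implementations; they operate on an `AtomicU32` directly rather than on guest memory.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Relaxed load: atomic (no torn reads) but imposes no ordering on
/// surrounding memory operations.
pub fn read_u32(cell: &AtomicU32) -> u32 {
    cell.load(Ordering::Relaxed)
}

/// Relaxed store, the counterpart of `read_u32`.
pub fn write_u32(cell: &AtomicU32, value: u32) {
    cell.store(value, Ordering::Relaxed)
}
```

Stable Rust has no 128-bit atomic type in `std::sync::atomic`, which is consistent with the commit dropping the 128-bit scalar implementations.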
Alex Crichton
50cffad0d3 Implement support for dynamic memories in the pooling allocator (#5208)
* Implement support for dynamic memories in the pooling allocator

This is a continuation of the thrust in #5207 for reducing page faults
and lock contention when using the pooling allocator. To that end this
commit implements support for efficient memory management in the pooling
allocator when using wasm that is instrumented with bounds checks.

The `MemoryImageSlot` type now avoids unconditionally shrinking memory
back to its initial size during the `clear_and_remain_ready` operation,
instead deferring optional resizing of memory to the subsequent call to
`instantiate` when the slot is reused. The instantiation portion then
takes the "memory style" as an argument which dictates whether the
accessible memory must be precisely fit or whether it's allowed to
exceed the maximum. This in effect enables skipping a call to `mprotect`
to shrink the heap when dynamic memory checks are enabled.

In terms of page faults and contention this should improve the situation
by:

* Fewer calls to `mprotect` since once a heap grows it stays grown and
  it never shrinks. This means that a write lock is taken within the
  kernel much more rarely than before (only asymptotically now, not
  N-times-per-instance).

* Accessed memory after a heap growth operation will not fault if it was
  previously paged in by a prior instance and set to zero with `memset`.
  Unlike #5207 which requires a 6.0 kernel to see this optimization this
  commit enables the optimization for any kernel.

The major cost of choosing this strategy is naturally the performance
hit of the wasm itself. This is being looked at in PRs such as #5190 to
improve Wasmtime's story here.

This commit does not implement any new configuration options for
Wasmtime but instead reinterprets existing configuration options. The
pooling allocator no longer unconditionally sets
`static_memory_bound_is_maximum` and then implements support necessary
for this memory type. The other change in this commit is that the
`Tunables::static_memory_bound` configuration option no longer gates
the creation of a `MemoryPool`, which will now appropriately size to
`instance_limits.memory_pages` if the `static_memory_bound` is too small.
This is done to accommodate fuzzing more easily where the
`static_memory_bound` will become small during fuzzing and otherwise the
configuration would be rejected and require manual handling. The spirit
of the `MemoryPool` is one of large virtual address space reservations
anyway so it seemed reasonable to interpret the configuration this way.

* Skip zero memory_size cases

These are causing errors to happen when fuzzing and otherwise in theory
shouldn't be too interesting to optimize for anyway since they likely
aren't used in practice.
2022-11-08 14:43:08 -06:00
Trevor Elliott
70bca801ab cranelift: Resize with types::INVALID instead of types::I8 (#5227) 2022-11-08 20:42:20 +00:00
Trevor Elliott
d94173ea09 Add a VRegAllocator to separate VReg allocation from VCode (#5222)
Remove the dependency on VCode for VReg allocation. This will simplify the changes in #5172, as that PR introduces the need to allocate temporary registers from the ABI context.

This change also allows us to remove some fields from VCode: reftyped_vregs_set and have_ref_values.
2022-11-08 10:05:02 -08:00
Alex Crichton
7b5fd84082 c-api: Avoid losing error context with instance traps (#5223)
This fixes a mistake introduced in #5149.
2022-11-08 11:43:20 -06:00
Ulrich Weigand
3e5938e65a Support big- and little-endian lane order with bitcast (#5196)
Add a MemFlags operand to the bitcast instruction, where only the
`big` and `little` flags are accepted.  These define the lane order
to be used when casting between types of different lane counts.

Update all users to pass an appropriate MemFlags argument.

Implement lane swaps where necessary in the s390x back-end.

This is the final part necessary to fix
https://github.com/bytecodealliance/wasmtime/issues/4566.
2022-11-07 14:41:10 -08:00
Alex Crichton
5cef53537b Add release notes for 3.0.0 (#5213) 2022-11-07 13:30:07 -06:00
Alex Crichton
b07b0676a3 Update how exits are modeled in the C API (#5215)
Previously extracting an exit code was only possible on a `wasm_trap_t`
which will never successfully have an exit code on it, so the exit code
extractor is moved over to `wasmtime_error_t`. Additionally extracting a
wasm trace from a `wasmtime_error_t` is added since traces happen on
both traps and errors now.
2022-11-07 11:35:49 -06:00
Alex Crichton
980e948239 Slim down temporary trampoline objects (#5212)
I noticed this in the backtrace of something that timed out on oss-fuzz
and there's no need to include this information in trampolines, so this
removes the extra sections from being generated.
2022-11-07 11:28:17 -06:00
Afonso Bordado
9814e8bfeb fuzzgen: Add a few more ops (#5201)
Adds `bitselect`, `select`, and `select_spectre_guard`.
2022-11-07 09:08:26 -08:00
Alphyr
508dd81928 Impl Debug for SharedMemory and Extern (#5211) 2022-11-07 09:05:59 -06:00
wasmtime-publish
08ef518c95 Bump Wasmtime to 4.0.0 (#5209)
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
2022-11-06 13:32:34 -06:00
Joe Shaw
1ddf03aaa1 offer function-level control over tracing (#5194)
* wiggle: fix compilation with async functions when tracing is off

Fixes #5202

* switch tracing config from a boolean to a struct

This will enable more complex tracing rules in the future

* rename AsyncConfField to FunctionField

It is going to be reused for cases other than just async functions

* add support for disabling tracing per-function

This adds a `disable_for` syntax after the `tracing` boolean.  For
example:

```
wiggle::from_witx!(
    tracing: true disable_for {
        module1::foo,
        module2::{bar, baz},
    }
)
```
2022-11-05 11:31:09 -07:00
Ulrich Weigand
fba2287c54 Fix mprotect failures by enabling cranelift-jit selinux-fix (#5204)
The sample program in cranelift/filetests/src/function_runner.rs
would abort with an mprotect failure under certain circumstances,
see https://github.com/bytecodealliance/wasmtime/pull/4453#issuecomment-1303803222

Root cause was that enabling PROT_EXEC on the main process heap
may be prohibited, depending on Linux distro and version.

This only shows up in the doc test sample program because the main
clif-util is multi-threaded and therefore allocations will happen
on glibc's per-thread heap, which is allocated via mmap, and not
the main process heap.

Work around the problem by enabling the "selinux-fix" feature of
the cranelift-jit crate dependency in the filetests.  Note that
this didn't compile out of the box, so a separate fix is also
required and provided as part of this PR.

Going forward, it would be preferable to always use mmap to allocate
the backing memory for JITted code.
2022-11-04 14:01:37 -07:00
Alex Crichton
d3a6181939 Add support for keeping pooling allocator pages resident (#5207)
When new wasm instances are created repeatedly in high-concurrency
environments one of the largest bottlenecks is the contention on
kernel-level locks having to do with the virtual memory. It's expected
that usage in this environment is leveraging the pooling instance
allocator with the `memory-init-cow` feature enabled which means that
the kernel level VM lock is acquired in operations such as:

1. Growing a heap with `mprotect` (write lock)
2. Faulting in memory during usage (read lock)
3. Resetting a heap's contents with `madvise` (read lock)
4. Shrinking a heap with `mprotect` when reusing a slot (write lock)

Rapid usage of these operations can lead to detrimental performance
especially on otherwise heavily loaded systems, worsening the more
frequent the above operations are. This commit is aimed at addressing
the (2) case above, reducing the number of page faults that are
fulfilled by the kernel.

Currently these page faults happen for three reasons:

* When memory is first accessed after the heap is grown.
* When the initial linear memory image is accessed for the first time.
* When the initial zero'd heap contents, not part of the linear memory
  image, are accessed.

This PR is attempting to address the last of these cases, and to a
lesser extent the first case as well. Specifically this PR provides the
ability to partially reset a pooled linear memory with `memset` rather
than `madvise`. This is done to have the same effect of resetting
contents to zero but namely has a different effect on paging, notably
keeping the pages resident in memory rather than returning them to the
kernel. This means that reuse of a linear memory slot on a page that was
previously `memset` will not trigger a page fault since everything
remains paged into the process.

The end result is that any access to linear memory which has been
touched by `memset` will no longer page fault on reuse. On more recent
kernels (6.0+) this also means pages which were zero'd by `memset`, made
inaccessible with `PROT_NONE`, and then made accessible again with
`PROT_READ | PROT_WRITE` will not page fault. This can be common when a
wasm instance grows its heap slightly, uses that memory, but then it's
shrunk when the memory is reused for the next instance. Note that this
kernel optimization requires a 6.0+ kernel.

This same optimization is furthermore applied to both async stacks with
the pooling memory allocator in addition to table elements. The defaults
of Wasmtime are not changing with this PR, instead knobs are being
exposed for embedders to turn if they so desire. This is currently being
experimented with at Fastly and I may come back and alter the defaults
of Wasmtime if it seems suitable after our measurements.
2022-11-04 20:56:34 +00:00
Alex Crichton
b14551d7ca Refactor configuration for the pooling allocator (#5205)
This commit changes the APIs in the `wasmtime` crate for configuring the
pooling allocator. I plan on adding a few more configuration options in
the near future and the current structure was feeling unwieldy for
adding these new abstractions.

The previous `struct`-based API has been replaced with a builder-style
API in a similar shape to `Config`. This is done to help make it
easier to add more configuration options in the future through adding
more methods as opposed to adding more fields, which could break prior
initializations.
2022-11-04 20:06:45 +00:00
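The motivation given in b14551d7ca (new options as methods rather than fields) can be sketched with a hypothetical builder. `PoolConfig`, its option names, and its default values are all assumptions made for illustration and are not the `wasmtime` crate's actual API; the `&mut self -> &mut Self` setter style is used here as one common builder shape.

```rust
/// Hypothetical builder-style configuration. Fields are private, so adding
/// a new option later means adding a method and cannot break code that
/// constructed the type with a struct literal.
#[derive(Debug, Clone, PartialEq)]
pub struct PoolConfig {
    instance_count: u32,
    memory_pages: u64,
}

impl PoolConfig {
    /// Defaults are arbitrary values chosen for this sketch.
    pub fn new() -> Self {
        PoolConfig { instance_count: 1000, memory_pages: 160 }
    }

    pub fn instance_count(&mut self, count: u32) -> &mut Self {
        self.instance_count = count;
        self
    }

    pub fn memory_pages(&mut self, pages: u64) -> &mut Self {
        self.memory_pages = pages;
        self
    }

    /// Read back the configured limits as `(instance_count, memory_pages)`.
    pub fn limits(&self) -> (u32, u64) {
        (self.instance_count, self.memory_pages)
    }
}
```

Setters chain, as in `cfg.instance_count(10).memory_pages(32)`, and options that are never mentioned keep their defaults.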
Joe Shaw
7b7eeac1be wiggle: fix compilation with async functions when tracing is off (#5203)
Fixes #5202
2022-11-04 11:43:00 -07:00
11evan
387426e7f4 cranelift: improve syscall error/oom handling in JIT module (#5173)
* cranelift: improve syscall error/oom handling in JIT module

The JIT module has several places where it `expect`s or `panic`s
on syscall or allocator errors. For example, `mmap` and `mprotect`
can fail if Linux `vm.max_map_count` is not high enough, and some
users may wish to handle this error rather than immediately
crashing.

This commit plumbs these errors upward as new `ModuleError`
types, so that callers of jit module functions like
`finalize_definitions` and `define_function` can handle them
(or just `unwrap()`, as desired).

* cranelift: Remove ModuleError::Syscall variant

Syscall errors can just be folded into the generic Backend error,
which is an anyhow::Error

* cranelift-jit: return io::ErrorKind::OutOfMemory for alloc failure

Just using `io::Error::last_os_error()` is not correct as global
allocator impls are not required to set errno
2022-11-03 16:59:41 -07:00
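The last point in 387426e7f4 (report allocation failure as `io::ErrorKind::OutOfMemory` instead of reading errno) can be sketched with the standard allocator API. `try_alloc` and `release` are hypothetical helpers, not functions from `cranelift-jit`, and rejecting zero-sized requests is a choice made for this sketch.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::io;

/// Allocate `size` bytes, returning an error instead of panicking.
pub fn try_alloc(size: usize, align: usize) -> io::Result<(*mut u8, Layout)> {
    let layout = Layout::from_size_align(size, align)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    if layout.size() == 0 {
        // Calling `alloc` with a zero-sized layout is undefined behavior.
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "zero-sized allocation"));
    }
    // SAFETY: `layout` has a non-zero size.
    let ptr = unsafe { alloc(layout) };
    if ptr.is_null() {
        // The global allocator need not set errno, so build the error
        // explicitly rather than via `io::Error::last_os_error()`.
        return Err(io::Error::from(io::ErrorKind::OutOfMemory));
    }
    Ok((ptr, layout))
}

/// Free memory obtained from `try_alloc`.
///
/// # Safety
/// `ptr` and `layout` must come from one `try_alloc` call and must not have
/// been released already.
pub unsafe fn release(ptr: *mut u8, layout: Layout) {
    unsafe { dealloc(ptr, layout) }
}
```

Callers can then match on the error kind, propagate it with `?`, or simply `unwrap()`, which mirrors the options the commit message describes for `ModuleError`.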
Johnnie Birch
5285ba15b1 Update format of benchmark results (#5060)
* Update format of benchmark results

* Use default formatted sightglass output
2022-11-03 13:54:17 -07:00
Ulrich Weigand
342f805812 Use vselect in NaN canonicalization pass. (#5192)
Change add_nan_canon_seq to use vselect instead of bitselect.
This is more straightforward and removes bitcast operations.
Codegen should be unchanged.
2022-11-03 20:36:38 +00:00