Commit Graph

157 Commits

Author SHA1 Message Date
Alex Crichton
c73be1f13a Use an mmap-friendly serialization format (#3257)
* Use an mmap-friendly serialization format

This commit reimplements the main serialization format for Wasmtime's
precompiled artifacts. Previously they were generally a binary blob of
`bincode`-encoded metadata prefixed with some versioning information.
The downside of this format, though, is that loading a precompiled
artifact required pushing all information through `bincode`. This is
inefficient when some data, such as trap/address tables, are rarely
accessed.

The new format added in this commit is one which is designed to be
`mmap`-friendly. This means that the relevant parts of the precompiled
artifact are already page-aligned for updating permissions of pieces
here and there. Additionally the artifact is optimized so that if data
is rarely read then we can delay reading it until necessary.

The new artifact format for serialized modules is an ELF file. This is
not a public API guarantee, so it cannot be relied upon. In the meantime
though this is quite useful for exploring precompiled modules with
standard tooling like `objdump`. The ELF file is already constructed as
part of module compilation, and this is the main contents of the
serialized artifact.

THere is some extra information, though, not encoded in each module's
individual ELF file such as type information. This information continues
to be `bincode`-encoded, but it's intended to be much smaller and much
faster to deserialize. This extra information is appended to the end of
the ELF file. This means that the original ELF file is still a valid ELF
file, we just get to have extra bits at the end. More information on the
new format can be found in the module docs of the serialization module
of Wasmtime.

Another refatoring implemented as part of this commit is to deserialize
and store object files directly in `mmap`-backed storage. This avoids
the need to copy bytes after the artifact is loaded into memory for each
compiled module, and in a future commit it opens up the door to avoiding
copying the text section into a `CodeMemory`. For now, though, the main
change is that copies are not necessary when loading from a precompiled
compilation artifact once the artifact is itself in mmap-based memory.

To assist with managing `mmap`-based memory a new `MmapVec` type was
added to `wasmtime_jit` which acts as a form of `Vec<T>` backed by a
`wasmtime_runtime::Mmap`. This type notably supports `drain(..N)` to
slice the buffer into disjoint regions that are all separately owned,
such as having a separately owned window into one artifact for all
object files contained within.

Finally this commit implements a small refactoring in `wasmtime-cache`
to use the standard artifact format for cache entries rather than a
bincode-encoded version. This required some more hooks for
serializing/deserializing but otherwise the crate still performs as
before.

* Review comments
2021-08-30 09:19:20 -05:00
Peter Huene
e2b9b54301 Add paged_memory_initialization to Config.
This commit adds a `paged_memory_initialization` setting to `Config`.

The setting controls whether or not an attempt is made to organize data
segments into Wasm pages during compilation.

When used in conjunction with the `uffd` feature on Linux, Wasmtime can
completely skip initializing linear memories and instead initialize any pages
that are accessed for the first time during Wasm execution.
2021-08-26 16:56:38 -07:00
Alex Crichton
da5c82b786 Fix a possible use-after-free introduced in #3231 (#3238)
In #3231 the wasm data sections were moved from the
`wasmtime_environ::Module` structure into the `CompilationArtifacts`.
Each `wasmtime_runtime::Instance` holds raw pointers into the data
section owned by the compilation artifacts under the assumption that the
runtime keeps the artifacts alive while the module is in use. Data is
needed beyond original initialization for `memory.init` instructions as
well as lazy-initialization with the `uffd` feature.

The intention of #3231 was that all `CompiledModule` structures, which
own `CompilationArtifacts` were owned by a store's `ModuleRegistry`, so
this was already taken care of. It turns out, however, that empty
modules which contain no functions are not held within a
`ModuleRegistry` since there was no need prior to retain them. This
commit remedies this mistake by retaining the `CompiledModule`
structure, even if there aren't any functions compiled in.

This should unblock #3235 and fixes the spurious error found there. The
test here, at least on Linux, will deterministically reproduce the error
before this commit since `uffd` was initializing wasm memory with free'd
host memory.
2021-08-25 12:14:13 -05:00
Alex Crichton
f3977f1d97 Fix determinism of compiled modules (#3229)
* Fix determinism of compiled modules

Currently wasmtime's compilation artifacts are not deterministic due to
the usage of `HashMap` during serialization which has randomized order
of its elements. This commit fixes that by switching to a sorted
`BTreeMap` for various maps. A test is also added to ensure determinism.

If in the future the performance of `BTreeMap` is not as good as
`HashMap` for some of these cases we can implement a fancier
`serialize_with`-style solution where we sort keys during serialization,
but only during serialization and otherwise use a `HashMap`.

* fix lightbeam
2021-08-23 17:08:19 -05:00
Alex Crichton
f5041dd362 Implement a setting for reserved dynamic memory growth (#3215)
* Implement a setting for reserved dynamic memory growth

Dynamic memories aren't really that heavily used in Wasmtime right now
because for most 32-bit memories they're classified as "static" which
means they reserve 4gb of address space and never move. Growth of a
static memory is simply making pages accessible, so it's quite fast.

With the memory64 feature, however, this is no longer true since all
memory64 memories are classified as "dynamic" at this time. Previous to
this commit growth of a dynamic memory unconditionally moved the entire
linear memory in the host's address space, always resulting in a new
`Mmap` allocation. This behavior is causing fuzzers to time out when
working with 64-bit memories because incrementally growing a memory by 1
page at a time can incur a quadratic time complexity as bytes are
constantly moved.

This commit implements a scheme where there is now a tunable setting for
memory to be reserved at the end of a dynamic memory to grow into. This
means that dynamic memory growth is ideally amortized as most calls to
`memory.grow` will be able to grow into the pre-reserved space. Some
calls, though, will still need to copy the memory around.

This helps enable a commented out test for 64-bit memories now that it's
fast enough to run in debug mode. This is because the growth of memory
in the test no longer needs to copy 4gb of zeros.

* Test fixes & review comments

* More comments
2021-08-20 10:54:23 -05:00
Alex Crichton
e8aa7bb53b Reimplement how unwind information is stored (#3180)
* Reimplement how unwind information is stored

This commit is a major refactoring of how unwind information is stored
after compilation of a function has finished. Previously we would store
the raw `UnwindInfo` as a result of compilation and this would get
serialized/deserialized alongside the rest of the ELF object that
compilation creates. Whenever functions were registered with
`CodeMemory` this would also result in registering unwinding information
dynamically at runtime, which in the case of Unix, for example, would
dynamically created FDE/CIE entries on-the-fly.

Eventually I'd like to support compiling Wasmtime without Cranelift, but
this means that `UnwindInfo` wouldn't be easily available to decode into
and create unwinding information from. To solve this I've changed the
ELF object created to have the unwinding information encoded into it
ahead-of-time so loading code into memory no longer needs to create
unwinding tables. This change has two different implementations for
Windows/Unix:

* On Windows the implementation was much easier. The unwinding
  information on Windows is already stored after the function itself in
  the text section. This was actually slightly duplicated in object
  building and in code memory allocation. Now the object building
  continues to do the same, recording unwinding information after
  functions, and code memory no longer manually tracks this.
  Additionally Wasmtime will emit a special custom section in the object
  file with unwinding information which is the list of
  `RUNTIME_FUNCTION` structures that `RtlAddFunctionTable` expects. This
  means that the object file has all the information precompiled into it
  and registration at runtime is simply passing a few pointers around to
  the runtime.

* Unix was a little bit more difficult than Windows. Today a `.eh_frame`
  section is created on-the-fly with offsets in FDEs specified as the
  absolute address that functions are loaded at. This absolute
  address hindered the ability to precompile the FDE into the object
  file itself. I've switched how addresses are encoded, though, to using
  `DW_EH_PE_pcrel` which means that FDE addresses are now specified
  relative to the FDE itself. This means that we can maintain a fixed
  offset between the `.eh_frame` loaded in memory and the beginning of
  code memory. When doing so this enables precompiling the `.eh_frame`
  section into the object file and at runtime when loading an object no
  further construction of unwinding information is needed.

The overall result of this commit is that unwinding information is no
longer stored in its cranelift-data-structure form on disk. This means
that this unwinding information format is only present during
compilation, which will make it that much easier to compile out
cranelift in the future.

This commit also significantly refactors `CodeMemory` since the way
unwinding information is handled is not much different from before.
Previously `CodeMemory` was suitable for incrementally adding more and
more functions to it, but nowadays a `CodeMemory` either lives per
module (in which case all functions are known up front) or it's created
once-per-`Func::new` with two trampolines. In both cases we know all
functions up front so the functionality of incrementally adding more and
more segments is no longer needed. This commit removes the ability to
add a function-at-a-time in `CodeMemory` and instead it can now only
load objects in their entirety. A small helper function is added to
build a small object file for trampolines in `Func::new` to handle
allocation there.

Finally, this commit also folds the `wasmtime-obj` crate directly into
the `wasmtime-cranelift` crate and its builder structure to be more
amenable to this strategy of managing unwinding tables.

It is not intentional to have any real functional change as a result of
this commit. This might accelerate loading a module from cache slightly
since less work is needed to manage the unwinding information, but
that's just a side benefit from the main goal of this commit which is to
remove the dependence on cranelift unwinding information being available
at runtime.

* Remove isa reexport from wasmtime-environ

* Trim down reexports of `cranelift-codegen`

Remove everything non-essential so that only the bits which will need to
be refactored out of cranelift remain.

* Fix debug tests

* Review comments
2021-08-17 17:14:18 -05:00
Alex Crichton
bd47a74dab Always call the resource limiter for memory allocations (#3189)
* Always call the resource limiter for memory allocations

Previously the memory64 support meant that sometimes we wouldn't call
the limiter because the calculation for the minimum size requested would
overflow. Instead Wasmtime now wraps the minimum size in something a bit
smaller than the address space to inform the limiter, which should
guarantee that although the limiter is called with "incorrect"
information it's effectively correct and is allowed a pass to learn that
a massive memory was requested.

This was found by the fuzzers where a request for the absolute maximal
size of 64-bit memory (e.g. the entire 64-bit address space) didn't
actually invoke the limiter which means that we mis-classified an
instantiation error and didn't realize that it was an OOM.

* Add a test
2021-08-16 12:49:56 -05:00
Alex Crichton
0313e30d76 Remove dependency on TargetIsa from Wasmtime crates (#3178)
This commit started off by deleting the `cranelift_codegen::settings`
reexport in the `wasmtime-environ` crate and then basically played
whack-a-mole until everything compiled again. The main result of this is
that the `wasmtime-*` family of crates have generally less of a
dependency on the `TargetIsa` trait and type from Cranelift. While the
dependency isn't entirely severed yet this is at least a significant
start.

This commit is intended to be largely refactorings, no functional
changes are intended here. The refactorings are:

* A `CompilerBuilder` trait has been added to `wasmtime_environ` which
  server as an abstraction used to create compilers and configure them
  in a uniform fashion. The `wasmtime::Config` type now uses this
  instead of cranelift-specific settings. The `wasmtime-jit` crate
  exports the ability to create a compiler builder from a
  `CompilationStrategy`, which only works for Cranelift right now. In a
  cranelift-less build of Wasmtime this is expected to return a trait
  object that fails all requests to compile.

* The `Compiler` trait in the `wasmtime_environ` crate has been souped
  up with a number of methods that Wasmtime and other crates needed.

* The `wasmtime-debug` crate is now moved entirely behind the
  `wasmtime-cranelift` crate.

* The `wasmtime-cranelift` crate is now only depended on by the
  `wasmtime-jit` crate.

* Wasm types in `cranelift-wasm` no longer contain their IR type,
  instead they only contain the `WasmType`. This is required to get
  everything to align correctly but will also be required in a future
  refactoring where the types used by `cranelift-wasm` will be extracted
  to a separate crate.

* I moved around a fair bit of code in `wasmtime-cranelift`.

* Some gdb-specific jit-specific code has moved from `wasmtime-debug` to
  `wasmtime-jit`.
2021-08-16 09:55:39 -05:00
Alex Crichton
e68aa99588 Implement the memory64 proposal in Wasmtime (#3153)
* Implement the memory64 proposal in Wasmtime

This commit implements the WebAssembly [memory64 proposal][proposal] in
both Wasmtime and Cranelift. In terms of work done Cranelift ended up
needing very little work here since most of it was already prepared for
64-bit memories at one point or another. Most of the work in Wasmtime is
largely refactoring, changing a bunch of `u32` values to something else.

A number of internal and public interfaces are changing as a result of
this commit, for example:

* Acessors on `wasmtime::Memory` that work with pages now all return
  `u64` unconditionally rather than `u32`. This makes it possible to
  accommodate 64-bit memories with this API, but we may also want to
  consider `usize` here at some point since the host can't grow past
  `usize`-limited pages anyway.

* The `wasmtime::Limits` structure is removed in favor of
  minimum/maximum methods on table/memory types.

* Many libcall intrinsics called by jit code now unconditionally take
  `u64` arguments instead of `u32`. Return values are `usize`, however,
  since the return value, if successful, is always bounded by host
  memory while arguments can come from any guest.

* The `heap_addr` clif instruction now takes a 64-bit offset argument
  instead of a 32-bit one. It turns out that the legalization of
  `heap_addr` already worked with 64-bit offsets, so this change was
  fairly trivial to make.

* The runtime implementation of mmap-based linear memories has changed
  to largely work in `usize` quantities in its API and in bytes instead
  of pages. This simplifies various aspects and reflects that
  mmap-memories are always bound by `usize` since that's what the host
  is using to address things, and additionally most calculations care
  about bytes rather than pages except for the very edge where we're
  going to/from wasm.

Overall I've tried to minimize the amount of `as` casts as possible,
using checked `try_from` and checked arithemtic with either error
handling or explicit `unwrap()` calls to tell us about bugs in the
future. Most locations have relatively obvious things to do with various
implications on various hosts, and I think they should all be roughly of
the right shape but time will tell. I mostly relied on the compiler
complaining that various types weren't aligned to figure out
type-casting, and I manually audited some of the more obvious locations.
I suspect we have a number of hidden locations that will panic on 32-bit
hosts if 64-bit modules try to run there, but otherwise I think we
should be generally ok (famous last words). In any case I wouldn't want
to enable this by default naturally until we've fuzzed it for some time.

In terms of the actual underlying implementation, no one should expect
memory64 to be all that fast. Right now it's implemented with
"dynamic" heaps which have a few consequences:

* All memory accesses are bounds-checked. I'm not sure how aggressively
  Cranelift tries to optimize out bounds checks, but I suspect not a ton
  since we haven't stressed this much historically.

* Heaps are always precisely sized. This means that every call to
  `memory.grow` will incur a `memcpy` of memory from the old heap to the
  new. We probably want to at least look into `mremap` on Linux and
  otherwise try to implement schemes where dynamic heaps have some
  reserved pages to grow into to help amortize the cost of
  `memory.grow`.

The memory64 spec test suite is scheduled to now run on CI, but as with
all the other spec test suites it's really not all that comprehensive.
I've tried adding more tests for basic things as I've had to implement
guards for them, but I wouldn't really consider the testing adequate
from just this PR itself. I did try to take care in one test to actually
allocate a 4gb+ heap and then avoid running that in the pooling
allocator or in emulation because otherwise that may fail or take
excessively long.

[proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

* Fix some tests

* More test fixes

* Fix wasmtime tests

* Fix doctests

* Revert to 32-bit immediate offsets in `heap_addr`

This commit updates the generation of addresses in wasm code to always
use 32-bit offsets for `heap_addr`, and if the calculated offset is
bigger than 32-bits we emit a manual add with an overflow check.

* Disable memory64 for spectest fuzzing

* Fix wrong offset being added to heap addr

* More comments!

* Clarify bytes/pages
2021-08-12 09:40:20 -05:00
Alex Crichton
c6b095f9a3 cranelift: Implement nan canonicalization for vectors (#3146)
This fixes some fuzz bugs that came about enabling simd where nan
canonicalization is performed on the fuzzers but cranelift would panic
on these ops for vectors. This adds some custom codegen with `bitselect`
to ensure any nan lanes are canonical-nan lanes in the canonicalized
operations.
2021-08-05 13:44:16 -05:00
Alex Crichton
4cfa031c5f Implement API support for v128-globals (#3147)
Found via fuzzing, and looks like these were accidentally left out along
the way SIMD was taking shape.
2021-08-05 13:02:34 -05:00
Alex Crichton
85f16f488d Consolidate address calculations for atomics (#3143)
* Consolidate address calculations for atomics

This commit consolidates all calcuations of guest addresses into one
`prepare_addr` function. This notably remove the atomics-specifics paths
as well as the `prepare_load` function (now renamed to `prepare_addr`
and folded into `get_heap_addr`).

The goal of this commit is to simplify how addresses are managed in the
code generator for atomics to use all the shared infrastrucutre of other
loads/stores as well. This additionally fixes #3132 via the use of
`heap_addr` in clif for all operations.

I also added a number of tests for loads/stores with varying alignments.
Originally I was going to allow loads/stores to not be aligned since
that's what the current formal specification says, but the overview of
the threads proposal disagrees with the formal specification, so I
figured I'd leave it as-is but adding tests probably doesn't hurt.

Closes #3132

* Fix old backend

* Guarantee misalignment checks happen before out-of-bounds
2021-08-04 15:57:56 -05:00
Alex Crichton
91d24b8448 Fix pooling tests on high-cpu-count systems (#3141)
This commit fixes an issue where `cargo test` was failing pretty
reliably on an 80-thread system where many of the pooling tests would
fail in `mmap` to reserve address space for the linear memories
allocated for a pooling allocator. Each test wants to reserve about 6TB
of address space, and if we let 80 tests do that apparently Linux
doesn't like that and starts returning errors from `mmap`.

The implementation here is a relatively simple semaphore-lookalike
which allows a fixed amount of concurrency in pooling tests.
2021-08-04 11:55:52 -05:00
Alex Crichton
63a3bbbf5a Change VMMemoryDefinition::current_length to usize (#3134)
* Change VMMemoryDefinition::current_length to `usize`

This commit changes the definition of
`VMMemoryDefinition::current_length` to `usize` from its previous
definition of `u32`. This is a pretty impactful change because it also
changes the cranelift semantics of "dynamic" heaps where the bound
global value specifier must now match the pointer type for the platform
rather than the index type for the heap.

The motivation for this change is that the `current_length` field (or
bound for the heap) is intended to reflect the current size of the heap.
This is bound by `usize` on the host platform rather than `u32` or`
u64`. The previous choice of `u32` couldn't represent a 4GB memory
because we couldn't put a number representing 4GB into the
`current_length` field. By using `usize`, which reflects the host's
memory allocation, this should better reflect the size of the heap and
allows Wasmtime to support a full 4GB heap for a wasm program (instead
of 4GB minus one page).

This commit also updates the legalization of the `heap_addr` clif
instruction to appropriately cast the address to the platform's pointer
type, handling bounds checks along the way. The practical impact for
today's targets is that a `uextend` is happening sooner than it happened
before, but otherwise there is no intended impact of this change. In the
future when 64-bit memories are supported there will likely need to be
fancier logic which handles offsets a bit differently (especially in the
case of a 64-bit memory on a 32-bit host).

The clif `filetest` changes should show the differences in codegen, and
the Wasmtime changes are largely removing casts here and there.

Closes #3022

* Add tests for memory.size at maximum memory size

* Add a dfg helper method
2021-08-02 13:09:40 -05:00
Alex Crichton
9b088756b3 Implement Linker::module_async (#3121)
This implements and adds the async counterpart of the `Linker::module`
method.

Closes #3077
2021-07-27 16:17:45 -05:00
Andrew Brown
4deed8fe50 refactor: move Wasm test files to tests/all/cli_tests
Previously the inputs to `tests/all/cli_tests.rs` were contained in
`tests/wasm`. This change moves them to the more obvious
`tests/all/cli_tests` directory and updates the paths that point to
them.
2021-07-21 16:17:16 -07:00
Alex Crichton
3da677796b Reword env var hint for dwarf debug info (#3081)
* Reword env var hint for dwarf debug info

Try not to declare that more information will indeed be displayed,
instead suggest that the output may improve if the env var is set since
dwarf debug info wasn't parsed.

cc bytecodealliance/wasmtime-go#90

* Fix test assertion
2021-07-15 16:33:47 -05:00
Alex Crichton
13d317a0a8 Fix stack checks of recursive async function calls (#3088)
* Fix stack checks of recursive async function calls

Previously the stack pointer limit wasn't adjusted, even in the face of
stack switching. This commit updates the logic around the stack limit
calculation to configure it on all async function calls, even if they're
recursive. Synchronous function calls, however, continue to only
configure the stack limit at the start, not for recursive calls.

* Update crates/wasmtime/src/func.rs

Co-authored-by: Peter Huene <peter@huene.dev>

Co-authored-by: Peter Huene <peter@huene.dev>
2021-07-14 16:32:30 -05:00
Ulan Degenbaev
f08491eeca Restore POSIX signal handling on MacOS behind a feature flag (#3063)
* Restore POSIX signal handling on MacOS behind a feature flag

As described in Issue #3052, the switch to Mach Exception handling
removed `unix::StoreExt` from the public API of crate on MacOS.
That is a breaking change and makes it difficult for some
application to upgrade to the current stable Wasmtime.

As a workaround this PR introduces a feature flag called
`posix-signals-on-macos` that restores the old behaviour on MacOS.
The flag is disabled by default.

* Fix test guard

* Fix formatting in the test
2021-07-12 16:25:44 -05:00
Peter Huene
89ed663058 Merge pull request #3068 from fitzgen/error-msg-off-by-one
Fix error messages reporting number of expected vs actual params
2021-07-07 17:50:13 -07:00
Nick Fitzgerald
be60fec6ba Fix error messages reporting number of expected vs actual params
We previously had some off-by-one errors in our error messages and this led to
very confusing messages like "expected 0 types, found 0" that were quite
annoying to debug as an API consumer.
2021-07-07 11:32:40 -07:00
Alex Crichton
b9985fe2e5 Change the injection count of fuel in a store from u32 to u64 (#3048)
* Change the injection count of fuel in a store from u32 to u64

This commit updates the type of the amount of times to inject fuel in
the `out_of_fuel_async_yield` to `u64` instead of `u32`. This should
allow effectively infinite fuel to get injected, even if a small amount
of fuel is injected per iteration.

Closes #2927
Closes #3046

* Fix tokio example
2021-07-01 10:46:21 -05:00
Ulrich Weigand
83007b79e3 Fix access to VMMemoryDefinition::current_length on big-endian (#3013)
The current_length member is defined as "usize" in Rust code,
but generated wasm code refers to it as if it were "u32".
While this happens to mostly work on little-endian machines
(as long as the length is < 4GB), it will always fail on
big-endian machines.

Fixed by making current_length "u32" in Rust as well, and
ensuring the actual memory size is always less than 4GB.
2021-06-23 11:45:32 -05:00
Ulrich Weigand
acdb388580 Fix offsets_static_dynamic_oh_my failure on s390x (#3015)
The test unconditionally emits a SIMD load, which fails on s390x
as we do not yet support SIMD.  Disable this particular part of
the test on s390x.
2021-06-22 09:14:07 -05:00
Alex Crichton
8760bccc8e Fix running enter/exit hooks on start functions (#3001)
This commit fixes running the store's enter/exit hooks into wasm which
accidentally weren't run for an instance's `start` function. The fix
here was mostly to just sink the enter/exit hook much lower in the code
to `invoke_wasm_and_catch_traps`, which is the common entry point for
all wasm calls.

This did involve propagating the `StoreContext<T>` generic rather than
using `StoreOpaque` unfortunately, but it is overally not too too much
code and we generally wanted most of it inlined anyway.
2021-06-21 16:31:10 -05:00
Anton Kirilov
cb93726250 Enable more tests on AArch64 (#2994)
Copyright (c) 2021, Arm Limited.
2021-06-21 12:26:44 -05:00
Alex Crichton
7ce46043dc Add guard pages to the front of linear memories (#2977)
* Add guard pages to the front of linear memories

This commit implements a safety feature for Wasmtime to place guard
pages before the allocation of all linear memories. Guard pages placed
after linear memories are typically present for performance (at least)
because it can help elide bounds checks. Guard pages before a linear
memory, however, are never strictly needed for performance or features.
The intention of a preceding guard page is to help insulate against bugs
in Cranelift or other code generators, such as CVE-2021-32629.

This commit adds a `Config::guard_before_linear_memory` configuration
option, defaulting to `true`, which indicates whether guard pages should
be present both before linear memories as well as afterwards. Guard
regions continue to be controlled by
`{static,dynamic}_memory_guard_size` methods.

The implementation here affects both on-demand allocated memories as
well as the pooling allocator for memories. For on-demand memories this
adjusts the size of the allocation as well as adjusts the calculations
for the base pointer of the wasm memory. For the pooling allocator this
will place a singular extra guard region at the very start of the
allocation for memories. Since linear memories in the pooling allocator
are contiguous every memory already had a preceding guard region in
memory, it was just the previous memory's guard region afterwards. Only
the first memory needed this extra guard.

I've attempted to write some tests to help test all this, but this is
all somewhat tricky to test because the settings are pretty far away
from the actual behavior. I think, though, that the tests added here
should help cover various use cases and help us have confidence in
tweaking the various `Config` settings beyond their defaults.

Note that this also contains a semantic change where
`InstanceLimits::memory_reservation_size` has been removed. Instead this
field is now inferred from the `static_memory_maximum_size` and guard
size settings. This should hopefully remove some duplication in these
settings, canonicalizing on the guard-size/static-size settings as the
way to control memory sizes and virtual reservations.

* Update config docs

* Fix a typo

* Fix benchmark

* Fix wasmtime-runtime tests

* Fix some more tests

* Try to fix uffd failing test

* Review items

* Tweak 32-bit defaults

Makes the pooling allocator a bit more reasonable by default on 32-bit
with these settings.
2021-06-18 09:57:08 -05:00
Pat Hickey
8b4bdf92e2 make ResourceLimiter operate on Store data; add hooks for entering and exiting native code (#2952)
* wasmtime_runtime: move ResourceLimiter defaults into this crate

In preparation of changing wasmtime::ResourceLimiter to be a re-export
of this definition, because translating between two traits was causing
problems elsewhere.

* wasmtime: make ResourceLimiter a re-export of wasmtime_runtime::ResourceLimiter

* refactor Store internals to support ResourceLimiter as part of store's data

* add hooks for entering and exiting native code to Store

* wasmtime-wast, fuzz: changes to adapt ResourceLimiter API

* fix tests

* wrap calls into wasm with entering/exiting exit hooks as well

* the most trivial test found a bug, lets write some more

* store: mark some methods as #[inline] on Store, StoreInner, StoreInnerMost

Co-authored-By: Alex Crichton <alex@alexcrichton.com>

* improve tests for the entering/exiting native hooks

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2021-06-08 09:37:00 -05:00
Pat Hickey
895ee2b85f make Module::deserialize's version check optional via Config (#2945)
* make Module::deserialize's version check optional via Config

A SerializedModule contains the CARGO_PKG_VERSION string, which is
checked for equality when loading. This is a great guard-rail but
some users may want to disable this check (e.g. so they can implement
their own versioning scheme)

* rename config to deserialize_check_wasmtime_version

* add test

* fix doc links

* fix

* thank you rustdoc
2021-06-04 14:18:02 -05:00
Alex Crichton
05baddfb2b Add the ability to cache typechecking an instance (#2962)
* Add the ability to cache typechecking an instance

This commit adds the abilty to cache the type-checked imports of an
instance if an instance is going to be instantiated multiple times. This
can also be useful to do a "dry run" of instantiation where no wasm code
is run but it's double-checked that a `Linker` possesses everything
necessary to instantiate the provided module.

This should ideally help cut down repeated instantiation costs slightly
by avoiding type-checking and allocation a `Vec<Extern>` on each
instantiation. It's expected though that the impact on instantiation
time is quite small and likely not super significant. The functionality,
though, of pre-checking can be useful for some embeddings.

* Fix build with async
2021-06-03 17:04:07 -05:00
Alex Crichton
7a1b7cdf92 Implement RFC 11: Redesigning Wasmtime's APIs (#2897)
Implement Wasmtime's new API as designed by RFC 11. This is quite a large commit which has had lots of discussion externally, so for more information it's best to read the RFC thread and the PR thread.
2021-06-03 09:10:53 -05:00
Pat Hickey
0f5bdc6497 only wasi_cap_std_sync and wasi_tokio need to define WasiCtxBuilders (#2917)
* wasmtime-wasi: re-exporting this WasiCtxBuilder was shadowing the right one

wasi-common's WasiCtxBuilder is really only useful wasi_cap_std_sync and
wasi_tokio to implement their own Builder on top of.

This re-export of wasi-common's is 1. not useful and 2. shadow's the
re-export of the right one in sync::*.

* wasi-common: eliminate WasiCtxBuilder, make the builder methods on WasiCtx instead

* delete wasi-common::WasiCtxBuilder altogether

just put those methods directly on &mut WasiCtx.

As a bonus, the sync and tokio WasiCtxBuilder::build functions
are no longer fallible!

* bench fixes

* more test fixes
2021-05-21 12:59:39 -05:00
Peter Huene
91f64d40d4 Implement the allow-unknown-exports option for the run command.
This commit implements the `--allow-unknown-exports` option to the CLI run
command that will ignore unknown exports in a command module rather than
return an error.

Fixes #2587.
2021-05-06 14:23:08 -07:00
Alex Crichton
7ec073cef1 Bring back per-thread lazy initialization (#2863)
* Bring back per-thread lazy initialization

Platforms Wasmtime supports may have per-thread initialization that
needs to run before WebAssembly. For example Unix needs to setup a
sigaltstack and macOS needs to set up mach ports. In #2757 this
per-thread setup was moved out of the invocation of a wasm function,
relying on the lack of Send for Store to initialize the thread at Store
creation time and never worry about it later.

This conflicted with [wasmtime's desired multithreading
story](https://github.com/bytecodealliance/wasmtime/pull/2812) so a new
[`Store::notify_switched_thread` was
added](https://github.com/bytecodealliance/wasmtime/pull/2822) to
explicitly indicate a Store has moved to another thread (if it unsafely
did so).

It turns out though that it's not always easy to determine when a
`Store` moves to a new thread. For example the Go bindings for Wasmtime
are generally unaware when a goroutine switches OS threads. This led to
https://github.com/bytecodealliance/wasmtime-go/issues/74 where a SIGILL
was left uncaught, making it appear that traps aren't working properly.

This commit revisits the decision in #2757 and moves per-thread
initialization back into the path of calling into WebAssembly. This is
differently from before, though, where there's still only one TLS access
on the path of calling into WebAssembly, unlike before where it was a
separate access. This allows us to get the speed benefits of #2757 as
well as the flexibility benefits of not having to explicitly move a
store between threads.

With this new ability this commit deletes the recently added
`Store::notify_switched_thread` method since it's no longer necessary.

* Fix a test compiling
2021-04-28 12:08:27 -05:00
Alex Crichton
8384f3a347 Bring back Module::deserialize (#2858)
* Bring back `Module::deserialize`

I thought I was being clever suggesting that `Module::deserialize` was
removed from #2791 by funneling all module constructors into
`Module::new`. As our studious fuzzers have found, though, this means
that `Module::new` is not safe currently to pass arbitrary user-defined
input into. Now one might pretty reasonable expect to be able to do
that, however, being a WebAssembly engine and all. This PR as a result
separates the `deserialize` part of `Module::new` back into
`Module::deserialize`.

This means that binary blobs created with `Module::serialize` and
`Engine::precompile_module` will need to be passed to
`Module::deserialize` to "rehydrate" them back into a `Module`. This
restores the property that it should be safe to pass arbitrary input to
`Module::new` since it's always expected to be a wasm module. This also
means that fuzzing will no longer attempt to fuzz `Module::deserialize`
which isn't something we want to do anyway.

* Fix an example

* Mark `Module::deserialize` as `unsafe`
2021-04-27 10:55:12 -05:00
Alex Crichton
196bcec6cf Process declared element segments for "possibly exported funcs" (#2851)
Now that we're using "possibly exported" as an impactful decision for
codegen (which trampolines to generate and which ABI a function has)
it's important that we calculate this property of a wasm function
correctly! Previously Wasmtime forgot to processed "declared" elements
in apart from active/passive element segments, but this updates Wasmtime
to ensure that these entries are processed and all the functions
contained within are flagged as "possibly exported".

Closes #2850
2021-04-20 16:52:51 -05:00
Peter Huene
f12b4c467c Add resource limiting to the Wasmtime API. (#2736)
* Add resource limiting to the Wasmtime API.

This commit adds a `ResourceLimiter` trait to the Wasmtime API.

When used in conjunction with `Store::new_with_limiter`, this can be used to
monitor and prevent WebAssembly code from growing linear memories and tables.

This is particularly useful when hosts need to take into account host resource
usage to determine if WebAssembly code can consume more resources.

A simple `StaticResourceLimiter` is also included with these changes that will
simply limit the size of linear memories or tables for all instances created in
the store based on static values.

* Code review feedback.

* Implemented `StoreLimits` and `StoreLimitsBuilder`.
* Moved `max_instances`, `max_memories`, `max_tables` out of `Config` and into
  `StoreLimits`.
* Moved storage of the limiter in the runtime into `Memory` and `Table`.
* Made `InstanceAllocationRequest` use a reference to the limiter.
* Updated docs.
* Made `ResourceLimiterProxy` generic to remove a level of indirection.
* Fixed the limiter not being used for `wasmtime::Memory` and
  `wasmtime::Table`.

* Code review feedback and bug fix.

* `Memory::new` now returns `Result<Self>` so that an error can be returned if
  the initial requested memory exceeds any limits placed on the store.

* Changed an `Arc` to `Rc` as the `Arc` wasn't necessary.

* Removed `Store` from the `ResourceLimiter` callbacks. Custom resource limiter
  implementations are free to capture any context they want, so no need to
  unnecessarily store a weak reference to `Store` from the proxy type.

* Fixed a bug in the pooling instance allocator where an instance would be
  leaked from the pool. Previously, this would only have happened if the OS was
  unable to make the necessary linear memory available for the instance. With
  these changes, however, the instance might not be created due to limits
  placed on the store. We now properly deallocate the instance on error.

* Added more tests, including one that covers the fix mentioned above.

* Code review feedback.

* Add another memory to `test_pooling_allocator_initial_limits_exceeded` to
  ensure a partially created instance is successfully deallocated.
* Update some doc comments for better documentation of `Store` and
  `ResourceLimiter`.
2021-04-19 09:19:20 -05:00
Benjamin Bouvier
ba73b458b8 Introduce a new API that allows notifying that a Store has moved to a new thread (#2822)
* Introduce a new API that allows notifying that a Store has moved to a new thread

* Add backlink to documentation, and mention the new API in the multithreading doc;
2021-04-16 11:15:35 -05:00
Alex Crichton
195bf0e29a Fully support multiple returns in Wasmtime (#2806)
* Fully support multiple returns in Wasmtime

For quite some time now Wasmtime has "supported" multiple return values,
but only in the mose bare bones ways. Up until recently you couldn't get
a typed version of functions with multiple return values, and never have
you been able to use `Func::wrap` with functions that return multiple
values. Even recently where `Func::typed` can call functions that return
multiple values it uses a double-indirection by calling a trampoline
which calls the real function.

The underlying reason for this lack of support is that cranelift's ABI
for returning multiple values is not possible to write in Rust. For
example if a wasm function returns two `i32` values there is no Rust (or
C!) function you can write to correspond to that. This commit, however
fixes that.

This commit adds two new ABIs to Cranelift: `WasmtimeSystemV` and
`WasmtimeFastcall`. The intention is that these Wasmtime-specific ABIs
match their corresponding ABI (e.g. `SystemV` or `WindowsFastcall`) for
everything *except* how multiple values are returned. For multiple
return values we simply define our own version of the ABI which Wasmtime
implements, which is that for N return values the first is returned as
if the function only returned that and the latter N-1 return values are
returned via an out-ptr that's the last parameter to the function.

These custom ABIs provides the ability for Wasmtime to bind these in
Rust meaning that `Func::wrap` can now wrap functions that return
multiple values and `Func::typed` no longer uses trampolines when
calling functions that return multiple values. Although there's lots of
internal changes there's no actual changes in the API surface area of
Wasmtime, just a few more impls of more public traits which means that
more types are supported in more places!

Another change made with this PR is a consolidation of how the ABI of
each function in a wasm module is selected. The native `SystemV` ABI,
for example, is more efficient at returning multiple values than the
wasmtime version of the ABI (since more things are in more registers).
To continue to take advantage of this Wasmtime will now classify some
functions in a wasm module with the "fast" ABI. Only functions that are
not reachable externally from the module are classified with the fast
ABI (e.g. those not exported, used in tables, or used with `ref.func`).
This should enable purely internal functions of modules to have a faster
calling convention than those which might be exposed to Wasmtime itself.

Closes #1178

* Tweak some names and add docs

* "fix" lightbeam compile

* Fix TODO with dummy environ

* Unwind info is a property of the target, not the ABI

* Remove lightbeam unused imports

* Attempt to fix arm64

* Document new ABIs aren't stable

* Fix filetests to use the right target

* Don't always do 64-bit stores with cranelift

This was overwriting upper bits when 32-bit registers were being stored
into return values, so fix the code inline to do a sized store instead
of one-size-fits-all store.

* At least get tests passing on the old backend

* Fix a typo

* Add some filetests with mixed abi calls

* Get `multi` example working

* Fix doctests on old x86 backend

* Add a mixture of wasmtime/system_v tests
2021-04-07 12:34:26 -05:00
Benjamin Bouvier
7588565078 Tweaks some tests for Mac aarch64
- some tests don't pass because of bad interactions with the system's
libunwind; ignore them for now.
- the page size on mac aarch64 is 16K, not 4K; tweak some tests which
were expecting 4K or multiples of 4K pages to use a multiple of host page size
instead.
- a cranelift-native test needed an update for the new calling convention.
2021-04-07 14:54:50 +02:00
Chris Fallin
6bec13da04 Bump versions: Wasmtime to 0.26.0, Cranelift to 0.73.0. 2021-04-05 10:48:42 -07:00
Chris Fallin
8d78212a15 Merge pull request #2718 from cfallin/new-backend
Switch default to new x86_64 backend.
2021-04-05 09:38:08 -07:00
Alex Crichton
04bf6e5bbb Move some scopes around to fix a leak on raising a trap (#2803)
Some recent refactorings accidentally had a local `Store` on the stack
when a longjmp was initiated, bypassing its destructor and causing
`Store` to leak.

Closes #2802
2021-04-05 10:29:18 -05:00
Peter Huene
37bb7af454 Fix incorrect range in ininitialize_instance.
This commit fixes a bug where the wrong destination range was used when copying
data from the module's memory initialization upon instance initialization.

This affects the on-demand allocator only when using the `uffd` feature on
Linux and when the Wasm page being initialized is not the last in the module's
initial pages.

Fixes #2784.
2021-04-02 16:27:22 -07:00
Chris Fallin
cb48ea406e Switch default to new x86_64 backend.
This PR switches the default backend on x86, for both the
`cranelift-codegen` crate and for Wasmtime, to the new
(`MachInst`-style, `VCode`-based) backend that has been under
development and testing for some time now.

The old backend is still available by default in builds with the
`old-x86-backend` feature, or by requesting `BackendVariant::Legacy`
from the appropriate APIs.

As part of that switch, it adds some more runtime-configurable plumbing
to the testing infrastructure so that tests can be run using the
appropriate backend. `clif-util test` is now capable of parsing a
backend selector option from filetests and instantiating the correct
backend.

CI has been updated so that the old x86 backend continues to run its
tests, just as we used to run the new x64 backend separately.

At some point, we will remove the old x86 backend entirely, once we are
satisfied that the new backend has not caused any unforeseen issues and
we do not need to revert.
2021-04-02 11:35:53 -07:00
Peter Huene
b7b47e380d Merge pull request #2791 from peterhuene/compile-command
Add a compile command to Wasmtime.
2021-04-02 11:18:14 -07:00
Alex Crichton
29949505d6 Fix printing float results from the CLI (#2797)
Previously their bit patterns were printed interpreted as decimals, now
they're printed as floats.
2021-04-02 10:07:59 -05:00
Peter Huene
d1313b1291 Code review feedback.
* Move `Module::compile` to `Engine::precompile_module`.
* Remove `Module::deserialize` method.
* Make `Module::serialize` the same format as `Engine::precompile_module`.
* Make `Engine::precompile_module` return a `Vec<u8>`.
* Move the remaining serialization-related code to `serialization.rs`.
2021-04-01 19:38:19 -07:00
Peter Huene
abf3bf29f9 Add a wasmtime settings command to print Cranelift settings.
This commit adds the `wasmtime settings` command to print out available
Cranelift settings for a target (defaults to the host).

The compile command has been updated to remove the Cranelift ISA options in
favor of encouraging users to use `wasmtime settings` to discover what settings
are available.  This will reduce the maintenance cost for syncing the compile
command with Cranelift ISA flags.
2021-04-01 19:38:19 -07:00
Peter Huene
a474524d3b Code review feedback.
* Removed `Config::cranelift_clear_cpu_flags`.
* Renamed `Config::cranelift_other_flag` to `Config::cranelift::flag_set`.
* Renamed `--cranelift-flag` to `--cranelift-set`.
* Renamed `--cranelift-preset` to `--cranelift-enable`.
2021-04-01 19:38:19 -07:00