Commit Graph

8667 Commits

Author SHA1 Message Date
Dan Gohman
dd7e16762c Arrange for the new test to be called. 2021-03-22 12:50:16 -07:00
Dan Gohman
6b40724d18 Support "sleep" forms of poll_oneoff.
Add support for `poll_oneoff` calls which just sleep on a relative
timeout. This fixes a bug handling code compiled with WASI libc's `sleep`
family of functions, which call `poll_oneoff` with a `CLOCK_REALTIME`
timer, which wasn't previously implemented.
2021-03-22 12:50:16 -07:00
Dan Gohman
cba0144612 Use min_by instead of sort_by when we only want the minimum element.
This is just a minor code simplification I happened to notice while
doing unrelated work on `poll_oneoff`.
2021-03-22 11:08:28 -07:00
Peter Huene
4471d27567 Merge pull request #2741 from peterhuene/refactor-fiber-stacks
Split out fiber stacks from fibers.
2021-03-22 11:05:16 -07:00
bjorn3
b321a7291d Clarify ownership of data returned by get_finalized_* 2021-03-22 09:48:09 -07:00
Benjamin Bouvier
6e6713ae0b cranelift: add support for the Mac aarch64 calling convention
This bumps target-lexicon and adds support for the AppleAarch64 calling
convention. Specifically for WebAssembly support, we only have to worry
about the new stack slots convention. Stack slots don't need to be at
least 8-bytes, they can be as small as the data type's size. For
instance, if we need stack slots for (i32, i32), they can be located at
offsets (+0, +4). Note that they still need to be properly aligned on
the data type they're containing, though, so if we need stack slots for
(i32, i64), we can't start the i64 slot at the +4 offset (it must start
at the +8 offset).

Added one test that was failing on the Mac M1, as well as other tests
stressing different yet similar situations.
2021-03-22 10:06:13 +01:00
bjorn3
cc89111463 Support declaring anonymous functions and data objects 2021-03-21 18:00:26 +01:00
MaxGraey
1425c1e7bf fix emit_small_memset
use shifts


format


more


fix


revert to multiply
2021-03-21 17:03:25 +02:00
bjorn3
1639f2c844 Allow passing arbitrary MemFlags to emit_small_mem{cpy,move} 2021-03-20 19:22:55 +01:00
Peter Huene
8e43e96410 Merge pull request #2743 from dalinaum/patch-1
Fix an incorrect link.
2021-03-20 00:07:54 -07:00
Peter Huene
e6dda413a4 Code review feedback.
* Add assert to `StackPool::deallocate` to ensure the fiber stack given to it
  comes from the pool.
* Remove outdated comment about windows and stacks as the allocator now returns
  fiber stacks.
* Remove conditional compilation around `stack_size` in the allocators as it
  was just clutter.
2021-03-20 00:05:08 -07:00
Peter Huene
f556bd18a7 Set the thread stack guarantee for fibers on Windows.
This commit fixes the Windows implementation of fibers in Wasmtime to
reserve enough staack space for Rust to handle any stack overflow
exceptions.
2021-03-19 14:48:36 -07:00
LYK
3f2d36d532 Fix an incorrect link. 2021-03-20 03:41:03 +09:00
Peter Huene
8e34022784 Add tests for hitting fiber stack guard pages. 2021-03-18 23:57:42 -07:00
Peter Huene
f8f51afac1 Split out fiber stacks from fibers.
This commit splits out a `FiberStack` from `Fiber`, allowing the instance
allocator trait to return `FiberStack` rather than raw stack pointers. This
keeps the stack creation mostly in `wasmtime_fiber`, but now the on-demand
instance allocator can make use of it.

The instance allocators no longer have to return a "not supported" error to
indicate that the store should allocate its own fiber stack.

This includes a bunch of cleanup in the instance allocator to scope stacks to
the new "async" feature in the runtime.

Closes #2708.
2021-03-18 20:21:02 -07:00
Pat Hickey
0394a01194 Merge pull request #2739 from wrbs/stack-map-sink
cranelift-module: Add support for passing a StackMapSink when defining functions
2021-03-18 17:49:36 -07:00
Will Robson
38926fb1fc cranelift-module: Add support for passing a StackMapSink when defining functions
Fixes #2738

This follows the convention set by the existing method of passing a
TrapSink by adding another argument for a StackMapSink.
2021-03-19 00:02:15 +00:00
Chris Fallin
59dfe4b9f4 Merge pull request #2740 from cfallin/fix-lldb-ci
Explicitly install LLDB in CI to fix intermittent failure on Ubuntu 20.04 image.
2021-03-18 12:34:52 -07:00
Chris Fallin
69f27c06d2 Explicitly install LLDB in CI to fix intermittent failure on 20.04 image. 2021-03-18 11:13:14 -07:00
Benjamin Bouvier
5fecdfa491 Mach ports continued + support aarch64-apple unwinding (#2723)
* Switch macOS to using mach ports for trap handling

This commit moves macOS to using mach ports instead of signals for
handling traps. The motivation for this is listed in #2456, namely that
once mach ports are used in a process that means traditional UNIX signal
handlers won't get used. This means that if Wasmtime is integrated with
Breakpad, for example, then Wasmtime's trap handler never fires and
traps don't work.

The `traphandlers` module is refactored as part of this commit to split
the platform-specific bits into their own files (it was growing quite a
lot for one inline `cfg_if!`). The `unix.rs` and `windows.rs` files
remain the same as they were before with a few minor tweaks for some
refactored interfaces. The `macos.rs` file is brand new and lifts almost
its entire implementation from SpiderMonkey, adapted for Wasmtime
though.

The main gotcha with mach ports is that a separate thread is what
services the exception. Some unsafe magic allows this separate thread to
read non-`Send` and temporary state from other threads, but is hoped to
be safe in this context. The unfortunate downside is that calling wasm
on macOS now involves taking a global lock and modifying a global hash
map twice-per-call. I'm not entirely sure how to get out of this cost
for now, but hopefully for any embeddings on macOS it's not the end of
the world.

Closes #2456

* Add a sketch of arm64 apple support

* store: maintain CallThreadState mapping when switching fibers

* cranelift/aarch64: generate unwind directives to disable pointer auth

Aarch64 post ARMv8.3 has a feature called pointer authentication,
designed to fight ROP/JOP attacks: some pointers may be signed using new
instructions, adding payloads to the high (previously unused) bits of
the pointers. More on this here: https://lwn.net/Articles/718888/

Unwinders on aarch64 need to know if some pointers contained on the call
frame contain an authentication code or not, to be able to properly
authenticate them or use them directly. Since native code may have
enabled it by default (as is the case on the Mac M1), and the default is
that this configuration value is inherited, we need to explicitly
disable it, for the only kind of supported pointers (return addresses).

To do so, we set the value of a non-existing dwarf pseudo register (34)
to 0, as documented in
https://github.com/ARM-software/abi-aa/blob/master/aadwarf64/aadwarf64.rst#note-8.

This is done at the function granularity, in the spirit of Cranelift
compilation model. Alternatively, a single directive could be generated
in the CIE, generating less information per module.

* Make exception handling work on Mac aarch64 too

* fibers: use a breakpoint instruction after the final call in wasmtime_fiber_start

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
2021-03-17 09:43:22 -05:00
Benjamin Bouvier
4603b3b292 Bump dependencies to get a single version of rand (#2733)
This removes a few crates in the dependencies, and a few exceptions (at
the price of a new one) in the cargo-deny configuration.
2021-03-17 09:07:50 -05:00
Nick Fitzgerald
a8aaf812ef Merge pull request #2731 from fitzgen/make-0.25.0-release
Make 0.25.0 release
2021-03-16 13:20:42 -07:00
Nick Fitzgerald
fe933c601a Fix day on date in RELEASES.md 2021-03-16 12:34:09 -07:00
Nick Fitzgerald
72b2bde808 Fix year on date in RELEASES.md
Co-authored-by: Rémy Rakic <remy.rakic+github@gmail.com>
2021-03-16 12:33:22 -07:00
Nick Fitzgerald
2b57cd16c8 Fix date formatting in RELEASES.md
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
2021-03-16 11:26:04 -07:00
Nick Fitzgerald
b92893e11c Add the release date for 0.25.0 in RELEASES.md 2021-03-16 11:04:24 -07:00
Nick Fitzgerald
d081ef9c2e Bump Wasmtime to 0.25.0; Cranelift to 0.72.0 2021-03-16 11:02:56 -07:00
Dan Gohman
2d3f2adf04 Fix nondeterministic failures in poll_oneoff_stdio.
Adjust this test so that it tolerates poll_oneoff returning that both a
timeout occurred and an input is ready for reading, at the same time.
2021-03-15 11:48:19 -07:00
Chris Fallin
a46daa7eee Merge pull request #2724 from akirilov-arm/aarch64_atomics
Cranelift AArch64: Add initial support for the Armv8.1 atomics
2021-03-15 11:21:48 -07:00
Anton Kirilov
07c27039b1 Cranelift AArch64: Add initial support for the Armv8.1 atomics
This commit enables Cranelift's AArch64 backend to generate code
for instruction set extensions (previously only the base Armv8-A
architecture was supported); also, it makes it possible to detect
the extensions supported by the host when JIT compiling. The new
functionality is applied to the IR instruction `AtomicCas`.

Copyright (c) 2021, Arm Limited.
2021-03-13 02:31:51 +00:00
Chris Fallin
df6812b855 Merge pull request #2710 from cfallin/x64-fastcall-unwind
Rework/simplify unwind infrastructure, implement Windows unwind, and add Windows/new-backend to CI.
2021-03-11 20:47:58 -08:00
Chris Fallin
2d5db92a9e Rework/simplify unwind infrastructure and implement Windows unwind.
Our previous implementation of unwind infrastructure was somewhat
complex and brittle: it parsed generated instructions in order to
reverse-engineer unwind info from prologues. It also relied on some
fragile linkage to communicate instruction-layout information that VCode
was not designed to provide.

A much simpler, more reliable, and easier-to-reason-about approach is to
embed unwind directives as pseudo-instructions in the prologue as we
generate it. That way, we can say what we mean and just emit it
directly.

The usual reasoning that leads to the reverse-engineering approach is
that metadata is hard to keep in sync across optimization passes; but
here, (i) prologues are generated at the very end of the pipeline, and
(ii) if we ever do a post-prologue-gen optimization, we can treat unwind
directives as black boxes with unknown side-effects, just as we do for
some other pseudo-instructions today.

It turns out that it was easier to just build this for both x64 and
aarch64 (since they share a factored-out ABI implementation), and wire
up the platform-specific unwind-info generation for Windows and SystemV.
Now we have simpler unwind on all platforms and we can delete the old
unwind infra as soon as we remove the old backend.

There were a few consequences to supporting Fastcall unwind in
particular that led to a refactor of the common ABI. Windows only
supports naming clobbered-register save locations within 240 bytes of
the frame-pointer register, whatever one chooses that to be (RSP or
RBP). We had previously saved clobbers below the fixed frame (and below
nominal-SP). The 240-byte range has to include the old RBP too, so we're
forced to place clobbers at the top of the frame, just below saved
RBP/RIP. This is fine; we always keep a frame pointer anyway because we
use it to refer to stack args. It does mean that offsets of fixed-frame
slots (spillslots, stackslots) from RBP are no longer known before we do
regalloc, so if we ever want to index these off of RBP rather than
nominal-SP because we add support for `alloca` (dynamic frame growth),
then we'll need a "nominal-BP" mode that is resolved after regalloc and
clobber-save code is generated. I added a comment to this effect in
`abi_impl.rs`.

The above refactor touched both x64 and aarch64 because of shared code.
This had a further effect in that the old aarch64 prologue generation
subtracted from `sp` once to allocate space, then used stores to `[sp,
offset]` to save clobbers. Unfortunately the offset only has 7-bit
range, so if there are enough clobbered registers (and there can be --
aarch64 has 384 bytes of registers; at least one unit test hits this)
the stores/loads will be out-of-range. I really don't want to synthesize
large-offset sequences here; better to go back to the simpler
pre-index/post-index `stp r1, r2, [sp, #-16]` form that works just like
a "push". It's likely not much worse microarchitecturally (dependence
chain on SP, but oh well) and it actually saves an instruction if
there's no other frame to allocate. As a further advantage, it's much
simpler to understand; simpler is usually better.

This PR adds the new backend on Windows to CI as well.
2021-03-11 20:03:52 -08:00
Peter Huene
71093ff91b Merge pull request #2722 from peterhuene/update-release-notes
Update RELEASES.md to mention the change to `Engine::new`.
2021-03-11 16:40:48 -08:00
Peter Huene
6925314738 Update RELEASES.md to mention the change to Engine::new. 2021-03-11 14:14:32 -08:00
Alex Crichton
fb0dc1045f Update release notes for next version (#2721)
Add some notes for major features which have landed
2021-03-11 16:06:41 -06:00
Alex Crichton
2697a18d2f Redo the statically typed Func API (#2719)
* Redo the statically typed `Func` API

This commit reimplements the `Func` API with respect to statically typed
dispatch. Previously `Func` had a `getN` and `getN_async` family of
methods which were implemented for 0 to 16 parameters. The return value
of these functions was an `impl Fn(..)` closure with the appropriate
parameters and return values.

There are a number of downsides with this approach that have become
apparent over time:

* The addition of `*_async` doubled the API surface area (which is quite
  large here due to one-method-per-number-of-parameters).
* The [documentation of `Func`][old-docs] are quite verbose and feel
  "polluted" with all these getters, making it harder to understand the
  other methods that can be used to interact with a `Func`.
* These methods unconditionally pay the cost of returning an owned `impl
  Fn` with a `'static` lifetime. While cheap, this is still paying the
  cost for cloning the `Store` effectively and moving data into the
  closed-over environment.
* Storage of the return value into a struct, for example, always
  requires `Box`-ing the returned closure since it otherwise cannot be
  named.
* Recently I had the desire to implement an "unchecked" path for
  invoking wasm where you unsafely assert the type signature of a wasm
  function. Doing this with today's scheme would require doubling
  (again) the API surface area for both async and synchronous calls,
  further polluting the documentation.

The main benefit of the previous scheme is that by returning a `impl Fn`
it was quite easy and ergonomic to actually invoke the function. In
practice, though, examples would often have something akin to
`.get0::<()>()?()?` which is a lot of things to interpret all at once.
Note that `get0` means "0 parameters" yet a type parameter is passed.
There's also a double function invocation which looks like a lot of
characters all lined up in a row.

Overall, I think that the previous design is starting to show too many
cracks and deserves a rewrite. This commit is that rewrite.

The new design in this commit is to delete the `getN{,_async}` family of
functions and instead have a new API:

    impl Func {
        fn typed<P, R>(&self) -> Result<&Typed<P, R>>;
    }

    impl Typed<P, R> {
        fn call(&self, params: P) -> Result<R, Trap>;
        async fn call_async(&self, params: P) -> Result<R, Trap>;
    }

This should entirely replace the current scheme, albeit by slightly
losing ergonomics use cases. The idea behind the API is that the
existence of `Typed<P, R>` is a "proof" that the underlying function
takes `P` and returns `R`. The `Func::typed` method peforms a runtime
type-check to ensure that types all match up, and if successful you get
a `Typed` value. Otherwise an error is returned.

Once you have a `Typed` then, like `Func`, you can either `call` or
`call_async`. The difference with a `Typed`, however, is that the
params/results are statically known and hence these calls can be much
more efficient.

This is a much smaller API surface area from before and should greatly
simplify the `Func` documentation. There's still a problem where
`Func::wrapN_async` produces a lot of functions to document, but that's
now the sole offender. It's a nice benefit that the
statically-typed-async verisons are now expressed with an `async`
function rather than a function-returning-a-future which makes it both
more efficient and easier to understand.

The type `P` and `R` are intended to either be bare types (e.g. `i32`)
or tuples of any length (including 0). At this time `R` is only allowed
to be `()` or a bare `i32`-style type because multi-value is not
supported with a native ABI (yet). The `P`, however, can be any size of
tuples of parameters. This is also where some ergonomics are lost
because instead of `f(1, 2)` you now have to write `f.call((1, 2))`
(note the double-parens). Similarly `f()` becomes `f.call(())`.

Overall I feel that this is a better tradeoff than before. While not
universally better due to the loss in ergonomics I feel that this design
is much more flexible in terms of what you can do with the return value
and also understanding the API surface area (just less to take in).

[old-docs]: https://docs.rs/wasmtime/0.24.0/wasmtime/struct.Func.html#method.get0

* Rename Typed to TypedFunc

* Implement multi-value returns through `Func::typed`

* Fix examples in docs

* Fix some more errors

* More test fixes

* Rebasing and adding `get_typed_func`

* Updating tests

* Fix typo

* More doc tweaks

* Tweak visibility on `Func::invoke`

* Fix tests again
2021-03-11 14:43:34 -06:00
Alex Crichton
918c012d00 Fix some issues around TLS management with async (#2709)
This commit fixes a few issues around managing the thread-local state of
a wasmtime thread. We intentionally only have a singular TLS variable in
the whole world, and the problem is that when stack-switching off an
async thread we were not restoring the previous TLS state. This is
necessary in two cases:

* Futures aren't guaranteed to be polled/completed in a stack-like
  fashion. If a poll sees that a future isn't ready then we may resume
  execution in a previous wasm context that ends up needing the TLS
  information.

* Futures can also cross threads (when the whole store crosses threads)
  and we need to save/restore TLS state from the thread we're coming
  from and the thread that we're going to.

The stack switching issue necessitates some more glue around suspension
and resumption of a stack to ensure we save/restore the TLS state on
both sides. The thread issue, however, also necessitates that we use
`#[inline(never)]` on TLS access functions and never have TLS borrows
live across a function which could result in running arbitrary code (as
was the case for the `tls::set` function.
2021-03-11 11:32:33 -06:00
Peter Huene
54c07d8f16 Implement shared host functions. (#2625)
* Implement defining host functions at the Config level.

This commit introduces defining host functions at the `Config` rather than with
`Func` tied to a `Store`.

The intention here is to enable a host to define all of the functions once
with a `Config` and then use a `Linker` (or directly with
`Store::get_host_func`) to use the functions when instantiating a module.

This should help improve the performance of use cases where a `Store` is
short-lived and redefining the functions at every module instantiation is a
noticeable performance hit.

This commit adds `add_to_config` to the code generation for Wasmtime's `Wasi`
type.

The new method adds the WASI functions to the given config as host functions.

This commit adds context functions to `Store`: `get` to get a context of a
particular type and `set` to set the context on the store.

For safety, `set` cannot replace an existing context value of the same type.

`Wasi::set_context` was added to set the WASI context for a `Store` when using
`Wasi::add_to_config`.

* Add `Config::define_host_func_async`.

* Make config "async" rather than store.

This commit moves the concept of "async-ness" to `Config` rather than `Store`.

Note: this is a breaking API change for anyone that's already adopted the new
async support in Wasmtime.

Now `Config::new_async` is used to create an "async" config and any `Store`
associated with that config is inherently "async".

This is needed for async shared host functions to have some sanity check during their
execution (async host functions, like "async" `Func`, need to be called with
the "async" variants).

* Update async function tests to smoke async shared host functions.

This commit updates the async function tests to also smoke the shared host
functions, plus `Func::wrap0_async`.

This also changes the "wrap async" method names on `Config` to
`wrap$N_host_func_async` to slightly better match what is on `Func`.

* Move the instance allocator into `Engine`.

This commit moves the instantiated instance allocator from `Config` into
`Engine`.

This makes certain settings in `Config` no longer order-dependent, which is how
`Config` should ideally be.

This also removes the confusing concept of the "default" instance allocator,
instead opting to construct the on-demand instance allocator when needed.

This does alter the semantics of the instance allocator as now each `Engine`
gets its own instance allocator rather than sharing a single one between all
engines created from a configuration.

* Make `Engine::new` return `Result`.

This is a breaking API change for anyone using `Engine::new`.

As creating the pooling instance allocator may fail (likely cause is not enough
memory for the provided limits), instead of panicking when creating an
`Engine`, `Engine::new` now returns a `Result`.

* Remove `Config::new_async`.

This commit removes `Config::new_async` in favor of treating "async support" as
any other setting on `Config`.

The setting is `Config::async_support`.

* Remove order dependency when defining async host functions in `Config`.

This commit removes the order dependency where async support must be enabled on
the `Config` prior to defining async host functions.

The check is now delayed to when an `Engine` is created from the config.

* Update WASI example to use shared `Wasi::add_to_config`.

This commit updates the WASI example to use `Wasi::add_to_config`.

As only a single store and instance are used in the example, it has no semantic
difference from the previous example, but the intention is to steer users
towards defining WASI on the config and only using `Wasi::add_to_linker` when
more explicit scoping of the WASI context is required.
2021-03-11 10:14:03 -06:00
Chris Fallin
05688aa8f4 Add Windows/MinGW to CI for the new backend in order to test Fastcall. 2021-03-10 18:41:39 -08:00
Christopher Serr
cc84c693a3 wasi-common: Timestamps should be in nanoseconds (#2717)
Sleeping takes 1000x longer than it should because the timestamps are
interpreted as microseconds by accident.
2021-03-10 09:09:34 -06:00
Peter Huene
f8cc824396 Merge pull request #2518 from peterhuene/add-allocator
Implement the pooling instance allocator.
2021-03-08 12:20:31 -08:00
Chris Fallin
58769e5006 Merge pull request #2714 from Amanieu/more_entitylist
EntityList improvments
2021-03-08 11:39:55 -08:00
Peter Huene
623290d42e Use anyhow::Error in instantiation errors.
This commit updates the error enums used in instantiation errors to encapsulate
an `anyhow::Error` rather than a string.
2021-03-08 11:27:30 -08:00
Peter Huene
5fa0f8d469 Move linear memory faulted guard page tracking into Memory.
This commit moves the tracking for faulted guard pages in a linear memory into
`Memory`.
2021-03-08 11:27:25 -08:00
Chris Fallin
7a780d9589 Merge pull request #2713 from abrown/memory_lane_access
[simd] Implement load*_lane and store*_lane
2021-03-08 11:25:15 -08:00
Amanieu d'Antras
9b1693aa72 Add EntityList::truncate 2021-03-08 18:21:02 +00:00
Amanieu d'Antras
65d0bc58d2 Add EntityList::deep_clone 2021-03-08 18:20:46 +00:00
Amanieu d'Antras
b2abe74f25 Improve codegen for remove and swap_remove on EntityList 2021-03-08 18:20:05 +00:00
Andrew Brown
352e51f68d [simd] Implement load*_lane and store*_lane
The Wasm SIMD specification has added new instructions that allow inserting to the lane of a vector from a memory location, and conversely, extracting from a lane of a vector to a memory location. The simplest implementation lowers these instructions, `load[8|16|32|64]_lane` and `store[8|16|32|64]_lane`, to a sequence of either `load + insertlane` or `extractlane + store` (in CLIF). With the new backend's pattern matching, we expect these CLIF sequences to compile as a single machine instruction (at least in x64).
2021-03-08 09:49:44 -08:00
Peter Huene
7a93132ffa Code review feedback.
* Improve comments.
* Drop old table element *after* updating the table.
* Extract out the same `cfg_if!` to a single constant.
2021-03-08 09:04:13 -08:00