Commit Graph

9103 Commits

Author SHA1 Message Date
Chris Fallin
8876bd4fa3 Merge pull request #3396 from sunfishcode/sunishcode/flags-none
Use `empty()` instead of `NONE` with rsix flags types.
2021-09-30 08:58:58 -07:00
Dan Gohman
e5ebef1b94 Use empty() instead of NONE with rsix flags types.
`empty()` is provided by all `bitflags` types, so it's more idiomatic
than having `NONE` values.
2021-09-30 08:14:13 -07:00
Alex Crichton
49767c7379 Validate functions in parallel in Module::validate (#3394)
We already validate wasm functions in parallel when compiling a module,
but the same parallelism wasn't available to the `Module::validate` API.
This commit peforms a minor tweak to the validate-the-whole-module API
to validate all functions in parallel in the same manner that module
compilation does.
2021-09-30 09:09:12 -05:00
Dan Gohman
fa108d9a86 Remove the rsix dependency in cranelift-native. (#3395)
Revert the part of 47490b4383 which
changed cranelift-native to use rsix. It's just one call, and this lets
Cranelift users that don't otherwise depend on rsix to avoid it.
2021-09-30 06:11:29 -07:00
bjorn3
463a88e002 Rename lookup_variant to lookup 2021-09-30 12:42:45 +02:00
bjorn3
4b6d20d03f Fix extend test for AArch64 2021-09-29 19:45:49 +02:00
bjorn3
3fae9e5fa9 Remove outdated tests from cranelift-codegen-meta 2021-09-29 18:43:04 +02:00
bjorn3
a2040542ce Remove unused fields 2021-09-29 18:24:24 +02:00
bjorn3
71907184a5 Rustfmt 2021-09-29 18:21:47 +02:00
bjorn3
a646f68553 Remove legacy x86_64 backend tests 2021-09-29 17:37:23 +02:00
bjorn3
53ec12d519 Rustfmt 2021-09-29 16:27:47 +02:00
bjorn3
9e5201d88f Fix all dead-code warnings in cranelift-codegen 2021-09-29 16:27:47 +02:00
bjorn3
3e4167ba95 Remove registers from cranelift-codegen-meta 2021-09-29 16:27:47 +02:00
bjorn3
18bd27e90b Remove legalizer support from cranelift-codegen-meta 2021-09-29 16:27:45 +02:00
bjorn3
d499933612 Remove encoding generation from cranelift-codegen-meta 2021-09-29 16:23:58 +02:00
bjorn3
d8818c967e Fix all dead-code warnings in cranelift-codegen-meta 2021-09-29 16:23:58 +02:00
bjorn3
59e18b7d1b Remove the old riscv backend 2021-09-29 16:23:57 +02:00
bjorn3
9e34df33b9 Remove the old x86 backend 2021-09-29 16:13:46 +02:00
Alex Crichton
e989caf337 Update RELEASES.md (#3392)
Not exhaustive for the next release but figured I'd get a head-start.
2021-09-28 10:58:29 -05:00
Alex Crichton
1ee2af0098 Remove the lightbeam backend (#3390)
This commit removes the Lightbeam backend from Wasmtime as per [RFC 14].
This backend hasn't received maintenance in quite some time, and as [RFC
14] indicates this doesn't meet the threshold for keeping the code
in-tree, so this commit removes it.

A fast "baseline" compiler may still be added in the future. The
addition of such a backend should be in line with [RFC 14], though, with
the principles we now have for stable releases of Wasmtime. I'll close
out Lightbeam-related issues once this is merged.

[RFC 14]: https://github.com/bytecodealliance/rfcs/pull/14
2021-09-27 12:27:19 -05:00
Alex Crichton
98831fe4e2 Update zeroize_derive to fix a rustsec warning (#3389)
Should hopefully appease CI
2021-09-24 15:07:16 -05:00
Alex Crichton
bfdbd10a13 Add *_unchecked variants of Func APIs for the C API (#3350)
* Add `*_unchecked` variants of `Func` APIs for the C API

This commit is what is hopefully going to be my last installment within
the saga of optimizing function calls in/out of WebAssembly modules in
the C API. This is yet another alternative approach to #3345 (sorry) but
also contains everything necessary to make the C API fast. As in #3345
the general idea is just moving checks out of the call path in the same
style of `TypedFunc`.

This new strategy takes inspiration from previously learned attempts
effectively "just" exposes how we previously passed `*mut u128` through
trampolines for arguments/results. This storage format is formalized
through a new `ValRaw` union that is exposed from the `wasmtime` crate.
By doing this it made it relatively easy to expose two new APIs:

* `Func::new_unchecked`
* `Func::call_unchecked`

These are the same as their checked equivalents except that they're
`unsafe` and they work with `*mut ValRaw` rather than safe slices of
`Val`. Working with these eschews type checks and such and requires
callers/embedders to do the right thing.

These two new functions are then exposed via the C API with new
functions, enabling C to have a fast-path of calling/defining functions.
This fast path is akin to `Func::wrap` in Rust, although that API can't
be built in C due to C not having generics in the same way that Rust
has.

For some benchmarks, the benchmarks here are:

* `nop` - Call a wasm function from the host that does nothing and
  returns nothing.
* `i64` - Call a wasm function from the host, the wasm function calls a
  host function, and the host function returns an `i64` all the way out to
  the original caller.
* `many` - Call a wasm function from the host, the wasm calls
   host function with 5 `i32` parameters, and then an `i64` result is
   returned back to the original host
* `i64` host - just the overhead of the wasm calling the host, so the
  wasm calls the host function in a loop.
* `many` host - same as `i64` host, but calling the `many` host function.

All numbers in this table are in nanoseconds, and this is just one
measurement as well so there's bound to be some variation in the precise
numbers here.

| Name      | Rust | C (before) | C (after) |
|-----------|------|------------|-----------|
| nop       | 19   | 112        | 25        |
| i64       | 22   | 207        | 32        |
| many      | 27   | 189        | 34        |
| i64 host  | 2    | 38         | 5         |
| many host | 7    | 75         | 8         |

The main conclusion here is that the C API is significantly faster than
before when using the `*_unchecked` variants of APIs. The Rust
implementation is still the ceiling (or floor I guess?) for performance
The main reason that C is slower than Rust is that a little bit more has
to travel through memory where on the Rust side of things we can
monomorphize and inline a bit more to get rid of that. Overall though
the costs are way way down from where they were originally and I don't
plan on doing a whole lot more myself at this time. There's various
things we theoretically could do I've considered but implementation-wise
I think they'll be much more weighty.

* Tweak `wasmtime_externref_t` API comments
2021-09-24 14:05:45 -05:00
Chris Fallin
344a219245 Merge pull request #3383 from akirilov-arm/vany_true
Cranelift AArch64: Fix the VanyTrue implementation for 64-bit elements
2021-09-24 09:26:36 -07:00
Chris Fallin
26ef5128c4 Merge pull request #3387 from cfallin/cranelift-mtg-20211004
Cranelift: add agenda item for further discussion of ISLE in upcoming Oct 4 meeting.
2021-09-24 09:22:55 -07:00
Chris Fallin
1a778c2fe4 Cranelift: add agenda item for further discussion of ISLE in upcoming Oct 4 meeting. 2021-09-24 09:19:56 -07:00
Anton Kirilov
0fb3acfb94 Cranelift AArch64: Fix the VanyTrue implementation for 64-bit elements
Copyright (c) 2021, Arm Limited.
2021-09-23 20:39:46 +01:00
Chris Fallin
cf7ea71948 Merge pull request #3385 from akirilov-arm/fmaxmin_pseudo
Cranelift AArch64: Implement scalar FmaxPseudo and FminPseudo
2021-09-23 10:56:48 -07:00
Nick Fitzgerald
a2dbdfda1a Merge pull request #3386 from alexcrichton/allow-more-v8
Allow another trap mismatch with v8
2021-09-23 09:26:20 -07:00
Alex Crichton
476d0bee96 Allow another trap mismatch with v8
If Wasmtime thinks a module stack-overflows and v8 says that it does
something else that's ok. This means that the limits on v8 and Wasmtime
are different which is expected and not something we want fuzz-bugs
about.
2021-09-23 08:48:11 -07:00
Anton Kirilov
930b1f17f0 Cranelift AArch64: Implement scalar FmaxPseudo and FminPseudo
Copyright (c) 2021, Arm Limited.
2021-09-23 15:11:01 +01:00
Chris Fallin
144a0bfd83 Merge pull request #3382 from olivierlemasle/types-license
Add license file to wasmtime-types
2021-09-22 13:51:07 -07:00
Olivier Lemasle
b5e289d319 Add license file to wasmtime-types
The LICENSE file is missing in wasmtime-types crate.
As per the Apache 2.0 license, the license file itself
should be redistributed with the source code.
2021-09-22 22:08:35 +02:00
Nick Fitzgerald
2c75836553 Merge pull request #3381 from alexcrichton/more-disable-simd
Disable simd in the instantiate-swarm fuzz target
2021-09-22 10:07:46 -07:00
Chris Fallin
65fde3a86b Merge pull request #3380 from dheaton-arm/implement-iabs
Implement `Iabs` for the interpreter
2021-09-22 10:00:53 -07:00
Chris Fallin
b076c99af9 Merge pull request #3379 from dheaton-arm/implement-sqmulroundsat
Implement `SqmulRoundSat` for interpreter
2021-09-22 09:59:13 -07:00
Chris Fallin
dd7310df04 Merge pull request #3361 from dheaton-arm/implement-vecops
Implement `VhighBits` & `Vselect` for interpreter
2021-09-22 09:22:52 -07:00
Chris Fallin
76f9cfd79c Merge pull request #3354 from afonso360/interp-b
Add `bextend`,`breduce` and `bmask` to interpreter
2021-09-22 09:22:04 -07:00
Chris Fallin
3474965ca6 Merge pull request #3322 from sparker-arm/aarch64-lse-ops
AArch64 LSE atomic_rmw support
2021-09-22 09:21:28 -07:00
Alex Crichton
5cdaf3d085 Disable simd in the instantiate-swarm target
Something I forgot from #3376
2021-09-22 07:12:33 -07:00
dheaton-arm
faaf6b537a Prevent running tests on legacy backend.
Copyright (c) 2021, Arm Limited
2021-09-22 13:50:31 +01:00
dheaton-arm
539b1de5f4 Prevent test running on legacy backend.
Copyright (c) 2021, Arm Limited
2021-09-22 13:48:59 +01:00
dheaton-arm
cb30ecc7bc Implement Iabs for the interpreter
Implemented `Iabs` to return the absolute integer value with wrapping.

Copyright (c) 2021, Arm Limited
2021-09-22 12:59:30 +01:00
dheaton-arm
02ff19f2fc Implement SqmulRoundSat for interpreter
Implemented `SqmulRoundSat` for the Cranelift interpreter, performing
QN-format fixed point multiplication for 16 and 32-bit integers in
SIMD vectors.

Copyright (c) 2021, Arm Limited
2021-09-22 12:58:41 +01:00
dheaton-arm
63d85e1dc3 Prevent running simd-vhighbits.clif on legacy backend.
Copyright (c) 2021, Arm Limited.
2021-09-22 11:43:57 +01:00
dheaton-arm
335177a97e Remove legacy backend from test
Copyright (c) 2021, Arm Limited
2021-09-22 09:42:18 +01:00
Alex Crichton
1a5a2c7c5d Fix a merge conflict on main (#3378)
This commit fixes a "merge conflict" with #3319 being merged into
`main`, causing CI failures on merge.
2021-09-21 15:30:07 -05:00
Alex Crichton
bcf3544924 Optimize Func::call and its C API (#3319)
* Optimize `Func::call` and its C API

This commit is an alternative to #3298 which achieves effectively the
same goal of optimizing the `Func::call` API as well as its C API
sibling of `wasmtime_func_call`. The strategy taken here is different
than #3298 though where a new API isn't created, rather a small tweak to
an existing API is done. Specifically this commit handles the major
sources of slowness with `Func::call` with:

* Looking up the type of a function, to typecheck the arguments with and
  use to guide how the results should be loaded, no longer hits the
  rwlock in the `Engine` but instead each `Func` contains its own
  `FuncType`. This can be an unnecessary allocation for funcs not used
  with `Func::call`, so this is a downside of this implementation
  relative to #3298. A mitigating factor, though, is that instance
  exports are loaded lazily into the `Store` and in theory not too many
  funcs are active in the store as `Func` objects.

* Temporary storage is amortized with a long-lived `Vec` in the `Store`
  rather than allocating a new vector on each call. This is basically
  the same strategy as #3294 only applied to different types in
  different places. Specifically `wasmtime::Store` now retains a
  `Vec<u128>` for `Func::call`, and the C API retains a `Vec<Val>` for
  calling `Func::call`.

* Finally, an API breaking change is made to `Func::call` and its type
  signature (as well as `Func::call_async`). Instead of returning
  `Box<[Val]>` as it did before this function now takes a
  `results: &mut [Val]` parameter. This allows the caller to manage the
  allocation and we can amortize-remove it in `wasmtime_func_call` by
  using space after the parameters in the `Vec<Val>` we're passing in.
  This change is naturally a breaking change and we'll want to consider
  it carefully, but mitigating factors are that most embeddings are
  likely using `TypedFunc::call` instead and this signature taking a
  mutable slice better aligns with `Func::new` which receives a mutable
  slice for the results.

Overall this change, in the benchmark of "call a nop function from the C
API" is not quite as good as #3298. It's still a bit slower, on the
order of 15ns, because there's lots of capacity checks around vectors
and the type checks are slightly less optimized than before. Overall
though this is still significantly better than today because allocations
and the rwlock to acquire the type information are both avoided. I
personally feel that this change is the best to do because it has less
of an API impact than #3298.

* Rebase issues
2021-09-21 14:07:05 -05:00
Alex Crichton
38463d11ed Load generated trampolines into jitdump when profiling (#3344)
* Load generated trampolines into jitdump when profiling

This commit updates the jitdump profiler to generate JIT profiling
records for generated trampolines in a wasm module in addition to the
functions already in a module. It's also updated to learn about
trampolines generated via `Func::new` and friends. These trampolines
were all not previously registered meaning that stack traces with these
pc values would be confusing to see in the profile output. While the
names aren't the best it should at least be more clear than before if a
function is hot!

* Fix more builds
2021-09-21 13:05:31 -05:00
Afonso Bordado
9a95ce75f1 cranelift: Add bmask to interpreter 2021-09-21 18:43:53 +01:00
Afonso Bordado
3ee180420e cranelift: Add breduce tests to interpreter 2021-09-21 18:21:48 +01:00