wasmtime

Author	SHA1	Message	Date
bjorn3	9e34df33b9	Remove the old x86 backend	2021-09-29 16:13:46 +02:00
Alex Crichton	e989caf337	Update `RELEASES.md` (#3392) Not exhaustive for the next release but figured I'd get a head-start.	2021-09-28 10:58:29 -05:00
Alex Crichton	1ee2af0098	Remove the lightbeam backend (#3390) This commit removes the Lightbeam backend from Wasmtime as per [RFC 14]. This backend hasn't received maintenance in quite some time, and as [RFC 14] indicates this doesn't meet the threshold for keeping the code in-tree, so this commit removes it. A fast "baseline" compiler may still be added in the future. The addition of such a backend should be in line with [RFC 14], though, with the principles we now have for stable releases of Wasmtime. I'll close out Lightbeam-related issues once this is merged. [RFC 14]: https://github.com/bytecodealliance/rfcs/pull/14	2021-09-27 12:27:19 -05:00
Alex Crichton	98831fe4e2	Update zeroize_derive to fix a rustsec warning (#3389) Should hopefully appease CI	2021-09-24 15:07:16 -05:00
Alex Crichton	bfdbd10a13	Add `_unchecked` variants of `Func` APIs for the C API (#3350) * Add `_unchecked` variants of `Func` APIs for the C API This commit is what is hopefully going to be my last installment within the saga of optimizing function calls in/out of WebAssembly modules in the C API. This is yet another alternative approach to #3345 (sorry) but also contains everything necessary to make the C API fast. As in #3345 the general idea is just moving checks out of the call path in the same style of `TypedFunc`. This new strategy takes inspiration from previous attempts and effectively "just" exposes how we previously passed `*mut u128` through trampolines for arguments/results. This storage format is formalized through a new `ValRaw` union that is exposed from the `wasmtime` crate. By doing this it made it relatively easy to expose two new APIs: * `Func::new_unchecked` * `Func::call_unchecked` These are the same as their checked equivalents except that they're `unsafe` and they work with `*mut ValRaw` rather than safe slices of `Val`. Working with these eschews type checks and such and requires callers/embedders to do the right thing. These two new functions are then exposed via the C API with new functions, enabling C to have a fast-path of calling/defining functions. This fast path is akin to `Func::wrap` in Rust, although that API can't be built in C due to C not having generics in the same way that Rust has. For some benchmarks, the benchmarks here are: * `nop` - Call a wasm function from the host that does nothing and returns nothing. * `i64` - Call a wasm function from the host, the wasm function calls a host function, and the host function returns an `i64` all the way out to the original caller. * `many` - Call a wasm function from the host, the wasm calls host function with 5 `i32` parameters, and then an `i64` result is returned back to the original host * `i64` host - just the overhead of the wasm calling the host, so the wasm calls the host function in a loop. * `many` host - same as `i64` host, but calling the `many` host function. All numbers here are in nanoseconds, and this is just one measurement as well so there's bound to be some variation in the precise numbers here. Results as Name: Rust / C (before) / C (after): nop: 19 / 112 / 25; i64: 22 / 207 / 32; many: 27 / 189 / 34; i64 host: 2 / 38 / 5; many host: 7 / 75 / 8. The main conclusion here is that the C API is significantly faster than before when using the `_unchecked` variants of APIs. The Rust implementation is still the ceiling (or floor I guess?) for performance. The main reason that C is slower than Rust is that a little bit more has to travel through memory where on the Rust side of things we can monomorphize and inline a bit more to get rid of that. Overall though the costs are way way down from where they were originally and I don't plan on doing a whole lot more myself at this time. There are various things I've considered that we theoretically could do, but implementation-wise I think they'll be much more weighty. * Tweak `wasmtime_externref_t` API comments	2021-09-24 14:05:45 -05:00
Chris Fallin	344a219245	Merge pull request #3383 from akirilov-arm/vany_true Cranelift AArch64: Fix the VanyTrue implementation for 64-bit elements	2021-09-24 09:26:36 -07:00
Chris Fallin	26ef5128c4	Merge pull request #3387 from cfallin/cranelift-mtg-20211004 Cranelift: add agenda item for further discussion of ISLE in upcoming Oct 4 meeting.	2021-09-24 09:22:55 -07:00
Chris Fallin	1a778c2fe4	Cranelift: add agenda item for further discussion of ISLE in upcoming Oct 4 meeting.	2021-09-24 09:19:56 -07:00
Anton Kirilov	0fb3acfb94	Cranelift AArch64: Fix the VanyTrue implementation for 64-bit elements Copyright (c) 2021, Arm Limited.	2021-09-23 20:39:46 +01:00
Chris Fallin	cf7ea71948	Merge pull request #3385 from akirilov-arm/fmaxmin_pseudo Cranelift AArch64: Implement scalar FmaxPseudo and FminPseudo	2021-09-23 10:56:48 -07:00
Nick Fitzgerald	a2dbdfda1a	Merge pull request #3386 from alexcrichton/allow-more-v8 Allow another trap mismatch with v8	2021-09-23 09:26:20 -07:00
Alex Crichton	476d0bee96	Allow another trap mismatch with v8 If Wasmtime thinks a module stack-overflows and v8 says that it does something else that's ok. This means that the limits on v8 and Wasmtime are different which is expected and not something we want fuzz-bugs about.	2021-09-23 08:48:11 -07:00
Anton Kirilov	930b1f17f0	Cranelift AArch64: Implement scalar FmaxPseudo and FminPseudo Copyright (c) 2021, Arm Limited.	2021-09-23 15:11:01 +01:00
Chris Fallin	144a0bfd83	Merge pull request #3382 from olivierlemasle/types-license Add license file to wasmtime-types	2021-09-22 13:51:07 -07:00
Olivier Lemasle	b5e289d319	Add license file to wasmtime-types The LICENSE file is missing in wasmtime-types crate. As per the Apache 2.0 license, the license file itself should be redistributed with the source code.	2021-09-22 22:08:35 +02:00
Nick Fitzgerald	2c75836553	Merge pull request #3381 from alexcrichton/more-disable-simd Disable simd in the instantiate-swarm fuzz target	2021-09-22 10:07:46 -07:00
Chris Fallin	65fde3a86b	Merge pull request #3380 from dheaton-arm/implement-iabs Implement `Iabs` for the interpreter	2021-09-22 10:00:53 -07:00
Chris Fallin	b076c99af9	Merge pull request #3379 from dheaton-arm/implement-sqmulroundsat Implement `SqmulRoundSat` for interpreter	2021-09-22 09:59:13 -07:00
Chris Fallin	dd7310df04	Merge pull request #3361 from dheaton-arm/implement-vecops Implement `VhighBits` & `Vselect` for interpreter	2021-09-22 09:22:52 -07:00
Chris Fallin	76f9cfd79c	Merge pull request #3354 from afonso360/interp-b Add `bextend`,`breduce` and `bmask` to interpreter	2021-09-22 09:22:04 -07:00
Chris Fallin	3474965ca6	Merge pull request #3322 from sparker-arm/aarch64-lse-ops AArch64 LSE atomic_rmw support	2021-09-22 09:21:28 -07:00
Alex Crichton	5cdaf3d085	Disable simd in the instantiate-swarm target Something I forgot from #3376	2021-09-22 07:12:33 -07:00
dheaton-arm	faaf6b537a	Prevent running tests on legacy backend. Copyright (c) 2021, Arm Limited	2021-09-22 13:50:31 +01:00
dheaton-arm	539b1de5f4	Prevent test running on legacy backend. Copyright (c) 2021, Arm Limited	2021-09-22 13:48:59 +01:00
dheaton-arm	cb30ecc7bc	Implement `Iabs` for the interpreter Implemented `Iabs` to return the absolute integer value with wrapping. Copyright (c) 2021, Arm Limited	2021-09-22 12:59:30 +01:00
dheaton-arm	02ff19f2fc	Implement `SqmulRoundSat` for interpreter Implemented `SqmulRoundSat` for the Cranelift interpreter, performing QN-format fixed point multiplication for 16 and 32-bit integers in SIMD vectors. Copyright (c) 2021, Arm Limited	2021-09-22 12:58:41 +01:00
dheaton-arm	63d85e1dc3	Prevent running `simd-vhighbits.clif` on legacy backend. Copyright (c) 2021, Arm Limited.	2021-09-22 11:43:57 +01:00
dheaton-arm	335177a97e	Remove legacy backend from test Copyright (c) 2021, Arm Limited	2021-09-22 09:42:18 +01:00
Alex Crichton	1a5a2c7c5d	Fix a merge conflict on `main` (#3378) This commit fixes a "merge conflict" with #3319 being merged into `main`, causing CI failures on merge.	2021-09-21 15:30:07 -05:00
Alex Crichton	bcf3544924	Optimize `Func::call` and its C API (#3319) * Optimize `Func::call` and its C API This commit is an alternative to #3298 which achieves effectively the same goal of optimizing the `Func::call` API as well as its C API sibling of `wasmtime_func_call`. The strategy taken here is different than #3298 though where a new API isn't created, rather a small tweak to an existing API is done. Specifically this commit handles the major sources of slowness with `Func::call` with: * Looking up the type of a function, to typecheck the arguments with and use to guide how the results should be loaded, no longer hits the rwlock in the `Engine` but instead each `Func` contains its own `FuncType`. This can be an unnecessary allocation for funcs not used with `Func::call`, so this is a downside of this implementation relative to #3298. A mitigating factor, though, is that instance exports are loaded lazily into the `Store` and in theory not too many funcs are active in the store as `Func` objects. * Temporary storage is amortized with a long-lived `Vec` in the `Store` rather than allocating a new vector on each call. This is basically the same strategy as #3294 only applied to different types in different places. Specifically `wasmtime::Store` now retains a `Vec<u128>` for `Func::call`, and the C API retains a `Vec<Val>` for calling `Func::call`. * Finally, an API breaking change is made to `Func::call` and its type signature (as well as `Func::call_async`). Instead of returning `Box<[Val]>` as it did before this function now takes a `results: &mut [Val]` parameter. This allows the caller to manage the allocation and we can amortize-remove it in `wasmtime_func_call` by using space after the parameters in the `Vec<Val>` we're passing in. This change is naturally a breaking change and we'll want to consider it carefully, but mitigating factors are that most embeddings are likely using `TypedFunc::call` instead and this signature taking a mutable slice better aligns with `Func::new` which receives a mutable slice for the results. Overall this change, in the benchmark of "call a nop function from the C API" is not quite as good as #3298. It's still a bit slower, on the order of 15ns, because there's lots of capacity checks around vectors and the type checks are slightly less optimized than before. Overall though this is still significantly better than today because allocations and the rwlock to acquire the type information are both avoided. I personally feel that this change is the best to do because it has less of an API impact than #3298. * Rebase issues	2021-09-21 14:07:05 -05:00
Alex Crichton	38463d11ed	Load generated trampolines into jitdump when profiling (#3344) * Load generated trampolines into jitdump when profiling This commit updates the jitdump profiler to generate JIT profiling records for generated trampolines in a wasm module in addition to the functions already in a module. It's also updated to learn about trampolines generated via `Func::new` and friends. These trampolines were all not previously registered meaning that stack traces with these pc values would be confusing to see in the profile output. While the names aren't the best it should at least be more clear than before if a function is hot! * Fix more builds	2021-09-21 13:05:31 -05:00
Afonso Bordado	9a95ce75f1	cranelift: Add `bmask` to interpreter	2021-09-21 18:43:53 +01:00
Afonso Bordado	3ee180420e	cranelift: Add `breduce` tests to interpreter	2021-09-21 18:21:48 +01:00
Afonso Bordado	c7d595ae46	cranelift: Add `bextend` tests to interpreter	2021-09-21 18:21:48 +01:00
Chris Fallin	38728c5746	Merge pull request #3362 from dheaton-arm/implement-unarrow Implement `Unarrow`, `Uunarrow`, and `Snarrow` for the interpreter	2021-09-21 10:06:46 -07:00
Chris Fallin	e0bd4bd007	Merge pull request #3363 from dheaton-arm/implement-widening-pairwise-dotprod Implement `WideningPairwiseDotProductS` for interpreter	2021-09-21 10:05:07 -07:00
Chris Fallin	ebe2af6eaa	Merge pull request #3351 from afonso360/parser-i128 cranelift: Add support for parsing i128 data values	2021-09-21 10:04:27 -07:00
Alex Crichton	fc6328ae06	Temporarily disable SIMD fuzzing on CI (#3376) We've got a large crop of fuzz-bugs from fuzzing with SIMD enabled on oss-fuzz but at this point the fuzz stats from oss-fuzz say that the fuzzers like v8 are spending less than 50% of their time actually fuzzing and presumably mostly hitting crashes and such. While we fix the other issues this disables simd for fuzzing with v8 so we can try to see if we can weed out other issues.	2021-09-20 14:17:19 -05:00
Ulrich Weigand	3735453afa	Add s390x build workflow (#3375)	2021-09-20 12:42:26 -05:00
Alex Crichton	5d3012d8f0	Cranelift 9/20 meeting notes (#3374) * Cranelift 9/20 meeting notes * Update cranelift-09-20.md	2021-09-20 12:34:27 -05:00
Advance Software	a8467d0824	Exports symbols to be shared with external GDB/JIT debugging interfac… (#3373) * Exports symbols to be shared with external GDB/JIT debugging interface tools. Windows O/S specific requirement. * Moved comments into platform specific compiler directive sections.	2021-09-20 12:33:20 -05:00
Chris Fallin	f958c01444	Merge pull request #3372 from uweigand/s390x-ci Add QEMU CI runner for the s390x architecture	2021-09-20 09:14:34 -07:00
Ulrich Weigand	7c5acfa96c	Add QEMU CI runner for the s390x architecture * Add QEMU CI runner for s390x * Disable lightbeam tests for s390x	2021-09-20 17:19:04 +02:00
Dan Gohman	87ff24a4aa	Use `__builtin_setjmp` instead of `sigsetjmp`. (#3360) * Use `__builtin_setjmp` instead of `sigsetjmp`. Use [`__builtin_setjmp`] instead of `sigsetjmp`, as it is implemented in the compiler, performed inline, and saves much less state. This speeds up calls into wasm by about 8% on my machine. [`__builtin_setjmp`]: https://gcc.gnu.org/onlinedocs/gcc/Nonlocal-Gotos.html * Add a comment confirming that 5 really is the documented size. * Add a comment about callee-saved state and __builtin_setjmp. * On clang on aarch64, use sigsetjmp. * Fix a stray `#endif`.	2021-09-20 09:14:52 -05:00
Ulrich Weigand	51131a3acc	Fix s390x regressions (#3330) - Add relocation handling needed after PR #3275 - Fix incorrect handling of signed constants detected by PR #3056 test - Fix LabelUse max pos/neg ranges; fix overflow in buffers.rs - Disable fuzzing tests that require pre-built v8 binaries - Disable cranelift test that depends on i128 - Temporarily disable memory64 tests	2021-09-20 09:12:36 -05:00
Denis	9eae88a97a	Added link to C++ Conan package (#3368)	2021-09-20 09:11:42 -05:00
dheaton-arm	8abb19cbd8	Generate `new_vec` using an iterator chain Copyright (c) 2021, Arm Limited	2021-09-20 10:31:34 +01:00
dheaton-arm	3fc29f5f6c	Return `u128` from `bounds`; form `new_vec` from iter chain Copyright (c) 2021, Arm Limited	2021-09-20 09:57:19 +01:00
Afonso Bordado	3a4ebd7727	cranelift: Deduplicate match_imm functions Transforming this into a generic function is proving to be a challenge since most of the necessary methods are not in a trait. We also need to cast between the signed and unsigned types, which is difficult to do in a generic function. This can be solved for example by adding the num crate as a dependency. But adding a dependency just to solve this issue seems a bit much.	2021-09-19 15:03:46 +01:00
Afonso Bordado	eae1b2d246	cranelift: Update i128 tests to use i128 values in functions	2021-09-19 15:02:06 +01:00
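
Commits cb30ecc7bc and 02ff19f2fc in the log above describe `Iabs` as an absolute value "with wrapping" and `SqmulRoundSat` as QN-format fixed-point multiplication. The following is a minimal scalar sketch of those semantics, not the interpreter's actual code. The function names are hypothetical, and the rounding formula `(a * b + 2^14) >> 15` is an assumption based on the usual Q15 definition for 16-bit lanes.

```rust
/// Wrapping absolute value (hypothetical helper name).
/// `i32::MIN` has no positive counterpart in two's complement,
/// so it wraps back to `i32::MIN` instead of overflowing.
fn iabs_wrapping(x: i32) -> i32 {
    x.wrapping_abs()
}

/// Q15 rounding, saturating multiply for a single 16-bit lane
/// (hypothetical helper name). Assumed formula:
/// (a * b + 2^14) >> 15, clamped to the `i16` range.
fn sqmul_round_sat_i16(a: i16, b: i16) -> i16 {
    // Widen first so the product cannot overflow.
    let product = i32::from(a) * i32::from(b);
    // Add half of the divisor to round, then drop the 15 fraction bits.
    let rounded = (product + (1 << 14)) >> 15;
    // Only MIN * MIN exceeds the range; clamp handles it.
    rounded.clamp(i32::from(i16::MIN), i32::from(i16::MAX)) as i16
}

fn main() {
    println!("{}", iabs_wrapping(-5));
    println!("{}", iabs_wrapping(i32::MIN));
    // 0.5 * 0.5 in Q15 (16384 represents 0.5)
    println!("{}", sqmul_round_sat_i16(16384, 16384));
    // -1.0 * -1.0 would be +1.0, which Q15 cannot represent
    println!("{}", sqmul_round_sat_i16(i16::MIN, i16::MIN));
}
```

A SIMD version would apply the same per-lane function to each element of the vector; the 32-bit variant mentioned in the commit would widen to `i64` and shift by 31 under the same assumption.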

8929 Commits
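
Commit bcf3544924 in the log above changes `Func::call` from returning `Box<[Val]>` to taking a caller-provided `results: &mut [Val]`, so that one buffer can be reused across calls. The sketch below illustrates only that calling convention. `Val`, `call_into`, and the summing behaviour are hypothetical stand-ins and do not use the `wasmtime` crate or its real signatures.

```rust
/// Hypothetical stand-in for a wasm value; not `wasmtime::Val`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Val {
    I32(i32),
    I64(i64),
}

/// Hypothetical callee in the "results slice" style: the caller owns
/// the output storage, so no allocation happens per call.
/// For illustration it sums its parameters into one `I64` result.
fn call_into(params: &[Val], results: &mut [Val]) -> Result<(), String> {
    // Check the result arity up front, as a checked API would.
    if results.len() != 1 {
        return Err(format!("expected 1 result slot, got {}", results.len()));
    }
    let mut sum = 0i64;
    for p in params {
        sum = match *p {
            Val::I32(v) => sum.wrapping_add(i64::from(v)),
            Val::I64(v) => sum.wrapping_add(v),
        };
    }
    results[0] = Val::I64(sum);
    Ok(())
}

fn main() {
    // Allocate the results buffer once and reuse it for every call.
    let mut results = vec![Val::I64(0)];
    for i in 0..3 {
        call_into(&[Val::I32(i), Val::I64(10)], &mut results).unwrap();
        println!("{:?}", results[0]);
    }
}
```

The trade-off matches the one the commit message names: the signature is less convenient than returning a fresh boxed slice, but the caller decides when (and whether) to allocate.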