wasmtime

Author	SHA1	Message	Date
Damian Heaton	e463890f26	Port `AvgRound` & `SqmulRoundSat` to ISLE (AArch64) (#4639 ) Ported the existing implementations of the following opcodes on AArch64 to ISLE: - `AvgRound` - Also introduced support for `i64x2` vectors, as per the docs. - `SqmulRoundSat` Copyright (c) 2022 Arm Limited	2022-08-08 11:35:43 -07:00
Nick Fitzgerald	95e72db458	Some little Cranelift logging things (#4624 ) * Cranelift: Don't print "skipped TEST can't run aarch64" on x64, etc It's way too noisy. Move it to the logs. * Cranelift: Enable Cranelift trace logs in `clif-util` by default * cranelift-filetest: use `log::warn!` for warnings Instead of `println!` * rustfmt	2022-08-05 13:25:24 -07:00
Damian Heaton	eb332b8369	Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64) (#4608 ) * Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64) Ported the existing implementations of the following opcodes to ISLE on AArch64: - `fma` - Introduced missing support for `fma` on vector values, as per the docs. - `valltrue` - `vanytrue` Also fixed `fcmp` on scalar values in the interpreter, and enabled interpreter tests in `simd-fma.clif`. This introduces the `FMLA` machine instruction. Copyright (c) 2022 Arm Limited * Add comments for `Fmla` and `Bsl` Copyright (c) 2022 Arm Limited	2022-08-05 09:47:56 -07:00
wasmtime-publish	412fa04911	Bump Wasmtime to 0.41.0 (#4620 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2022-08-04 20:02:19 -05:00
Ulrich Weigand	f552a53654	s390x: Implement bitrev (#4617 ) Since we do not have an instruction for this, this is a simple open-coded implementation. Needed by the cg_clif frontend.	2022-08-04 16:24:55 -07:00
Trevor Elliott	cd847d071d	x64: Migrate br_table to ISLE (#4615 ) https://github.com/bytecodealliance/wasmtime/pull/4615	2022-08-04 22:12:37 +00:00
Ulrich Weigand	b17b1eb25d	[s390x, abi_impl] Add i128 support (#4598 ) This adds full i128 support to the s390x target, including new filetests and enabling the existing i128 runtest on s390x. The ABI requires that i128 is passed and returned via implicit pointer, but the front end still generates direct i128 types in call. This means we have to implement ABI support to implicitly convert i128 types to pointers when passing arguments. To do so, we add a new variant ABIArg::ImplicitArg. This acts like StructArg, except that the value type is the actual target type, not a pointer type. The required conversions have to be inserted in the prologue and at function call sites. Note that when dereferencing the implicit pointer in the prologue, we may require a temp register: the pointer may be passed on the stack so it needs to be loaded first, but the value register may be in the wrong class for pointer values. In this case, we use the "stack limit" register, which should be available at this point in the prologue. For return values, we use a mechanism similar to the one used for supporting multiple return values in the Wasmtime ABI. The only difference is that the hidden pointer to the return buffer must be the first, not last, argument in this case. (This implements the second half of issue #4565.)	2022-08-04 20:41:26 +00:00
Trevor Elliott	1fc11bbe51	x64: Migrate brff and I128 branching instructions to ISLE (#4599 ) https://github.com/bytecodealliance/wasmtime/pull/4599	2022-08-04 08:58:50 -07:00
Ulrich Weigand	b9dd48e34b	[s390x, abi_impl] Support struct args using explicit pointers (#4585 ) This adds support for StructArgument on s390x. The ABI for this platform requires that the address of the buffer holding the copy of the struct argument is passed from caller to callee as hidden pointer, using a register or overflow stack slot. To implement this, I've added an optional "pointer" field to ABIArg::StructArg, and code to handle the pointer both in common abi_impl code and the s390x back-end. One notable change necessary to make this work involved the "copy_to_arg_order" mechanism. Currently, for struct args we only need to copy the data (and that needs to happen before setting up any other args), while for non-struct args we only need to set up the appropriate registers or stack slots. This order is ensured by sorting the arguments appropriately into a "copy_to_arg_order" list. However, for struct args with explicit pointers we need to both copy the data (again, before everything else), and set up a register or stack slot. Since we now need to touch the argument twice, we cannot solve the ordering problem by a simple sort. Instead, the abi_impl common code now provides two callbacks, emit_copy_regs_to_buffer and emit_copy_regs_to_arg, and expects the back end to first call copy..to_buffer for all args, and then call copy.._to_arg for all args. This required updates to all back ends. In the s390x back end, in addition to the new ABI code, I'm now adding code to actually copy the struct data, using the MVC instruction (for small buffers) or a memcpy libcall (for larger buffers). This also requires a bit of new infrastructure: - MVC is the first memory-to-memory instruction we use, which needed a bit of memory argument tweaking - We also need to set up the infrastructure to emit libcalls. (This implements the first half of issue #4565.)	2022-08-03 19:00:07 +00:00
Anton Kirilov	a897742593	Initial back-edge CFI implementation (#3606 ) Give the user the option to sign and to authenticate function return addresses with the operations introduced by the Pointer Authentication extension to the Arm instruction set architecture. Copyright (c) 2021, Arm Limited.	2022-08-03 11:08:29 -07:00
Nick Fitzgerald	42bba452a6	Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573 ) * Cranelift: Add instructions for getting the current stack/frame pointers and return address This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535 * x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead We just special case getting operands from `Amode`s now. * Fix s390x `get_return_address`; require `preserve_frame_pointers=true` * Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp * Handle non-allocatable registers in Amode::with_allocs * Use "stack" instead of "r15" on s390x * r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`	2022-08-02 14:37:17 -07:00
Ulrich Weigand	6b4e6523f7	[abi_impl] Respect extension for incoming stack arguments (#4576 ) The gen_copy_arg_to_regs routine currently ignores argument extension flags when loading incoming arguments. This causes a problem with stack arguments on big-endian systems, since the argument address points to the word on the stack as extended by the caller, but the generated code only loads the inner type from the address, causing it to receive an incorrect value. (This happens to work on little-endian systems.) Fixed by loading extended arguments as full words.	2022-08-02 13:54:13 -07:00
Chris Fallin	8dddd6f1f7	Cranelift: Remove `ifcmp_sp` opcode. (#4578 ) This was temporarily added back in #3502 due to a need from Lucet; now that Lucet is EOL, the opcode is no longer needed and we can remove it.	2022-08-02 13:15:39 -07:00
Chris Fallin	43f1765272	Cranelift: remove Baldrdash support and related features. (#4571 ) * Cranelift: remove Baldrdash support and related features. As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team has recently determined that their current form of integration with Cranelift is too hard to maintain, and they have chosen to remove it from their codebase. If and when they decide to build updated support for Cranelift, they will adopt different approaches to several details of the integration. In the meantime, after discussion with the SpiderMonkey folks, they agree that it makes sense to remove the bits of Cranelift that exist to support the integration ("Baldrdash"), as they will not need them. Many of these bits are difficult-to-maintain special cases that are not actually tested in Cranelift proper: for example, the Baldrdash integration required Cranelift to emit function bodies without prologues/epilogues, and instead communicate very precise information about the expected frame size and layout, then stitched together something post-facto. This was brittle and caused a lot of incidental complexity ("fallthrough returns", the resulting special logic in block-ordering); this is just one example. As another example, one particular Baldrdash ABI variant processed stack args in reverse order, so our ABI code had to support both traversal orders. We had a number of other Baldrdash-specific settings as well that did various special things. This PR removes Baldrdash ABI support, the `fallthrough_return` instruction, and pulls some threads to remove now-unused bits as a result of those two, with the understanding that the SpiderMonkey folks will build new functionality as needed in the future and we can perhaps find cleaner abstractions to make it all work. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425 * Review feedback. * Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations. The debugger tests invoke `wasmtime` from within each test case under the control of a debugger (gdb or lldb). Some of these tests started to inexplicably fail in CI with unrelated changes, and the failures were only inconsistently reproducible locally. It seems to be cache related: if we disable cached compilation on the nested `wasmtime` invocations, the tests consistently pass. * Review feedback.	2022-08-02 19:37:56 +00:00
Benjamin Bouvier	ff37c9d8a4	[cranelift] Rejigger the `compile` API (#4540 ) * Move `emit_to_memory` to `MachCompileResult` This small refactoring makes it clearer to me that emitting to memory doesn't require anything else from the compilation `Context`. While it's a trivial change, it's a small public API change that shouldn't cause too much trouble, and doesn't seem RFC-worthy. Happy to hear different opinions about this, though! * hide the MachCompileResult behind a method * Add a `CompileError` wrapper type that references a `Function` * Rename MachCompileResult to CompiledCode * Additionally remove the last unsafe API in cranelift-codegen	2022-08-02 12:05:40 -07:00
Sam Parker	37cd96beff	[AArch64] i64x2 support for min/max (#4575 ) Also added interpreter support for vector min/max. Copyright (c) 2022, Arm Limited.	2022-08-02 11:42:05 -07:00
Trevor Elliott	25782b527e	x64: Migrate trapif and trapff to ISLE (#4545 ) https://github.com/bytecodealliance/wasmtime/pull/4545	2022-08-01 11:24:11 -07:00
Anton Kirilov	a47a82d2e5	Cranelift AArch64: Harden the Spectre mitigations (#4555 ) Use the `CSDB` instruction following Arm's recommendation. Copyright (c) 2022, Arm Limited.	2022-08-01 10:20:48 -07:00
Afonso Bordado	1f058a02c0	cranelift: Add MinGW `fma` regression tests (#4517 ) * cranelift: Add MinGW `fma` regression tests * cranelift: Fix FMA in interpreter * cranelift: Add separate `fma` test suite for the interpreter The interpreter can run `fma.clif` on most platforms, however on `x86_64-pc-windows-gnu` we use libm which has issues with some inputs. We should delete `fma-interpreter.clif` and enable the interpreter on the main `fma.clif` file once those are fixed.	2022-07-29 09:09:37 -05:00
Damian Heaton	5e3bb588a8	Port `Fence`, `IsNull`/`IsInvalid` & `Debugtrap` to ISLE (AArch64) (#4548 ) Ported the existing implementation of the following Opcodes for AArch64 to ISLE: - `Fence` - `IsNull` - `IsInvalid` - `Debugtrap` Copyright (c) 2022 Arm Limited	2022-07-28 15:36:13 -07:00
Trevor Elliott	29d4edc76b	x64: Migrate call and call_indirect to ISLE (#4542 ) https://github.com/bytecodealliance/wasmtime/pull/4542	2022-07-28 13:10:03 -07:00
Afonso Bordado	0508932174	cranelift: Align Scalar and SIMD shift semantics (#4520 ) * cranelift: Reorganize test suite Group some SIMD operations by instruction. * cranelift: Deduplicate some shift tests Also, new tests with the mod behaviour * aarch64: Lower shifts with mod behaviour * x64: Lower shifts with mod behaviour * wasmtime: Don't mask SIMD shifts	2022-07-27 17:54:00 +00:00
Afonso Bordado	e121c209fc	cranelift: Fix `urem`/`srem` in interpreter (#4532 )	2022-07-27 10:47:08 -07:00
Anton Kirilov	ead6edb0c5	Cranelift AArch64: Migrate Splat to ISLE (#4521 ) Copyright (c) 2022, Arm Limited.	2022-07-26 17:57:15 +00:00
Anton Kirilov	d041c4b376	Cranelift AArch64: Further integral constant fixes (#4530 ) Copyright (c) 2022, Arm Limited.	2022-07-26 09:35:06 -07:00
Afonso Bordado	02c3b47db2	x64: Implement SIMD `fma` (#4474 ) * x64: Add VEX Instruction Encoder This uses a similar builder pattern to the EVEX Encoder. Does not yet support memory accesses. * x64: Add FMA Flag * x64: Implement SIMD `fma` * x64: Use 4 register Vex Inst * x64: Reorder VEX pretty print args	2022-07-25 22:01:02 +00:00
Damian Heaton	3ef89b7787	Allow 64-bit vectors and implement for interpreter (#4509 ) * Allow 64-bit vectors and implement for interpreter The AArch64 backend already supports 64-bit vectors; this simply allows instructions to make use of that. Implemented support for 64-bit vectors within the interpreter to allow interpret runtests to use them. Copyright (c) 2022 Arm Limited * Disable 64-bit SIMD `iaddpairwise` tests on s390x Copyright (c) 2022 Arm Limited	2022-07-25 13:00:43 -07:00
Sam Parker	c5ddb4b803	[AArch64] Port SIMD narrowing to ISLE (#4478 ) * [AArch64] Port SIMD narrowing to ISLE Fvdemote, snarrow, unarrow and uunarrow. Also refactor the aarch64 instruction descriptions to parameterize on ScalarSize instead of using different opcodes. The zero_value pure constructor has been introduced and used by the integer narrow operations and it replaces, and extends, the compare zero patterns. Copyright (c) 2022, Arm Limited. * use short 'if' patterns	2022-07-25 12:40:36 -07:00
Ulrich Weigand	dd40bf075a	s390x: Enable more runtests, and fix a few bugs (#4516 ) This enables more runtests to be executed on s390x. Doing so uncovered two back-end bugs, which are fixed as well: - The result of cls was always off by one. - The result of popcnt.i16 has uninitialized high bits. In addition, I found a bug in the load-op-store.clif test case: v3 = heap_addr.i64 heap0, v1, 4 v4 = iconst.i64 42 store.i32 v4, v3 This was clearly intended to perform a 32-bit store, but actually performs a 64-bit store (it seems the type annotation of the store opcode is ignored, and the type of the operand is used instead). That bug did not show any noticeable symptoms on little-endian architectures, but broke on big-endian.	2022-07-25 12:37:06 -07:00
Trevor Elliott	ee7e4f4c6b	x64: Port func_addr and symbol_value to ISLE (#4485 ) https://github.com/bytecodealliance/wasmtime/pull/4485	2022-07-25 11:11:16 -07:00
Afonso Bordado	446efd3e11	cranelift: Fix `icmp_imm` for small types in interpreter (#4506 )	2022-07-23 00:26:56 +00:00
Afonso Bordado	af62037f62	cranelift: Restrict `br_table` to `i32` indices (#4510 ) * cranelift: Restrict `br_table` to `i32` indices In #4498 it was proposed that we should only accept `i32` indices to `br_table`. The rationale for this is that larger types lead the users to a false sense of flexibility (since we don't support jump tables larger than u32's), and narrower types are not well tested paths that would be safer if we removed them. * cranelift: Reduce directly from i128 to i32 in Switch	2022-07-22 23:32:40 +00:00
Damian Heaton	f1a0c40a53	Convert `sqrt`..`nearest` to ISLE (AArch64) (#4508 ) Converted the existing implementations for the following opcodes to ISLE on AArch64: - `sqrt` - `fneg` - `fabs` - `fpromote` - `fdemote` - `ceil` - `floor` - `trunc` - `nearest` Copyright (c) 2022 Arm Limited	2022-07-22 14:48:07 -07:00
Afonso Bordado	d89c262657	cranelift: Implement `{u,s}extend.i128` in interpreter (#4505 )	2022-07-22 10:47:10 -07:00
Ulrich Weigand	fd639dd044	s390x: Support preserve_frame_pointers flag (#4477 ) On s390x, we do not have a frame pointer that can be used to chain stack frames for easy unwinding. Instead, our ABI defines a stack "backchain" mechanism that can be used to the same effect. This PR uses that backchain mechanism to implement the new preserve_frame_pointers flags introduced here: https://github.com/bytecodealliance/wasmtime/pull/4469	2022-07-21 10:09:16 -07:00
Anton Kirilov	2ba4bce5cc	Merge pull request from GHSA-7f6x-jwh5-m9r4 Copyright (c) 2022, Arm Limited.	2022-07-20 11:53:56 -05:00
Nick Fitzgerald	22d91a7c84	cranelift: Add a flag for preserving frame pointers (#4469 ) Preserving frame pointers -- even inside leaf functions -- makes it easy to capture the stack of a running program, without requiring any side tables or metadata (like `.eh_frame` sections). Many sampling profilers and similar tools walk frame pointers to capture stacks. Enabling this option will play nice with those tools.	2022-07-20 08:02:21 -07:00
Damian Heaton	00ac18c866	Convert `fadd`..`fmax_pseudo` to ISLE (AArch64) (#4452 ) Converted the existing implementations for the following Opcodes to ISLE on AArch64: - `fadd` - `fsub` - `fmul` - `fdiv` - `fmin` - `fmax` - `fmin_pseudo` - `fmax_pseudo` Copyright (c) 2022 Arm Limited	2022-07-19 12:03:05 -07:00
Trevor Elliott	b519c975cb	x64: Port fdemote and fvdemote to ISLE (#4449 ) https://github.com/bytecodealliance/wasmtime/pull/4449	2022-07-18 14:26:23 -07:00
Ulrich Weigand	638dc4e0b3	s390x: Implement full SIMD support (#4427 ) This adds full support for all Cranelift SIMD instructions to the s390x target. Everything is matched fully via ISLE. In addition to adding support for many new instructions, and the lower.isle code to match all SIMD IR patterns, this patch also adds ABI support for vector types. In particular, we now need to handle the fact that vector registers 8 .. 15 are partially callee-saved, i.e. the high parts of those registers (which correspond to the old floating-point registers) are callee-saved, but the low parts are not. This is the exact same situation that we already have on AArch64, and so this patch uses the same solution (the is_included_in_clobbers callback). The bulk of the changes are platform-specific, but there are a few exceptions: - Added ISLE extractors for the Immediate and Constant types, to enable matching the vconst and swizzle instructions. - Added a missing accessor for call_conv to ABISig. - Fixed endian conversion for vector types in data_value.rs to enable their use in runtests on the big-endian platforms. - Enabled (nearly) all SIMD runtests on s390x. [ Two test cases remain disabled due to vector shift count semantics, see below. ] - Enabled all Wasmtime SIMD tests on s390x. There are three minor issues, called out via FIXMEs below, which should be addressed in the future, but should not be blockers to getting this patch merged. I've opened the following issues to track them: - Vector shift count semantics https://github.com/bytecodealliance/wasmtime/issues/4424 - is_included_in_clobbers vs. link register https://github.com/bytecodealliance/wasmtime/issues/4425 - gen_constant callback https://github.com/bytecodealliance/wasmtime/issues/4426 All tests, including all newly enabled SIMD tests, pass on both z14 and z15 architectures.	2022-07-18 14:00:48 -07:00
Damian Heaton	d792646677	Implement `iabs` in ISLE (AArch64) (#4399 ) * Implement `iabs` in ISLE (AArch64) Converts the existing implementation of `iabs` for AArch64 into ISLE, and fixes support for `iabs` on scalar values. Copyright (c) 2022 Arm Limited. * Improve scalar `iabs` implementation. Also introduces `CSNeg` instruction. Copyright (c) 2022 Arm Limited	2022-07-18 11:12:34 -07:00
Damian Heaton	db7f9ccd2b	Convert `scalar_to_vector` to ISLE (AArch64) (#4401 ) * Convert `scalar_to_vector` to ISLE (AArch64) Converted the existing implementation of `scalar_to_vector` for AArch64 to ISLE. Copyright (c) 2022 Arm Limited * Add support for floats and fix FpuExtend - Added rules to cover `f32 -> f32x4` and `f64 -> f64x2` for `scalar_to_vector` - Added tests for `scalar_to_vector` on floats. - Corrected an invalid instruction emitted by `FpuExtend` on 64-bit values. Copyright (c) 2022 Arm Limited	2022-07-18 11:11:54 -07:00
Afonso Bordado	eca0a73453	cranelift: Use requested ISA Flags in run tests (#4450 )	2022-07-15 12:09:07 -07:00
Afonso Bordado	80976b6fc7	cranelift: Add `fadd`/`fsub`/`fmul`/`fdiv` to interpreter (#4446 ) Fuzzgen found these as soon as I added float support	2022-07-14 21:53:03 +00:00
Afonso Bordado	4ea46c3ca8	cranelift: Implement `table_addr` in interpreter (#4433 )	2022-07-13 12:53:42 -07:00
Damian Heaton	6c70428735	Convert `isplit` / `iconcat` to ISLE (AArch64) (#4402 ) Converted the existing implementations for `isplit` and `iconcat` for AArch64 to ISLE. Copyright (c) 2022 Arm Limited	2022-07-08 17:12:42 -07:00
Afonso Bordado	16cb287c53	cranelift: Use `round_ties_even` for `nearest` in interpreter (#4413 ) As @MaxGraey pointed out (thanks!) in #4397, `round` has different behavior from `nearest`. And it looks like the native rust implementation is still pending stabilization. Right now we duplicate the wasmtime implementation, merged in #2171. However, we definitely should switch to the rust native version when it is available.	2022-07-07 16:36:43 -07:00
Sam Parker	9c43749dfe	[RFC] Dynamic Vector Support (#4200 ) Introduce a new concept in the IR that allows a producer to create dynamic vector types. An IR function can now contain global value(s) that represent a dynamic scaling factor, for a given fixed-width vector type. A dynamic type is then created by 'multiplying' the corresponding global value with a fixed-width type. These new types can be used just like the existing types and the type system has a set of hard-coded dynamic types, such as I32X4XN, which the user defined types map onto. The dynamic types are also used explicitly to create dynamic stack slots, which have no set size like their existing counterparts. New IR instructions are added to access these new stack entities. Currently, during codegen, the dynamic scaling factor has to be lowered to a constant so the dynamic slots do eventually have a compile-time known size, as do spill slots. The current lowering for aarch64 just targets Neon, using a dynamic scale of 1. Copyright (c) 2022, Arm Limited.	2022-07-07 12:54:39 -07:00
Afonso Bordado	e9727b9d4b	aarch64: Fix i128 `of`/`nof` implementations (#4403 ) @yuyang-ok reported via zulip that i128 overflow tests were: 1. different from the interpreter implementation 2. wrong on some of the test cases This fixes both the tests and the aarch64 implementation and adds the interpreter to the testsuite.	2022-07-07 11:00:58 -07:00
Afonso Bordado	f98076ae88	cranelift: Implement float rounding operations (#4397 ) Implements the following operations on the interpreter: * `ceil` * `floor` * `nearest` * `trunc`	2022-07-06 16:43:54 -07:00

1387 Commits