wasmtime

Author	SHA1	Message	Date
Jamey Sharp	9856664f1f	Make DataValue, not Ieee32/64, respect IEEE754 (#4860 ) * cranelift-codegen: Remove all uses of DataValue This type is only used by the interpreter, cranelift-fuzzgen, and filetests. I haven't found another convenient crate for those to all depend on where this type can live instead, but this small refactor at least makes it obvious that code generation does not in any way depend on the implementation of this type. * Make DataValue, not Ieee32/64, respect IEEE754 This fixes #4857 by partially reverting #4849. It turns out that Ieee32 and Ieee64 need bitwise equality semantics so they can be used as hash-table keys. Moving the IEEE754 semantics up a layer to DataValue makes sense in conjunction with #4855, where we introduced a DataValue::bitwise_eq alternative implementation of equality for those cases where users of DataValue still want the bitwise equality semantics. * cranelift-interpreter: Use eq/ord from DataValue This fixes #4828, again, now that the comparison operators on DataValue have the right IEEE754 semantics. * Add regression test from issue #4857	2022-09-03 00:26:14 +00:00
Afonso Bordado	f30a7eb0c9	cranelift: Implement PartialEq on Ieee{32,64} (#4849 ) * cranelift: Add `fcmp` tests Some of these are disabled on aarch64 due to not being implemented yet. * cranelift: Implement float PartialEq for Ieee{32,64} (fixes #4828) Previously `PartialEq` was auto derived. This means that it was implemented in terms of PartialEq in a u32. This is not correct for floats because `NaN != NaN`. PartialOrd was manually implemented in `6d50099816`, but it seems like it was an oversight to leave PartialEq out until now. The test suite depends on the previous behaviour so we adjust it to keep comparing bits instead of floats. * cranelift: Disable `fcmp ord` tests on aarch64 * cranelift: Disable `fcmp ueq` tests on aarch64	2022-09-02 10:42:42 -07:00
Jamey Sharp	84ac24c23d	cranelift: Remove const_addr instruction (fixes #2398 ) (#4843 )	2022-09-01 21:57:37 +00:00
Afonso Bordado	3ce3eeb668	cranelift: Register all functions in test file for interpreter (#4817 ) * cranelift: Implement `bnot` in interpreter * cranelift: Register all functions in test file for interpreter * cranelift: Relax signature checking for bools and vectors	2022-08-30 15:45:21 -07:00
Chris Fallin	2b4b257834	Revert "cranelift: Register all functions in test file for interpreter (#4800 )" (#4810 ) This reverts commit `500a9f17be`.	2022-08-30 01:15:11 +00:00
Afonso Bordado	500a9f17be	cranelift: Register all functions in test file for interpreter (#4800 ) * cranelift: Implement `bnot` in interpreter * cranelift: Register all functions in test file for interpreter	2022-08-29 23:39:50 +00:00
Afonso Bordado	9a8bd5be02	cranelift: Add LibCalls to the interpreter (#4782 ) * cranelift: Add libcall handlers to interpreter * cranelift: Fuzz IshlI64 libcall * cranelift: Revert back to fuzzing udivi64 * cranelift: Use sdiv as a fuzz libcall * cranelift: Register Sdiv in fuzzgen * cranelift: Add multiple libcalls to fuzzer * cranelift: Register a single libcall handler * cranelift: Simplify args checking in interpreter * cranelift: Remove unused LibCalls * cranelift: Cleanup interpreter libcall types * cranelift: Fix Interpreter Docs	2022-08-29 13:36:33 -07:00
Damian Heaton	94bcbe8446	Port `Fcopysign`..`FcvtToSintSat` to ISLE (AArch64) (#4753 ) * Port `Fcopysign`..`FcvtToSintSat` to ISLE (AArch64) Ported the existing implementations of the following opcodes to ISLE on AArch64: - `Fcopysign` - Also introduced missing support for `fcopysign` on vector values, as per the docs. - This introduces the vector encoding for the `SLI` machine instruction. - `FcvtToUint` - `FcvtToSint` - `FcvtFromUint` - `FcvtFromSint` - `FcvtToUintSat` - `FcvtToSintSat` Copyright (c) 2022 Arm Limited * Document helpers and abstract conversion checks	2022-08-24 10:37:14 -07:00
Damian Heaton	da1fb305a3	Port `vconst` to ISLE (AArch64) (#4750 ) * Port `vconst` to ISLE (AArch64) Ported the existing implementation of `vconst` to ISLE for AArch64, and added support for 64-bit vector constants. Also introduced 64-bit `vconst` support to the interpreter. Copyright (c) 2022 Arm Limited * Replace if-chains with match statements Copyright (c) 2022 Arm Limited	2022-08-23 09:40:11 -07:00
Benjamin Bouvier	8a9b1a9025	Implement an incremental compilation cache for Cranelift (#4551 ) This is the implementation of https://github.com/bytecodealliance/wasmtime/issues/4155, using the "inverted API" approach suggested by @cfallin (thanks!) in Cranelift, and trait object to provide a backend for an all-included experience in Wasmtime. After the suggestion of Chris, `Function` has been split into mostly two parts: - on the one hand, `FunctionStencil` contains all the fields required during compilation, and that act as a compilation cache key: if two function stencils are the same, then the result of their compilation (`CompiledCodeBase<Stencil>`) will be the same. This makes caching trivial, as the only thing to cache is the `FunctionStencil`. - on the other hand, `FunctionParameters` contain the... function parameters that are required to finalize the result of compilation into a `CompiledCode` (aka `CompiledCodeBase<Final>`) with proper final relocations etc., by applying fixups and so on. Most changes are here to accommodate those requirements, in particular that `FunctionStencil` should be `Hash`able to be used as a key in the cache: - most source locations are now relative to a base source location in the function, and as such they're encoded as `RelSourceLoc` in the `FunctionStencil`. This required changes so that there's no need to explicitly mark a `SourceLoc` as the base source location, it's automatically detected instead the first time a non-default `SourceLoc` is set. - user-defined external names in the `FunctionStencil` (aka before this patch `ExternalName::User { namespace, index }`) are now references into an external table of `UserExternalNameRef -> UserExternalName`, present in the `FunctionParameters`, and must be explicitly declared using `Function::declare_imported_user_function`.
- some refactorings have been made for function names: - `ExternalName` was used as the type for a `Function`'s name; while it thus allowed `ExternalName::Libcall` in this place, this would have been quite confusing to use it there. Instead, a new enum `UserFuncName` is introduced for this name, that's either a user-defined function name (the above `UserExternalName`) or a test case name. - The future of `ExternalName` is likely to become a full reference into the `FunctionParameters`'s mapping, instead of being "either a handle for user-defined external names, or the thing itself for other variants". I'm running out of time to do this, and this is not trivial as it implies touching ISLE which I'm less familiar with. The cache computes a sha256 hash of the `FunctionStencil`, and uses this as the cache key. No equality check (using `PartialEq`) is performed in addition to the hash being the same, as we hope that this is sufficient data to avoid collisions. A basic fuzz target has been introduced that tries to do the bare minimum: - check that a function successfully compiled and cached will be also successfully reloaded from the cache, and returns the exact same function. - check that a trivial modification in the external mapping of `UserExternalNameRef -> UserExternalName` hits the cache, and that other modifications don't hit the cache. - This last check is less efficient and less likely to happen, so probably should be rethought a bit. Thanks to both @alexcrichton and @cfallin for your very useful feedback on Zulip. Some numbers show that for a large wasm module we're using internally, this is a 20% compile-time speedup, because so many `FunctionStencil`s are the same, even within a single module. For a group of modules that have a lot of code in common, we get hit rates up to 70% when they're used together. 
When a single function changes in a wasm module, every other function is reloaded; that's still slower than I expect (between 10% and 50% of the overall compile time), so there's likely room for improvement. Fixes #4155.	2022-08-12 16:47:43 +00:00
Andrew Brown	a83c50321f	cranelift: fix build warning (#4698 ) In #4375 we introduced a code pattern that appears as a warning when building the `cranelift-interpreter` crate: ``` warning: cannot borrow `state` as mutable because it is also borrowed as immutable --> cranelift/interpreter/src/step.rs:412:13 \| 47 \| let arg = \|index: usize\| -> Result<V, StepError> { \| -------------------------------------- immutable borrow occurs here 48 \| let value_ref = inst_context.args()[index]; 49 \| state \| ----- first borrow occurs due to use of `state` in closure ... 412 \| state.set_pinned_reg(arg(0)?); \| ^^^^^^^^^^^^^^^^^^^^^---^^^^^ \| \| \| \| \| immutable borrow later used here \| mutable borrow occurs here \| = note: `#[warn(mutable_borrow_reservation_conflict)]` on by default = warning: this borrowing pattern was not meant to be accepted, and may become a hard error in the future = note: for more information, see issue #59159 <https://github.com/rust-lang/rust/issues/59159> ``` This change fixes the warning.	2022-08-11 23:52:00 +00:00
Afonso Bordado	e4adc46e6d	cranelift: Fix shifts and implement rotates in interpreter (#4519 ) * cranelift: Fix shifts and implement rotates in interpreter * x64: Implement `rotl`/`rotr` for some small type combinations	2022-08-11 12:15:52 -07:00
Afonso Bordado	268ddf2f6c	cranelift: Implement pinned reg in interpreter (#4375 )	2022-08-10 21:33:45 +00:00
Damian Heaton	eb332b8369	Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64) (#4608 ) * Convert `fma`, `valltrue` & `vanytrue` to ISLE (AArch64) Ported the existing implementations of the following opcodes to ISLE on AArch64: - `fma` - Introduced missing support for `fma` on vector values, as per the docs. - `valltrue` - `vanytrue` Also fixed `fcmp` on scalar values in the interpreter, and enabled interpreter tests in `simd-fma.clif`. This introduces the `FMLA` machine instruction. Copyright (c) 2022 Arm Limited * Add comments for `Fmla` and `Bsl` Copyright (c) 2022 Arm Limited	2022-08-05 09:47:56 -07:00
Nick Fitzgerald	42bba452a6	Cranelift: Add instructions for getting the current stack/frame/return pointers (#4573 ) * Cranelift: Add instructions for getting the current stack/frame pointers and return address This is the initial part of https://github.com/bytecodealliance/wasmtime/issues/4535 * x64: Remove `Amode::RbpOffset` and use `Amode::ImmReg` instead We just special case getting operands from `Amode`s now. * Fix s390x `get_return_address`; require `preserve_frame_pointers=true` * Assert that `Amode::ImmRegRegShift` doesn't use rbp/rsp * Handle non-allocatable registers in Amode::with_allocs * Use "stack" instead of "r15" on s390x * r14 is an allocatable register on s390x, so it shouldn't be used with `MovPReg`	2022-08-02 14:37:17 -07:00
Chris Fallin	8dddd6f1f7	Cranelift: Remove `ifcmp_sp` opcode. (#4578 ) This was temporarily added back in #3502 due to a need from Lucet; now that Lucet is EOL, the opcode is no longer needed and we can remove it.	2022-08-02 13:15:39 -07:00
Chris Fallin	43f1765272	Cranelift: remove Baldrdash support and related features. (#4571 ) * Cranelift: remove Baldrdash support and related features. As noted in Mozilla's bugzilla bug 1781425 [1], the SpiderMonkey team has recently determined that their current form of integration with Cranelift is too hard to maintain, and they have chosen to remove it from their codebase. If and when they decide to build updated support for Cranelift, they will adopt different approaches to several details of the integration. In the meantime, after discussion with the SpiderMonkey folks, they agree that it makes sense to remove the bits of Cranelift that exist to support the integration ("Baldrdash"), as they will not need them. Many of these bits are difficult-to-maintain special cases that are not actually tested in Cranelift proper: for example, the Baldrdash integration required Cranelift to emit function bodies without prologues/epilogues, and instead communicate very precise information about the expected frame size and layout, then stitched together something post-facto. This was brittle and caused a lot of incidental complexity ("fallthrough returns", the resulting special logic in block-ordering); this is just one example. As another example, one particular Baldrdash ABI variant processed stack args in reverse order, so our ABI code had to support both traversal orders. We had a number of other Baldrdash-specific settings as well that did various special things. This PR removes Baldrdash ABI support, the `fallthrough_return` instruction, and pulls some threads to remove now-unused bits as a result of those two, with the understanding that the SpiderMonkey folks will build new functionality as needed in the future and we can perhaps find cleaner abstractions to make it all work. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1781425 * Review feedback. * Fix (?) DWARF debug tests: add `--disable-cache` to wasmtime invocations.
The debugger tests invoke `wasmtime` from within each test case under the control of a debugger (gdb or lldb). Some of these tests started to inexplicably fail in CI with unrelated changes, and the failures were only inconsistently reproducible locally. It seems to be cache related: if we disable cached compilation on the nested `wasmtime` invocations, the tests consistently pass. * Review feedback.	2022-08-02 19:37:56 +00:00
Sam Parker	37cd96beff	[AArch64] i64x2 support for min/max (#4575 ) Also added interpreter support for vector min/max. Copyright (c) 2022, Arm Limited.	2022-08-02 11:42:05 -07:00
Afonso Bordado	1f058a02c0	cranelift: Add MinGW `fma` regression tests (#4517 ) * cranelift: Add MinGW `fma` regression tests * cranelift: Fix FMA in interpreter * cranelift: Add separate `fma` test suite for the interpreter The interpreter can run `fma.clif` on most platforms, however on `x86_64-pc-windows-gnu` we use libm which has issues with some inputs. We should delete `fma-interpreter.clif` and enable the interpreter on the main `fma.clif` file once those are fixed.	2022-07-29 09:09:37 -05:00
Afonso Bordado	e121c209fc	cranelift: Fix `urem`/`srem` in interpreter (#4532 )	2022-07-27 10:47:08 -07:00
Damian Heaton	3ef89b7787	Allow 64-bit vectors and implement for interpreter (#4509 ) * Allow 64-bit vectors and implement for interpreter The AArch64 backend already supports 64-bit vectors; this simply allows instructions to make use of that. Implemented support for 64-bit vectors within the interpreter to allow interpret runtests to use them. Copyright (c) 2022 Arm Limited * Disable 64-bit SIMD `iaddpairwise` tests on s390x Copyright (c) 2022 Arm Limited	2022-07-25 13:00:43 -07:00
Afonso Bordado	446efd3e11	cranelift: Fix `icmp_imm` for small types in interpreter (#4506 )	2022-07-23 00:26:56 +00:00
Afonso Bordado	d89c262657	cranelift: Implement `{u,s}extend.i128` in interpreter (#4505 )	2022-07-22 10:47:10 -07:00
Afonso Bordado	80976b6fc7	cranelift: Add `fadd`/`fsub`/`fmul`/`fdiv` to interpreter (#4446 ) Fuzzgen found these as soon as I added float support	2022-07-14 21:53:03 +00:00
Afonso Bordado	4ea46c3ca8	cranelift: Implement `table_addr` in interpreter (#4433 )	2022-07-13 12:53:42 -07:00
Afonso Bordado	16cb287c53	cranelift: Use `round_ties_even` for `nearest` in interpreter (#4413 ) As @MaxGraey pointed out (thanks!) in #4397, `round` has different behavior from `nearest`. And it looks like the native rust implementation is still pending stabilization. Right now we duplicate the wasmtime implementation, merged in #2171. However, we definitely should switch to the rust native version when it is available.	2022-07-07 16:36:43 -07:00
Sam Parker	9c43749dfe	[RFC] Dynamic Vector Support (#4200 ) Introduce a new concept in the IR that allows a producer to create dynamic vector types. An IR function can now contain global value(s) that represent a dynamic scaling factor, for a given fixed-width vector type. A dynamic type is then created by 'multiplying' the corresponding global value with a fixed-width type. These new types can be used just like the existing types and the type system has a set of hard-coded dynamic types, such as I32X4XN, which the user defined types map onto. The dynamic types are also used explicitly to create dynamic stack slots, which have no set size like their existing counterparts. New IR instructions are added to access these new stack entities. Currently, during codegen, the dynamic scaling factor has to be lowered to a constant so the dynamic slots do eventually have a compile-time known size, as do spill slots. The current lowering for aarch64 just targets Neon, using a dynamic scale of 1. Copyright (c) 2022, Arm Limited.	2022-07-07 12:54:39 -07:00
Afonso Bordado	f98076ae88	cranelift: Implement float rounding operations (#4397 ) Implements the following operations on the interpreter: * `ceil` * `floor` * `nearest` * `trunc`	2022-07-06 16:43:54 -07:00
Afonso Bordado	9575ed4eb7	cranelift: Implement `global_value` in interpreter (#4396 )	2022-07-06 15:53:52 -07:00
Afonso Bordado	0f603dd2c5	cranelift: Implement `fmin_pseudo`/`fmax_pseudo` in interpreter (#4394 )	2022-07-06 14:54:29 -07:00
Afonso Bordado	925891245d	cranelift: Fix `fmin`/`fmax` when dealing with zeroes (#4373 ) `fmin`/`fmax` are defined as returning -0.0 as smaller than 0.0. This is not how the IEEE754 views these values and the interpreter was returning the wrong value in these operations since it was just using the standard IEEE754 comparisons. This also tries to preserve NaN information by avoiding passing NaN's through any operation that could canonicalize it.	2022-07-05 12:59:23 -07:00
Afonso Bordado	e91f493ff5	cranelift: Add heap support to the interpreter (#3302 ) * cranelift: Add heaps to interpreter * cranelift: Add RunTest Environment mechanism to test interpret * cranelift: Remove unused `MemoryError` * cranelift: Add docs for `State::resolve_global_value` * cranelift: Rename heap tests * cranelift: Refactor heap address resolution * Fix typos and clarify docs (thanks @cfallin)	2022-07-05 09:05:26 -07:00
Afonso Bordado	2003ae99a0	Implement `fma`/`fabs`/`fneg`/`fcopysign` on the interpreter (#4367 ) * cranelift: Implement `fma` on interpreter * cranelift: Implement `fabs` on interpreter * cranelift: Fix `fneg` implementation on interpreter `fneg` was implemented as `0 - x` which is not correct according to the standard since that operation makes no guarantees on what the output is when the input is `NaN`. However for `fneg` the output for `NaN` inputs is fully defined. * cranelift: Implement `fcopysign` on interpreter	2022-07-05 09:03:04 -07:00
Afonso Bordado	f2e6ff5e70	cranelift: Implement `sqrt` in interpreter (#4362 ) This ignores SIMD for now.	2022-07-01 09:39:11 -07:00
Afonso Bordado	23ae9016af	cranelift: Implement scalar `ireduce` on interpreter (#4320 )	2022-06-27 11:00:37 -07:00
Afonso Bordado	87007c5839	cranelift: Fix `bint` implementation on interpreter (#4299 ) * cranelift: Fix `bint` implementation on interpreter The interpreter was returning -1 instead of 1 for positive values. This also extends the bint test suite to cover all types. * cranelift: Restrict `bint` to scalar values only	2022-06-23 13:43:35 -07:00
Andrew Brown	bd6fe11ca9	cranelift: remove `load_complex` and `store_complex` (#3976 ) This change removes all variants of `load_complex` and `store_complex` from Cranelift; this is a breaking change to the instructions exposed by CLIF. The complete list of instructions removed is: `load_complex`, `store_complex`, `uload8_complex`, `sload8_complex`, `istore8_complex`, `sload8_complex`, `uload16_complex`, `sload16_complex`, `istore16_complex`, `uload32_complex`, `sload32_complex`, `istore32_complex`, `uload8x8_complex`, `sload8x8_complex`, `sload16x4_complex`, `uload16x4_complex`, `uload32x2_complex`, `sload32x2_complex`. The rationale for this removal is that the Cranelift backend now has the ability to pattern-match multiple upstream additions in order to calculate the address to access. Previously, this was not possible so the `*_complex` instructions were needed. Over time, these instructions have fallen out of use in this repository, making the additional overhead of maintaining them a chore.	2022-03-31 10:05:10 -07:00
Damian Heaton	6c8c94723a	Scalar values in `vectorizelanes` & `extractlanes` (#3922 ) - `extractlanes` will now function on a scalar value, returning the value as a single-element array. - `vectorizelanes` will accept a single-element array, returning the contained value. Existing `if !x.is_vector()` code-patterns have been simplified as a result. Copyright (c) 2022 Arm Limited	2022-03-28 09:32:59 -07:00
Chris Fallin	5e96a447f0	Add back the `ifcmp_sp` CLIF opcode. This opcode was removed as part of the old-backend cleanup in #3446. While this opcode will definitely go away eventually, it is unfortunately still used today in Lucet (as we just discovered while working to upgrade Lucet's pinned Cranelift version). Lucet is deprecated and slated to eventually be completely sunset in favor of Wasmtime; but until that happens, we need to keep this opcode.	2021-11-01 13:34:31 -07:00
bjorn3	86d2ef8952	Fix CI	2021-11-01 18:19:59 +01:00
bjorn3	a05bf2bf42	Remove instructions necessary for the old regalloc	2021-10-12 14:37:36 +02:00
bjorn3	1fd491dadd	Remove fallthrough instruction	2021-10-12 14:22:07 +02:00
bjorn3	5b24e117ee	Remove instructions used by old br_table legalization	2021-10-12 14:18:52 +02:00
bjorn3	8a8797b911	Remove the sarg_t type and dummy_sarg_t instruction They are no longer necessary with the new style backends	2021-10-10 14:38:35 +02:00
Benjamin Bouvier	43a86f14d5	Remove more old backend ISA concepts (#3402 ) This also paves the way for unifying TargetIsa and MachBackend, since now they map one to one. In theory the two traits could be merged, which would be nice to limit the number of total concepts. Also they have quite different responsibilities, so it might be fine to keep them separate. Interestingly, this PR started as removing RegInfo from the TargetIsa trait since the adapter returned a dummy value there. From the fallout, noticed that all Display implementations didn't need an ISA anymore (since these were only used to render ISA specific registers). Also the whole family of RegInfo / ValueLoc / RegUnit was exclusively used for the old backend, and these could be removed. Notably, some IR instructions needed to be removed, because they were using RegUnit too: this was the oddball of regfill / regmove / regspill / copy_special, which were IR instructions inserted by the old regalloc. Fare thee well!	2021-10-04 10:36:12 +02:00
bjorn3	9e34df33b9	Remove the old x86 backend	2021-09-29 16:13:46 +02:00
Chris Fallin	65fde3a86b	Merge pull request #3380 from dheaton-arm/implement-iabs Implement `Iabs` for the interpreter	2021-09-22 10:00:53 -07:00
Chris Fallin	b076c99af9	Merge pull request #3379 from dheaton-arm/implement-sqmulroundsat Implement `SqmulRoundSat` for interpreter	2021-09-22 09:59:13 -07:00
Chris Fallin	dd7310df04	Merge pull request #3361 from dheaton-arm/implement-vecops Implement `VhighBits` & `Vselect` for interpreter	2021-09-22 09:22:52 -07:00
dheaton-arm	cb30ecc7bc	Implement `Iabs` for the interpreter Implemented `Iabs` to return the absolute integer value with wrapping. Copyright (c) 2021, Arm Limited	2021-09-22 12:59:30 +01:00
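
Commits 9856664f1f and f30a7eb0c9 above turn on the difference between bitwise equality (needed for hash-table keys) and IEEE 754 equality (where `NaN != NaN`). The following is a minimal sketch of that distinction, not the Cranelift implementation: `Bits32` and `Value` are hypothetical stand-ins for `Ieee32` and `DataValue`, and only the method name `bitwise_eq` is taken from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a raw float constant: equality and hashing
/// are derived from the bit pattern, so it can serve as a hash-table key.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Bits32(u32);

/// Hypothetical stand-in for an interpreter-level value: `==` follows
/// IEEE 754, so a NaN never compares equal to anything, itself included.
#[derive(Clone, Copy, Debug)]
struct Value(Bits32);

impl PartialEq for Value {
    fn eq(&self, other: &Self) -> bool {
        // Compare as floats, not as bit patterns.
        f32::from_bits((self.0).0) == f32::from_bits((other.0).0)
    }
}

impl Value {
    /// Explicit opt-in to bit-pattern comparison.
    fn bitwise_eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}

fn main() {
    let nan = Bits32(f32::NAN.to_bits());
    let pos_zero = Bits32(0.0f32.to_bits());
    let neg_zero = Bits32((-0.0f32).to_bits());

    // Bitwise semantics: a NaN equals itself; +0.0 and -0.0 differ.
    assert_eq!(nan, nan);
    assert_ne!(pos_zero, neg_zero);

    // Because equality is reflexive, a NaN key can be found again.
    let mut pool: HashMap<Bits32, usize> = HashMap::new();
    pool.insert(nan, 0);
    assert_eq!(pool.get(&nan), Some(&0));

    // IEEE 754 semantics one layer up: NaN != NaN, and +0.0 == -0.0.
    assert!(Value(nan) != Value(nan));
    assert!(Value(pos_zero) == Value(neg_zero));
    assert!(Value(nan).bitwise_eq(&Value(nan)));
}
```

Keeping the float-aware comparison on the outer type leaves the inner type free to implement `Eq` and `Hash`, which IEEE 754 equality cannot satisfy because it is not reflexive.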


184 Commits
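
Commits 2003ae99a0 and 925891245d describe two float pitfalls in the interpreter: `fneg` written as `0 - x`, and `fmin`/`fmax` built on plain IEEE 754 comparisons, which treat -0.0 and 0.0 as equal. The sketch below illustrates both points under stated assumptions. The helper names `fneg` and `fmin` are hypothetical free functions written for this example, not the interpreter's code, and returning the first NaN operand unchanged is one possible reading of "preserve NaN information".

```rust
/// Negate by flipping the sign bit. The result is defined for every
/// input, NaN included, because no arithmetic is performed.
fn fneg(x: f32) -> f32 {
    f32::from_bits(x.to_bits() ^ 0x8000_0000)
}

/// Minimum that orders -0.0 below +0.0 and hands a NaN operand back
/// without passing it through any arithmetic.
fn fmin(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        return a;
    }
    if b.is_nan() {
        return b;
    }
    if a == b {
        // Equal operands differ at most in the sign of zero:
        // prefer the negatively signed one.
        return if a.is_sign_negative() { a } else { b };
    }
    if a < b { a } else { b }
}

fn main() {
    // `0 - x` loses the sign of zero: 0.0 - 0.0 is +0.0, not -0.0.
    assert!((0.0f32 - 0.0f32).is_sign_positive());
    assert!(fneg(0.0).is_sign_negative());

    // Flipping the sign bit of a NaN yields a NaN with the opposite sign.
    let nan = f32::from_bits(0x7fc0_0000);
    assert!(fneg(nan).is_nan());
    assert!(fneg(nan).is_sign_negative());

    // A plain `<` comparison cannot tell the two zeroes apart.
    assert!(!(-0.0f32 < 0.0f32));
    assert!(fmin(0.0, -0.0).is_sign_negative());
    assert!(fmin(-0.0, 0.0).is_sign_negative());
    assert_eq!(fmin(1.0, 2.0), 1.0);
}
```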